Champions

Building a Business Case

You might be

  • A new Data Science Manager who’s just starting to build your team.
  • An individual analyst who wants to be more efficient in their daily work.
  • Curious about which tools can help the most.
  • Starting a conversation about how to best use R/Python to meet your organization’s IT requirements for security, administration, and more.
  • One of the millions of people using open-source data science today.

If you’re starting from scratch, know that you’re not alone. So, where do you go from here?

Start the process of introducing data science tools

  • Consider the tools you have available now. Under what conditions would it make more sense for you to use R/Python, and why? Think in terms of the return on investment as you start to build your business case.
  • If possible, develop prototype reports or apps to demonstrate what would be possible with the right tools.
  • Let leadership know you’re interested in acquiring new tools to make you more effective in your current role. Ask about the process for this. There could be extra budget available to your team for new tools that you didn’t know about!
  • Talk to the right people in IT (directly or through your manager). Figure out the process that your company uses to onboard software through official channels. If you’re part of a large organization, your IT department probably has a review board that makes decisions about new tools. Show IT that you take security seriously by highlighting the challenges with introducing new software.
  • If you get pushback about budget or resources, consider working together with another team across your business. If they’ve gotten funding before, they can help you come up with ideas. See the section on building community if you have yet to find others in your company.
  • Approach resistance with curiosity. Start a discussion. Ask good open-ended questions like: “What does our toolchain look like for next year?” “What commitment is necessary to use open source tools?”
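
For the return-on-investment point above, a rough calculation can help frame the conversation. The following is a minimal Python sketch, not a standard costing method: the function names (`hours_saved_value`, `simple_roi`) and every number are hypothetical placeholders to be replaced with your own estimates.

```python
def hours_saved_value(hours_per_week: float, hourly_rate: float,
                      weeks_per_year: int = 48) -> float:
    """Estimate the yearly value of analyst time saved (hypothetical model)."""
    return hours_per_week * hourly_rate * weeks_per_year


def simple_roi(annual_benefit: float, annual_cost: float) -> float:
    """Return ROI as a fraction: (benefit - cost) / cost."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    return (annual_benefit - annual_cost) / annual_cost


# All inputs below are made-up illustration values, not benchmarks.
benefit = hours_saved_value(hours_per_week=5, hourly_rate=60.0)
cost = 6000.0  # assumed yearly cost of tooling and setup
roi = simple_roi(benefit, cost)

print(f"Estimated annual benefit: ${benefit:,.0f}")
print(f"Estimated ROI: {roi:.0%}")
```

Even a simple estimate like this gives leadership and IT something specific to react to, and it is easy to rerun as your assumptions change.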

How this worked at Brown-Forman

Main takeaways

  • Start off by doing things locally with free and open-source tools.
  • Once you do, install R Markdown and Shiny, and pull in local data. Show a Shiny application running locally from your own desktop, or make a flexdashboard and send it out as an HTML file so people can see what it looks like.
  • Start building your strategy. Make a slide deck that shows the current state diagram of how you do analytics and how people use them. Then show what you could do if you had some set of data science tools and include literal examples of that.
  • Communicate value by using a stair-step approach. Start small. Find a business need that you can solve better. People tend to come along for the ride when you’re interested in helping solve their problems.
  • Start a conversation with your immediate manager. Have them help you work up the chain to then start the conversation with IT.
  • While you’re helping people solve business problems, build examples you can include in your next presentation. Use this framework as an example:
    1. Look, this is why we should do this.
    2. I have been doing it.
    3. Here’s the feedback from that.
    4. Here’s the solution to make this a real thing – instead of something just living on my laptop.
  • If you face questions like “Why do we need this, we have a BI tool,” adjust your presentation to include a table of what each tool is good and not good at. Explain why you need both of them to solve this problem. (Additional info in the FAQ below.)
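
The "pull in local data and send out an HTML file" idea from the takeaways can be prototyped with very little code. The sketch below is an illustration in plain Python using only the standard library; it is not the Shiny or flexdashboard workflow described above. The sample data, the `build_report` and `save_report` names, and the output file name are all hypothetical.

```python
import csv
import html
import io
from statistics import mean

# Hypothetical stand-in for a local CSV file you already have.
SAMPLE_CSV = """region,sales
North,120
South,95
West,143
"""


def build_report(csv_text: str, title: str = "Prototype report") -> str:
    """Turn CSV text into a small self-contained HTML page."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no data rows found")
    headers = list(rows[0].keys())
    sales = [float(r["sales"]) for r in rows]  # assumes a 'sales' column
    head = "".join(f"<th>{html.escape(h)}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(r[h])}</td>" for h in headers) + "</tr>"
        for r in rows
    )
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<title>{html.escape(title)}</title></head><body>"
        f"<h1>{html.escape(title)}</h1>"
        f"<p>Rows: {len(rows)}. Average sales: {mean(sales):.1f}</p>"
        f"<table border='1'><tr>{head}</tr>{body}</table>"
        "</body></html>"
    )


def save_report(path: str, csv_text: str = SAMPLE_CSV) -> None:
    """Write the report to disk so it can be emailed or shared."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(build_report(csv_text))


if __name__ == "__main__":
    save_report("prototype_report.html")
```

A single file like this opens in any browser, which makes it an easy first artifact to show colleagues before asking for a place to host something interactive.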

What if I'm being asked to use existing BI tools instead?

Both Business Intelligence (BI) and data science platforms share a common goal: delivering rich interactive applications and dashboards that can be shared with others to improve their decision-making. However, this common purpose often leads to the tools (and the teams that support and use them) being seen as competitors for software budgets and executive mindshare in a large organization.

BI tools are powerful and have a lower barrier to entry for most users, but they have limits to their flexibility and analytic depth. This limits the complexity of the questions they can answer.

Open source data science has a higher barrier to entry, requiring coding skills for development. But its flexibility and analytic power are nearly limitless. This allows organizations to answer the most complex questions they have. Organizations must consider this balance, between the barrier to entry and the complexity of the questions that need to be answered, when choosing an approach. There are a number of Posit blog posts and a BI & Data Science Overview that address this in further detail as well.

What if they say no to open-source?

An incredible variety of the world’s computation runs on top of open source software. Open source means that the code for these programming languages is developed in public and is available for public review. This does not mean that these bits of code are ill-maintained or unloved.

R and Python have been around since the early 1990s and have millions of users every year. Both R and Python are complete programming languages that are able to do a wide array of complex statistical calculations, machine learning tasks, dashboarding and reporting, and more.

They form the foundation of data science practices for many different kinds of organizations that include governmental organizations, major pharmaceutical companies, banks and other financial institutions.

The reality is that most organizations are already supporting open-source software. The 2021 State of Enterprise Open Source report found that 90% of IT leaders (1,250 surveyed) are using enterprise open source today.

With regard to licensing, Posit's professional products replace the AGPL open-source license with a commercial license.

What are some additional benefits of open-source that I can share?

By adopting an open source core, you can make it easier to recruit and retain data scientists. The comprehensive nature of open source ensures you will always have the right tool for any analytic problem, including the ability to connect to all your other analytic investments. You also avoid putting yourself at the mercy of any specific vendor, since your core data science work is based on R or Python.

Complex, sometimes vaguely-defined analytic problems require the power of code. Code is flexible, without any black box constraints, and enables you to access, transform and combine ALL of your data. Code enables fast iteration and updates in response to feedback, or new circumstances. And most importantly, by its very nature code is reusable, extensible and inspectable, allowing you to modify and apply it to new problems, and track where changes occurred. Code becomes a core source of intellectual property in your organization, the value of which grows over time.

It’s better for everyone if the tools used for research and science are free and open. Reproducibility, widespread sharing of knowledge and techniques, and the leveling of the playing field by eliminating cost barriers are but a few of the shared benefits of free software in science.

For some suggestions on how to position the value of open source, code-first data science, see our Serious Data Science page.

What is the difference between Posit's open-source and professional products?

Posit's mission is to make data science available to everyone, regardless of their economic means. Our professional products exist to scale, secure and deploy our open source products.

Posit Team is a modular platform of commercial software products (Posit Workbench, Posit Connect, and Posit Package Manager) that gives organizations the confidence to adopt R, Python and other open-source data science software at scale. Posit Team allows organizations to leverage large amounts of data, deploy applications securely, integrate with existing enterprise systems, platforms, and processes, and be compliant with security practices and standards.

Together, Posit's open-source software and commercial software form a virtuous cycle: The adoption of open-source data science software at scale in organizations creates demand for Posit’s commercial software; and the revenue from commercial software, in turn, enables deeper investment in the open-source software that benefits everyone.

Is Posit just for R?

Nope! Many of our customers' data science teams are bilingual, leveraging both R and Python in their work. In line with our ongoing mission to support the open-source data science ecosystem, we've made the love story between R and Python a happier one.

Posit has focused on helping teams tackle key challenges of bilingual environments by making it easy to combine R and Python in a single data science project. You can launch and manage Jupyter Notebooks, JupyterLab and VS Code in Posit Workbench, and share Jupyter Notebooks, Flask APIs, and interactive Python applications like Streamlit and Dash with your business stakeholders through Posit Connect.