This is a question that we at RStudio hear a lot. With the tremendous growth in both languages, and in the application of data science in general, there is a lot of interest and debate over which is the “best” language for data science.
From our founding, RStudio has been dedicated to a couple of key ideas: that it’s better for everyone if the tools used for data science are free and open, and that we love and support coding as the most powerful path to tackle data science. Coding gives current and aspiring data scientists superpowers to tackle the most complex problems, because code is flexible, reusable, inspectable, and reproducible.
With that in mind, at RStudio we don’t judge which language you prefer. We just care that you feel enabled to do great data science. As RStudio’s Chief Data Scientist Hadley Wickham expressed in a recent interview with Dan Kopf: “Use whatever makes you happy.”
We will talk more about the benefits of coding for data science in a future blog post, but in this post we will briefly examine the debates over R vs. Python, and then share why we believe R and Python can, should and do work beautifully together.
There is a lot of heated discussion over the topic, but there are some great, thoughtful articles as well. Some suggest Python is preferable as a general-purpose programming language, while others suggest data science is better served by a dedicated language and toolchain. The origins and development arcs of the two languages are compared and contrasted, often to support differing conclusions.
For individual data scientists, some common points to consider:
For organizations with Data Science teams, some additional points to keep in mind:
Thus, the focus on “R or Python?” risks missing the advantages that having both can bring to individual data scientists and data science teams. Because of this, many of these articles end up with fairly nuanced conclusions, along the lines of “You need both” or “It depends.” A great example of this view can be found in the above-referenced interview with Hadley Wickham:
“Generally, there are a lot of people who talk about R versus Python like it’s a war that either R or Python is going to win. I think that is not helpful because it is not actually a battle. These things exist independently and are both awesome in different ways.”
And so the reality is that both languages are valuable, and both are here to stay. This is borne out by our experience. In talking to our customers, we’ve found that many Data Science teams today are bilingual, leveraging both R and Python in their work. In the spirit of Hadley’s “Use whatever makes you happy,” we’ve worked to make this sometimes-rocky relationship a much happier one. We give individual Data Scientists, and the Data Science teams and organizations they are a part of, a smoother path to using both languages side by side, and we address the concerns around complexity or cost that IT teams might have about supporting both.
To learn more about how RStudio supports using R and Python on the same Data Science teams, check out our R and Python Love Story, where we provide information and resources for Data Scientists, Data Science Leaders, and DevOps/IT Leaders grappling with mixed R & Python environments. Or check out our recent R and Python Love Story Webinar, where you can watch the recording or download the slides. In future blog posts, we will also talk more about what we’ve seen in real-life Data Science teams using R and Python side by side.