Data science leaders have embraced the work-from-home era created by COVID-19. Most data science teams have continued their work using either their company laptops or server-based IDEs such as RStudio Server. However, these home workers often run into the limitations of their laptops when they:
The key to freeing data scientists from laptop limitations is to embrace server-based development, as we noted in a prior post, Equipping Work From Home Data Science Teams. Providing data scientists with access to a server-based IDE like RStudio Server can give them more processors, cores, memory, and architecture options than would be available on their laptops. Additionally, with RStudio Server Pro, data scientists can go even further by launching interactive or batch sessions on SLURM and Kubernetes clusters.
As shown in Figure 1, RStudio offers three ways for data scientists to take advantage of centralized resources and escape the limitations of their laptops:
| |RStudio Server|Interactive Launcher Sessions on RStudio Server Pro|Launcher Jobs on RStudio Server Pro|
|---|---|---|---|
|Typical RAM|Tens to hundreds of gigabytes|Multiple terabytes|Multiple terabytes|
|Typical Processor Cores|Tens|Hundreds to thousands|Hundreds to thousands|
|Typical Jobs|Routine analyses|Interactive tasks requiring large compute, GPUs, or RAM, such as exploratory data analysis|Batch tasks like parameter tuning, ETL, or model training and scoring|
|Setup Required|RStudio Server install|RStudio Server Pro + Cluster add-in|RStudio Server Pro + Cluster add-in|
|Limitations|Server resources|Best for interactive work, not parallel tasks|Jobs kicked off manually, limited job feedback|
Figure 1: Three Ways to Expand Data Science Computational Resources Using RStudio Pro and Launcher.
Data scientists benefit from using RStudio Server and RStudio Server Pro for their analysis because:
To learn more about the new Launcher capabilities built into RStudio:
If you’d like to try out RStudio Server Pro for your team, you can learn how to download an evaluation copy from the RStudio Server Pro product page.