In a previous blog post we discussed how improving interoperability between the multiple environments required by data scientists can improve productivity and ROI on IT investments. With the help of the RStudio Job Launcher, RStudio Team products can make use of IT-managed computing resources in Kubernetes or Slurm clusters. Data scientists can launch their RStudio, Jupyter, or VS Code sessions directly through RStudio Server Pro. They can also launch remote background jobs to leverage the hardware that is available in their organization’s computing cluster.
However, there are many more job management systems available besides Kubernetes and Slurm. Data scientists may want or need access to the resources available in their organization’s preferred system to run complex analyses. Data science leaders may want to increase productivity in their teams by making use of powerful pre-existing systems. IT departments may want to increase the utilization of job management systems which are expensive to configure and maintain.
The RStudio Job Launcher allows RStudio Team products to integrate with multiple types of job management systems through a Plugin-based system. Each Plugin should allow the Job Launcher to communicate with one type of job management system. The Job Launcher currently has Plugins for integrating with Kubernetes and Slurm, as well as a Plugin which allows jobs to be launched directly on the Job Launcher host.
To support additional job management systems, a new Plugin needs to be developed for each job management system. The RStudio Launcher Plugin SDK (Software Development Kit) facilitates rapid development of these Plugins in C/C++.
A developer can follow along with the QuickStart Guide to transform the RStudio QuickStart Plugin into a functioning Plugin. The QuickStart Guide includes 16 steps, or ‘TODO items’, that correspond to different features that need to be implemented in the QuickStart Plugin.
TODOs #1 - #4 help the developer get the Plugin renamed and rebranded as desired.
TODO #5 shows how configuration options can be added to the Plugin, although it may become more obvious what options will be needed as development continues.
TODOs #6 - #16 take the developer through the bulk of the work to create a functioning RStudio Job Launcher Plugin.
Provided with the SDK is a utility called “Smoke Test”. The Smoke Test tool can be used during development to trigger many of the major code paths in a Plugin. Debugging a Plugin that is in use by an RStudio Team product can be difficult because the developer is not in control of which API calls are made and when. The Smoke Test makes debugging a Plugin much easier by giving the developer that control.
While following the RStudio Launcher Plugin SDK QuickStart Guide, most TODOs will follow a similar development process. For example, the development process for TODO #7 might look like this:
TODO #7: Define cluster configuration
Once the developer is getting the desired results from the Smoke Test tool, they may wish to test their Plugin against an RStudio Team product, because the Smoke Test tool only covers the basic pathways of the Plugin.
Some Plugin developers may find it necessary to do something more complex than what is presented in the QuickStart Guide. For example, they may wish to allow administrators to set resource limits on a per-user or per-group basis. In that case, the Plugin developer can turn to the Developer’s Guide. The Advanced Features section of the Developer’s Guide covers optional advanced features that may be added to a Plugin as needed.
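To make the per-user and per-group idea concrete, here is a minimal C++ sketch of how a Plugin might resolve the limits that apply to one user. It is an illustration only: `ResourceLimits`, `LimitConfig`, `overlay`, `resolveLimits`, and `withinLimits` are hypothetical names, not part of the RStudio Launcher Plugin SDK, and the precedence rule (defaults, then groups in the order given, then the user) is an assumption rather than the behavior described in the Developer’s Guide.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical types for illustration; NOT the SDK's API.
// An unset field means "this level does not specify a limit".
struct ResourceLimits {
    std::optional<double> maxCpus;
    std::optional<double> maxMemoryMb;
};

// Limits an administrator might configure (hypothetical layout).
struct LimitConfig {
    ResourceLimits defaults;
    std::map<std::string, ResourceLimits> perGroup;
    std::map<std::string, ResourceLimits> perUser;
};

// Returns `base` with every field that is set in `over` replaced.
ResourceLimits overlay(ResourceLimits base, const ResourceLimits& over) {
    if (over.maxCpus) base.maxCpus = over.maxCpus;
    if (over.maxMemoryMb) base.maxMemoryMb = over.maxMemoryMb;
    return base;
}

// Assumed precedence: defaults < groups (in the order given) < user.
ResourceLimits resolveLimits(const LimitConfig& cfg,
                             const std::string& user,
                             const std::vector<std::string>& groups) {
    ResourceLimits result = cfg.defaults;
    for (const auto& group : groups) {
        auto it = cfg.perGroup.find(group);
        if (it != cfg.perGroup.end()) result = overlay(result, it->second);
    }
    auto userIt = cfg.perUser.find(user);
    if (userIt != cfg.perUser.end()) result = overlay(result, userIt->second);
    return result;
}

// True when a request fits within every limit that is set.
bool withinLimits(const ResourceLimits& limits, double cpus, double memoryMb) {
    assert(cpus >= 0.0 && memoryMb >= 0.0);  // requests cannot be negative
    if (limits.maxCpus && cpus > *limits.maxCpus) return false;
    if (limits.maxMemoryMb && memoryMb > *limits.maxMemoryMb) return false;
    return true;
}
```

A Plugin built along these lines could call `resolveLimits` when a job is submitted and reject the job if `withinLimits` returns false. Letting a user entry override a group entry keeps the most specific setting in control, but an administrator might reasonably prefer a different rule, such as taking the most restrictive value.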
The Developer’s Guide also covers the high-level architecture of the SDK, how some RStudio Team products integrate with the RStudio Job Launcher, and a detailed description of all the Smoke Test utility options. Additionally, the API between the RStudio Job Launcher and a Plugin is described in full in the Launcher Plugin API section, which covers the communication mechanism between the two. Developers who prefer to work in a language other than C or C++ can use that section to develop a Plugin from scratch in any other language.
The RStudio Launcher Plugin SDK also includes a complete API Reference for all of its C/C++ code. The API Reference may be useful if the developer wishes to see detailed class hierarchy information or reference doxygen comments outside of the codebase.
The RStudio Launcher Plugin SDK is open source. If you find any bugs or wish to request enhancements, please file an issue on the RStudio Launcher Plugin SDK GitHub Repository. Pull requests for improvements or bug fixes are also welcome!