As we’re putting the finishing touches on the RStudio Workbench 2021.09.0 “Ghost Orchid” release, we’d like to share one of the new sets of features we’re most excited about. We’ve revisited and revamped the administration experience for load balancing clusters.
Specifically, we’ve worked to improve cluster management and troubleshooting. To make this possible, cluster data is now stored in the internal database. The load balancing configuration file no longer requires a list of every node in the cluster. In fact, the file can be completely empty, though it must still be present. This means nodes can join and leave the cluster without bringing down and reconfiguring every node, so scaling your cluster has never been easier!
When provided an empty configuration file, RStudio Workbench determines the address at which other nodes can reach each node. For more complicated configurations, we’ve included an escape hatch: the new www-host-name option, which can be included in the file to instruct RStudio Workbench to use a specified hostname. A detailed explanation of how each node’s address is determined, and of the new option, can be found in the Admin Guide.
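As a rough illustration, a load balancing configuration file that overrides the detected address might contain a single line like the one below. Only the www-host-name option name comes from this post; the key=value syntax is assumed from the style of other RStudio configuration files, and the hostname is a hypothetical placeholder. The Admin Guide remains the authority on the exact format.

```ini
# Assumed key=value syntax; rsw-primary.example.com is a placeholder hostname.
www-host-name=rsw-primary.example.com
```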
Furthermore, we’ve added several new commands to the rstudio-server admin tool to improve load balancing cluster management.
The first command, rstudio-server list-nodes, displays each node and information about its current status. It is intended to be used in conjunction with the existing status endpoint (accessed through curl http://localhost:8787/load-balancer/status) to monitor the status of your nodes and to help identify and address issues.
The following is an example of this output:
    $ sudo rstudio-server list-nodes

    Cluster
    -------
    Protocol  Http

    Nodes
    -----
    ID  Host           IPv4            Port  Status                     Last Seen
    1   rsw-primary    188.8.131.52    80    Online                     2021-Sep-20 17:08:53
    2   rsw-secondary  184.108.40.206  80    Invalid secure cookie key  2021-Sep-20 17:10:25
    3   rsw-tertiary   220.127.116.11  80    Offline                    2021-Sep-20 17:10:34
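For routine monitoring, it can be handy to filter this output down to the nodes that need attention. The following is a minimal sketch, not part of RStudio Workbench: unhealthy_nodes is a hypothetical helper name, and it assumes each node row has the shape shown in the example above (ID, host, IPv4, port, a status of one or more words, then a date and a time). If the real output differs, the field positions would need adjusting.

```shell
# Hypothetical helper: reads `rstudio-server list-nodes` output on stdin and
# prints "ID<TAB>Host<TAB>Status" for every node whose status is not "Online".
# Assumed row shape: ID Host IPv4 Port <status words...> <date> <time>
unhealthy_nodes() {
  awk '
    # Node rows start with a numeric ID and have at least seven fields.
    $1 ~ /^[0-9]+$/ && NF >= 7 {
      # The status runs from field 5 up to the two trailing timestamp fields.
      status = $5
      for (i = 6; i <= NF - 2; i++) status = status " " $i
      if (status != "Online") printf "%s\t%s\t%s\n", $1, $2, status
    }
  '
}
```

It could be used as `sudo rstudio-server list-nodes | unhealthy_nodes`, which for the example output would report the nodes with IDs 2 and 3.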
Because load balancing now makes use of the internal database, each node validates its secure cookie key and configured protocol against the database before coming online. The first node online sets the values used for validation. The results of that validation are stored in the database and easily retrievable through the rstudio-server list-nodes command, allowing for easy troubleshooting when encountering unexpected issues with your cluster.
We’ve added the command rstudio-server reset-cluster to reset the cluster state used for validation. This should be run after replacing the secure cookie key on each node or after updating the protocol the cluster is using (for example, https-no-verify). Again, the first node brought online or restarted after this reset will determine the configuration used for validation.
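Because the first node restarted after a reset sets the validation values, the order of operations matters. The sketch below only prints a suggested command sequence for review; it runs nothing. reset_plan is a hypothetical helper, the use of ssh and of rstudio-server restart on each host is an assumption about your environment, and it presumes that the new key or protocol has already been put in place on every node and that running reset-cluster from a single node is sufficient because the state lives in the shared database.

```shell
# Hypothetical dry-run helper: prints (does not execute) a reset sequence.
# The first argument is the node whose configuration should become authoritative;
# any remaining arguments are the other nodes in the cluster.
reset_plan() {
  # Reset the stored validation state once, from the authoritative node.
  printf 'ssh %s sudo rstudio-server reset-cluster\n' "$1"
  # Restart nodes in the order given, so the authoritative node comes up first.
  for host in "$@"; do
    printf 'ssh %s sudo rstudio-server restart\n' "$host"
  done
}
```

For example, `reset_plan rsw-primary rsw-secondary rsw-tertiary` prints one reset-cluster command followed by three restart commands, with rsw-primary restarted first.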
Finally, the command rstudio-server delete-node <node-id> allows you to easily remove nodes from the cluster. The required node-id parameter can be retrieved from the output of the rstudio-server list-nodes command. When a node is deleted, the other nodes in the cluster will no longer try to contact that node; there is no need to restart the active nodes after running this. This command should only be used for nodes that are offline and will not be coming back online.
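To tie the two commands together, the node IDs can be pulled from the list-nodes output rather than copied by hand. The following sketch is illustrative only: offline_delete_commands is a hypothetical helper that assumes the row layout from the earlier example, and it deliberately prints the delete-node commands instead of running them, since only you can judge whether an offline node is gone for good.

```shell
# Hypothetical helper: reads `rstudio-server list-nodes` output on stdin and
# prints a delete-node command for each node whose status is exactly "Offline".
# Assumed row shape: ID Host IPv4 Port Status <date> <time>
offline_delete_commands() {
  awk '
    # Seven fields means a one-word status; nodes with other statuses are skipped.
    $1 ~ /^[0-9]+$/ && NF == 7 && $5 == "Offline" {
      printf "sudo rstudio-server delete-node %s\n", $1
    }
  '
}
```

Running `sudo rstudio-server list-nodes | offline_delete_commands` lets you review the candidate commands before executing any of them.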
There are many more features coming with this release. If you’re interested in giving them a try, check out the RStudio 2021.09.0 Preview for the latest installers and release notes.