Accelerating bioinformatic learning

Nicholas Provart / Professor, Plant Cyberinfrastructure & Systems Biology

The University of Toronto worked with Coursera to deliver two massive open online courses (https://www.coursera.org/learn/bioinformatics-methods-1 and https://www.coursera.org/learn/bioinformatics-methods-2) that teach students bioinformatic methods for better understanding biology. One important area in this field is determining when and where genes are active, based on “RNA-seq” data. Students explore this topic in the fifth week of the second course.

“RStudio Server Pro gave us the ability to have multiple users at once logged into Bioconductor, each with his/her own workspace. Much easier to manage the complete system and user experience.”

- Nicholas Provart, Professor, Plant Cyberinfrastructure & Systems Biology, University of Toronto

The Challenge


RNA-seq data can be analyzed with Bioconductor. In the first iteration of the course, the instructor, Nicholas Provart, had the students install R and Bioconductor on their own computers. This proved to be a stumbling block: students ran many different operating systems, and the dependencies between packages differed from platform to platform.

The Solution


RStudio Server Pro offered a great solution. With a web-accessible instance of Bioconductor running on RStudio on Amazon Web Services, students could get straight to learning the methods for analyzing RNA-seq data instead of spending time troubleshooting installation issues. “We used an AWS c4.large VM as in the schematic below to run up to 16 RStudio Server Pro instances, depending on the load. A php-based LDAP server was used to handle the authentication for student logins.”

Why RStudio Server Pro?


RStudio Server Pro lets multiple users connect to the Bioconductor environment at the same time, each with their own workspace. Work is maintained from session to session, and pre-existing data folders can be shared with users.

The Payoff


The RNA-seq lab now draws far fewer complaints on the course discussion forums, and students are happier! Professor Provart also uses the online course material to teach the University of Toronto class on which these courses were based. Moving the Bioconductor component to the “cloud” with RStudio Server Pro also alleviated systems administration headaches in University of Toronto computer labs. The University of Toronto is excited to be teaching so many students and educating future bioinformatics professionals in state-of-the-art analytics: 43,000 students have enrolled in the second course so far!

“I have just completed Bioinformatics Methods II, and it was just as informative as the first one. Not only does it build off of knowledge from the first it has also taught me useful lab based analyses regarding RNA seq and microarray, not to mention protein-protein interactions. I would like to thank you for continuing to teach bioinformatics to students like myself, I will use a good portion of these techniques in my field of study day to day.”

- A Bioinformatics Methods II student

RStudio provides open-source and enterprise-ready professional software for data science teams using R and Python.