The analytics required to bring a novel molecule to market involve the collective effort of hundreds of quantitative scientists with highly diverse training, tools, and workflows. In this talk, speakers from Roche/Genentech will discuss the ways R is used at different stages of the pharma lifecycle, from research and discovery through to development and market access.

Part 1: Enabling collaborative software development with inner and open-source

Michael Lawrence – Scientist, Genentech Research and Early Development

Adapting to the rapidly changing requirements of science requires collaborative software development across the enterprise, industry and field. The principles of open and inner source provide an effective model of collaboration that breaks down silos through decentralized, bottom-up participation. We will present a framework for accommodating grassroots development within a traditional corporate structure.

Part 2: Self-service statistics with Shiny

Sebastian Wolf – Scientific Software Developer, Roche Diagnostics

QA processes at Roche Diagnostics depend on statistical evaluations. Because these evaluations follow standard operating procedures, the biostatistics department decided to enable users to perform them on their own. The R Shiny app bioWARP brings standard procedures such as linear regression and equivalence tests to people who cannot code in R. It saves them the time they would otherwise have spent consulting a biostatistician or using validated Excel sheets.

Part 3: Analyzing Clinical Trials Data using R for Decision Making and Regulatory Submissions

Adrian Waddell – Data Analyst, Roche Pharmaceutical

Using R to generate the clinical trial analysis results included in a submission to health authorities requires a complete ecosystem of tools, processes, and environments. At Roche, we are currently developing such an ecosystem, and in this presentation I will introduce some of its components. These include a set of R-based tools to create tables, listings, and graphs (TLGs) ad hoc, and a framework for building custom interactive web applications with Shiny for exploratory analysis, which supports dynamic subsetting and makes the TLGs reproducible. Additionally, I will show a proof-of-concept project for streamlining the creation of datasets and TLGs for health authority submissions.

Part 4: Building a data science team to enable Personalised Healthcare (PHC)

James Black – Associate Director, Personalised Healthcare Data Science, Roche Pharmaceutical

Roche/Genentech built a suite of interlocking packages, modelled on the Tidyverse, to abstract infrastructure and repetition away from our analyses, along with a common philosophy that underpins this ecosystem. We will also talk about how we integrated these packages into a workflow that included metadata collection, version-controlled code, and a unified online results portal. Finally, we will share our experiences, both positive and negative, of introducing this new workflow and the shift in culture and competency it brought with it.
