In February 2020, the Digital Proactive Process Analytics (DPPA) group within Merck’s manufacturing division officially launched into production a Shiny app that automates the creation of Continuous Process Verification (CPV) reports. That’s right – the almighty, mysterious, coveted production. From a technical perspective, the app is nothing particularly special (other than getting LaTeX successfully installed to support the use of R Markdown). Users enter a few parameters and out pops a PDF with a series of statistical analyses of a product’s quality testing data. The R blogosphere is filled with examples of similar Shiny apps.

What mattered was that the app was in production and, furthermore, that it was approved for GMP (Good Manufacturing Practice) use. This meant these reports could be submitted to the FDA and other regulatory agencies. This meant the data could be used to support product release decisions. This meant Merck’s engineers were about to save thousands of hours per year in compiling data, generating charts, and calculating summary statistics. This was the app manufacturing sites needed.

Most of the work in getting this app into production was not implementing the top-level features. Sorry, no discussion of fancy statistical process control methods here. Instead, this talk will discuss some of the many things the development team (none of whom came from a software development background) needed to learn in order to create a robust, secure, and maintainable production application.
