Modeling in the Tidyverse


R has extremely rich and diverse modeling capabilities. However, its many modeling packages expose a variety of interfaces and follow differing syntactic conventions, which makes them difficult to use alongside Hadley’s tidy data conventions. This talk will discuss the process of writing more modular, programming-friendly code for modeling activities. Using the `caret` package as an example, it will lay out a broad roadmap for the transition to more focused packages that use tidy ideas. It will also cover the concept of writing a modeling _specification_ that can be used with different compute engines.

About the speaker

Max Kuhn
Software Engineer & Data Scientist

Max was a nonclinical statistician for 12 years in the pharmaceutical industry and for 6 years in the medical diagnostic industry. His degrees are in Biostatistics (Ph.D.) and Mathematics (B.S.). He has released several R packages for predictive modeling and machine learning, including caret, C50, and Cubist. He is the author of the Springer book Applied Predictive Modeling (with Kjell Johnson), which won the American Statistical Association’s Ziegel Award for best book in 2014.