Modeling

With more learning occurring virtually or in hybrid mode, hands-on ways to remotely teach DS are invaluable.

rstudio::global 2021

January 21, 2021

In order to truly collaborate in multi-disciplinary data-driven teams, one needs to consider how to collaborate beyond R.

tidymodels/stacks, Or, In Preparation for Pesto: A Grammar for Stacked Ensemble Modeling

rstudio::global 2021

January 21, 2021

Through a community survey conducted over the summer, the RStudio tidymodels team learned that users felt the...

How I became a Data Composer – examples of simulated datasets that bring value to a data-driven company

rstudio::global 2021

January 21, 2021

How can I get the buy-in from business partners to use more advanced techniques? What can I do to make a data project involving several teams more efficient? And how can I train analysts who do not (yet) have access to sensitive data?

Categorical Embeddings: New Ways to Simplify Complex Data

rstudio::global 2021

January 21, 2021

Categorical embeddings are a relatively new method, drawing on techniques popularized in natural language processing, that help models solve this problem and can help you understand more about the categories themselves.

Fairness and Data Science: Failures, Factors, and Futures

rstudio::global 2021

January 21, 2021

In recent years, numerous highly publicized failures in data science have made evident that biases or issues of fairness in training data can sneak into, and be magnified by, our models, leading to harmful, incorrect predictions being made once the models are deployed into the real world.

What's new in tidymodels?

rstudio::global 2021

January 21, 2021

tidymodels is a collection of packages for modeling using a tidy interface.

Using R to Up Your Experimentation Game

rstudio::global 2021

January 21, 2021

Have you ever cut an A/B test short? Maybe because of traffic constraints, your antsy boss, or early successful results.

Total Tidy Tuning Techniques

rstudio::conf 2020

February 12, 2020

Many models have structural parameters that cannot be directly estimated from the data. These tuning parameters can have a significant effect on model performance and require some mechanism for...

Stochastic Block Models with R: Statistically rigorous clustering with rigorous code

rstudio::conf 2020

January 31, 2020

Often a machine learning research project starts with brainstorming, continues to one-off scripts while an idea forms, and finally, a package is written to disseminate the product.

Neural Networks for Longitudinal Data Analysis

rstudio::conf 2020

January 31, 2020

Longitudinal data (or panel data) arise when observations are recorded on the same individuals at multiple points in time.

MLOps for R with Azure Machine Learning

rstudio::conf 2020

January 31, 2020

Azure Machine Learning service (Azure ML) is Microsoft’s cloud-based machine learning platform that enables data scientists and their teams to carry out end-to-end machine learning workflows at scale.

Why TensorFlow eager execution matters

rstudio::conf 2019

January 25, 2019

In current deep learning with Keras and TensorFlow, when you've mastered the basics and are ready to dive into more involved applications (such as generative networks, sequence-to-sequence or...

Visualizing uncertainty with hypothetical outcomes plots

rstudio::conf 2019

January 25, 2019

Uncertainty is a key component of statistical inference. However, uncertainty is not easy to convey effectively in data visualizations. For example, viewers have a tendency to...

Solving the model representation problem with broom

rstudio::conf 2019

January 25, 2019

The R objects used to represent model fits are notoriously inconsistent, making data analysis inconvenient and frustrating. The broom package resolves this issue by defining a consistent way to...

parsnip: A tidy model interface

rstudio::conf 2019

January 24, 2019

parsnip is a new tidymodels package that generalizes model interfaces across packages. The idea is to have a single function interface for types of specific models (e.g. logistic regression) that...