January 21, 2021
In order to truly collaborate in multi-disciplinary data-driven teams, one needs to consider how to collaborate beyond R.
January 21, 2021
Through a community survey conducted over the summer, the RStudio tidymodels team learned that users felt the
January 21, 2021
How can I get buy-in from business partners to use more advanced techniques? What can I do to make a data project involving several teams more efficient? And how can I train analysts who do not (yet) have access to sensitive data?
January 21, 2021
Categorical embeddings are a relatively new method, drawing on techniques popularized in Natural Language Processing, that help models solve this problem and can help you understand more about the categories themselves.
January 21, 2021
In recent years, numerous highly publicized failures in data science have made it evident that biases or fairness issues in training data can sneak into, and be magnified by, our models, leading to harmful, incorrect predictions once the models are deployed in the real world.
January 21, 2021
tidymodels is a collection of packages for modeling using a tidy interface.
January 21, 2021
Have you ever cut an A/B test short? Maybe because of traffic constraints, your antsy boss, or early successful results.
February 12, 2020
Many models have structural parameters that cannot be directly estimated from the data. These tuning parameters can have a significant effect on model performance and require some mechanism for...
January 31, 2020
Often a machine learning research project starts with brainstorming, continues to one-off scripts while an idea forms, and finally, a package is written to disseminate the product.
January 31, 2020
Longitudinal data (or panel data) arise when observations are recorded on the same individuals at multiple points in time.
January 31, 2020
Azure Machine Learning service (Azure ML) is Microsoft’s cloud-based machine learning platform that enables data scientists and their teams to carry out end-to-end machine learning workflows at scale.
January 25, 2019
In deep learning with Keras and TensorFlow today, once you've mastered the basics and are ready to dive into more involved applications (such as generative networks, sequence-to-sequence or...
January 25, 2019
Uncertainty is a key component of statistical inference. However, uncertainty is not easy to convey effectively in data visualizations. For example, viewers have a tendency to...
January 25, 2019
The R objects used to represent model fits are notoriously inconsistent, making data analysis inconvenient and frustrating. The broom package resolves this issue by defining a consistent way to...
January 24, 2019
parsnip is a new tidymodels package that generalizes model interfaces across packages. The idea is to have a single function interface for specific types of models (e.g. logistic regression) that...