RStudio is becoming Posit in October. Learn more at posit.co
RStudio is becoming Posit in October.
Learn more at posit.co
The premier IDE for R
RStudio anywhere using a web browser
Put Shiny applications online
Shiny, R Markdown, Tidyverse and more
Next level training for you and your team
Do, share, teach and learn data science
An easy way to access R packages
Let us host your Shiny applications
A single home for R & Python Data Science Teams
Scale, develop, and collaborate across R & Python
Easily share your insights
Control and distribute packages
RStudio Public Package Manager
RStudio Package Manager
Datasets in Reproducible Research with 'pins'
February 4, 2020
Open source code is an essential piece in making science reproducible. Tools like 'rmarkdown' and GitHub facilitate running and sharing outcomes with colleagues and with the broad scientific community at large. However, it is less clear what tools should be used to retrieve, store and share datasets; while it is possible to make datasets part of your workflows today, it is usually hard and we are often left with manually sharing or downloading links to datasets. Not only that, but it's also hard to share or discover datasets. In this talk we will introduce for the first time the 'pins' package. A package designed to: pin, discover and share resources. Meaning that, you can use 'pins' to simplify your data science workflows by easily fetching resources from GitHub, Kaggle, CRAN and RStudio Connect. We will present a 'pin' as a generic resource that can contain tabular datasets like CSVs, unstructured data like JSON files, image archives as ZIP files and so on. This talk will be highly interactive showing you how to get started by installing 'pins' from CRAN, retrieve and cache resources, share and discover useful and fun data resources to improve and enhance your day-to-day workflows.
A 5 minute presentation in our Lightning Talks series
Javier is the author of “Mastering Spark with R”, pins, sparklyr, mlflow and torch. He holds a double degree in Math and Software Engineer and decades of industry experience with a focus on data analysis. Javier is currently working on a project of his own; and previously worked in RStudio, Microsoft Research and SAP.