Advanced Data Science 2017-07-25T15:27:24+00:00

Collaboration and time travel: version control with Git, GitHub, and RStudio.

Understanding Git and GitHub gives you two data science superpowers:

  • Collaboration: With Git and GitHub, you can easily work with others. You no longer have to email files back and forth, or fight over who’s editing which file in Dropbox. Instead, you can work independently, and trust Git to combine (aka merge) your work.
  • Time travel: Git allows you to go back in time to before you made that horrific mistake. You can replay history to see exactly what you did, and track a bug back to the moment of its creation. Git even allows you to do the code equivalent of travelling back in time to kill your own grandfather!
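
As a rough illustration of both superpowers, here is a minimal sketch in a throwaway repository. The file names, branch name, commit messages, and user identity are all made up for the demo. It simulates two collaborators editing different files, lets Git merge their work, and then reads a file as it looked in an earlier commit.

```shell
#!/bin/sh
# Sketch only: every name below (files, branch, identity) is hypothetical.
set -eu

demo_dir=$(mktemp -d)                    # throwaway directory for the demo repo
cd "$demo_dir"

git init -q .
git config user.name "Demo User"         # local identity so commits work anywhere
git config user.email "demo@example.com"

# First commit: a small analysis script.
echo 'x <- 1' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"
first_commit=$(git rev-parse HEAD)
main_branch=$(git rev-parse --abbrev-ref HEAD)   # "master" or "main", depending on config

# Collaborator A works on a branch and adds a plotting script.
git checkout -q -b add-plot
echo 'plot(1:10)' > plot.R
git add plot.R
git commit -q -m "Add plot script"

# Meanwhile, collaborator B changes the analysis on the main branch.
git checkout -q "$main_branch"
echo 'x <- 2' > analysis.R
git commit -q -am "Update analysis"

# Collaboration: Git combines both lines of work into one history.
git merge -q --no-edit add-plot

# Time travel: list the history, then read the file as of the first commit.
git --no-pager log --oneline
old_version=$(git show "$first_commit:analysis.R")
new_version=$(cat analysis.R)
echo "then: $old_version / now: $new_version"
```

Because the two collaborators touched different files, the merge here completes without conflicts. When both sides edit the same lines, Git stops and asks you to resolve the conflict by hand.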
Materials
Github

Managing package dependencies in R with Packrat

Have you ever had to use trial and error to figure out which R packages you need to install to make someone else’s code work, and then been left with those packages globally installed forever because you’re no longer sure whether you need them? With Packrat, your R projects will be:

  • Isolated: Installing a new or updated package for one project won’t break your other projects, and vice versa. That’s because Packrat gives each project its own private package library.
  • Portable: Easily transport your projects from one computer to another, even across different platforms. Packrat makes it easy to install the packages your project depends on.
  • Reproducible: Packrat records the exact package versions you depend on, and ensures those exact versions are the ones that get installed wherever you go.
Materials
Github

Creating JavaScript data visualizations in R

HTMLwidgets are always hosted within an R package and include all of the source code for their dependencies. This is to ensure that code which depends on widgets doesn’t require an internet connection or the ongoing availability of an internet service to run.

Leaflet and DataTables are two fantastic packages that take advantage of the htmlwidgets framework. Using them as examples, this webinar will show you how to use JavaScript visualization libraries at the R console, just like plots. We will also embed widgets in an R Markdown document and a Shiny web application. If we have time, we will also show how easy it is to create your own htmlwidgets by following a small set of conventions.

Understanding htmlwidgets and how you can leverage packages like Leaflet and DataTables will help you create stunning visualizations that are interactive and compelling and, best of all, require very little code.