Data Science Essentials
2017-07-25

The Grammar and Graphics of Data Science

Data science is the process of turning data into understanding and actionable insight. Two key data science tools are data manipulation and visualization. Learn how you can easily munge data with dplyr (even if it’s still in a database), and create interactive visualizations with ggvis.

  • dplyr: a grammar of data manipulation – Hadley Wickham
  • ggvis: Interactive graphics in R – Winston Chang

Reproducible Reporting

It doesn’t matter how great your analysis is unless you can share it with others – easily. R Markdown and knitr make it easy to intermingle code and text to generate compelling reports and presentations that are never out of date. Combine R Markdown with packrat to ensure that your reports are reproducible day in and day out, no matter what other R packages you have installed.

  • The Next Generation of R Markdown – Jeff Allen
  • Knitr Ninja – Yihui Xie
  • Packrat – A Dependency Management System for R – J.J. Allaire & Kevin Ushey

Interactive Reporting

In a static report, you answer known questions. With a dynamic report, you give the reader the tools to answer their own questions. Get started by learning how to make your R Markdown documents interactive, and then unleash the full flexibility of analytic app development with shiny.

  • Embedding Shiny Apps in R Markdown documents – Garrett Grolemund
  • Shiny: R made interactive – Joe Cheng

Data Wrangling with R and RStudio

Before an R program can look for answers, your data must be cleaned up and converted to a form that makes information accessible. In this webinar, you will learn how to use the `dplyr` and `tidyr` packages to optimise the data wrangling process. You’ll learn to:

  • Spot the variables and observations within your data
  • Quickly derive new variables and observations to explore
  • Reshape your data into the layout that works best for R
  • Join multiple data sets together
  • Use group-wise summaries to explore hidden levels of information within your data
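The steps listed above can be sketched in a few lines of R. This is only an illustration, not material from the webinar: the `sales` and `regions` tables and the unit price are made up, and `pivot_longer()` is the current tidyr reshaping verb (older tidyr versions used `gather()` for the same job).

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: quarterly unit sales per store, in a wide layout.
sales <- tibble(
  store = c("A", "B", "C"),
  q1 = c(10, 20, 30),
  q2 = c(15, 25, 35)
)

# Hypothetical lookup table to join against.
regions <- tibble(
  store = c("A", "B", "C"),
  region = c("north", "north", "south")
)

# Reshape: wide -> long, so each row is one observation (store, quarter).
long <- pivot_longer(
  sales,
  cols = c(q1, q2),
  names_to = "quarter",
  values_to = "units"
)

by_region <- long %>%
  left_join(regions, by = "store") %>%     # join multiple data sets
  mutate(revenue = units * 2.5) %>%        # derive a new variable (assumed price of 2.5)
  group_by(region) %>%                     # group-wise summary
  summarise(total_revenue = sum(revenue), .groups = "drop")

print(by_region)
```

Each verb takes a data frame and returns a data frame, which is why the steps chain together with `%>%`.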

Getting your Data into R

You can’t use R for data analysis unless you can get your data into R. Getting your data into R can be a major hassle, so in the last few months Hadley has been working hard to make it easier.

In this webinar, Hadley will discuss the places you most often find data (databases, Excel, text files, other statistical packages, web APIs, and web pages) and the packages (DBI, xml2, jsonlite, haven, readr, readxl) that make it easy to get your data into R.
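As a rough sketch of two of these sources, the example below reads delimited text with readr and JSON with jsonlite. The inline strings are made-up stand-ins for a real file or web API response, and passing literal data via `I()` assumes readr 2.0 or later.

```r
library(readr)
library(jsonlite)

# Inline text stands in for a CSV file on disk; I() marks it as literal data.
csv_df <- read_csv(
  I("name,score\nada,90\ngrace,95\n"),
  show_col_types = FALSE
)

# fromJSON() also accepts a file path or URL; by default an array of
# objects is simplified to a data frame.
json_df <- fromJSON('[{"name":"ada","score":90},{"name":"grace","score":95}]')

print(csv_df)
print(json_df)
```

Both calls return a data frame, so data from either source can go through the same downstream analysis code.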