
User Lightning Talks

Gordon Shotwell
Upworthy

Automatically generating Makefiles from RStudio projects. Gordon wrote a basic package that detects the reads and writes of the various R scripts within an RStudio project and generates a corresponding Makefile. It’s a great way for R users to get started with Makefiles, and it has a lot of potential. The package is located here: https://github.com/GShotwell/easymake
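
As a rough illustration of the intended workflow, a call along these lines would scan a project and emit a Makefile. The function name and its (lack of) arguments are assumptions here, not confirmed easymake API:

```r
# Sketch of the intended workflow. The function name and its arguments
# are assumptions, not confirmed easymake API.
# remotes::install_github("GShotwell/easymake")
library(easymake)

# Scan the R scripts in the current RStudio project, detect which files
# each script reads and writes, and emit a corresponding Makefile.
easy_make()
```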

Karthik Ram
rOpenSci

The rOpenSci project was founded in response to the new challenges that scientists face with ever-increasing access to heterogeneous data. The project aims to develop open source software that lowers barriers to data-driven discovery and reproducibility for researchers.

Karthik’s team develops and maintains a large suite of R packages that play a key role in various parts of the (tidy and other) data life cycle. In this talk he’ll present one-minute vignettes on their five most interesting packages developed this year.

Ali Zaidi
Microsoft

The last year has seen phenomenal progress in the R APIs for Spark. RStudio’s sparkapi package and its associated dplyr backend, sparklyr, expose a far richer set of features for calling Spark packages from R than was possible with the original SparkR package. Moreover, Microsoft has released a new Spark API called RxSpark, which provides a highly performant set of machine learning algorithms and data-processing functions for use in a Spark application and can be used in conjunction with sparklyr and other Spark APIs.
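
For readers new to sparklyr, a minimal session looks roughly like this (assuming a local Spark installation, for example via `sparklyr::spark_install()`; the RxSpark side is omitted because it requires Microsoft R Server):

```r
# Minimal sparklyr sketch: connect to a local Spark instance and run
# dplyr verbs that are translated to Spark SQL.
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")

# Copy a small data frame into Spark and summarise it remotely.
mtcars_tbl <- copy_to(sc, mtcars, overwrite = TRUE)

mtcars_tbl %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg)) %>%
  collect()

spark_disconnect(sc)
```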

In this talk, Ali will discuss the lessons learned using R with Spark. In particular, he’ll show how to write reproducible documents with Spark that take advantage of R Markdown’s caching mechanism and RxSpark’s `persistentRun` feature, as well as how to develop Shiny applications that work reactively with Spark DataFrames and RxSpark algorithms.
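
As a rough illustration of the reactive pattern (using sparklyr rather than RxSpark and a toy dataset; this is not Ali’s actual code):

```r
# Illustrative only: a Shiny app whose output reacts to user input by
# filtering and summarising a Spark DataFrame through sparklyr.
# Assumes a local Spark installation; RxSpark is not shown here.
library(shiny)
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")
mtcars_tbl <- copy_to(sc, mtcars, overwrite = TRUE)

ui <- fluidPage(
  sliderInput("min_mpg", "Minimum mpg", min = 10, max = 35, value = 20),
  tableOutput("summary")
)

server <- function(input, output, session) {
  output$summary <- renderTable({
    mtcars_tbl %>%
      filter(mpg >= !!input$min_mpg) %>%   # executed by Spark, not R
      group_by(cyl) %>%
      summarise(n = n(), avg_hp = mean(hp)) %>%
      collect()
  })
}

shinyApp(ui, server)
# The Spark connection stays open for the lifetime of the app; call
# spark_disconnect(sc) when shutting down.
```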

Jonathan Sidi
Metrum Research Group

ggedit is a package that helps users bridge the gap between making a plot and getting all of those pesky plot aesthetics just right, while keeping everything portable for further research and collaboration. ggedit is powered by a Shiny gadget to which the user passes a ggplot object or a list of ggplot objects. The gadget populates shinyBS modals with all the elements found in each layer and theme of the ggplot objects. The user can then edit these elements and interact with the plot as changes occur.
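
A minimal sketch of that workflow, assuming the gadget is launched with `ggedit()` and that the edited plots are returned when it closes:

```r
# Minimal ggedit sketch: build a plot, then hand it to the gadget for
# interactive editing of layers and theme elements.
library(ggplot2)
library(ggedit)

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()

# Launches the Shiny gadget; the object returned when the gadget closes
# should contain the edited plot(s) and the updated layers/themes.
p_edited <- ggedit(p)
```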

Simon Jackson
University of Sydney

In this talk, Simon will present corrr, a package for exploring correlations in R. Working with correlations in R can be messy, and their traditional matrix format doesn’t play nicely within a tidyverse workflow. This is where corrr can help. The main functionality of corrr is to generate correlations in a specially formatted tibble, making it possible to explore them using tidyverse tools. A simple API supports routine tasks such as focusing on the correlations of certain variables against others, or rearranging and visualizing them.
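
A short sketch of the kind of pipeline this enables, using corrr’s verbs (`correlate()`, `focus()`, `rearrange()`, `shave()`, `rplot()`):

```r
# corrr in a tidyverse pipeline: correlations arrive as a tibble, so the
# usual verbs apply.
library(corrr)
library(dplyr)

mtcars %>%
  correlate() %>%          # correlations as a tibble, not a matrix
  focus(mpg, hp, wt) %>%   # keep correlations against selected variables
  arrange(desc(mpg))

mtcars %>%
  correlate() %>%
  rearrange() %>%          # order variables by correlation strength
  shave() %>%              # drop the redundant upper triangle
  rplot()                  # visualise what is left
```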

Aimee Gott
Mango Solutions

Analysis of literature data is a common challenge within the pharmaceutical industry, where such data is routinely used to compare treatments in development with those reported in published data. Much of the analysis is data manipulation and summarization that is straightforward in R alone, but it can be a long and tedious process because the unique features of each dataset mean it cannot be fully automated.

Shiny provides the means to simplify and accelerate this process through interactivity. Aimee will highlight some of the key features that could be incorporated into a single application, making R and Shiny vital tools for data processing.
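
As a generic illustration only (not the application from the talk), a minimal Shiny app in this spirit might let the analyst upload a dataset and choose how to summarise it:

```r
# Hypothetical sketch only, not the application from the talk: upload a
# CSV of literature data and summarise it interactively.
library(shiny)
library(dplyr)

ui <- fluidPage(
  fileInput("file", "Upload literature data (CSV)"),
  uiOutput("group_ui"),
  tableOutput("summary")
)

server <- function(input, output, session) {
  dat <- reactive({
    req(input$file)
    read.csv(input$file$datapath, stringsAsFactors = FALSE)
  })

  output$group_ui <- renderUI({
    selectInput("group", "Group by", choices = names(dat()))
  })

  output$summary <- renderTable({
    req(input$group)
    dat() %>%
      group_by(.data[[input$group]]) %>%
      summarise(n = n())
  })
}

shinyApp(ui, server)
```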

Elaine McVey
TransLoc

TransLoc has created an extensive company-wide analytics dashboard using R. The convenience of flexdashboard, combined with the power of R Markdown and Highcharts (via highcharter), has led to an easy-to-maintain, completely customizable, and graphically rich dashboard. Using Google BigQuery has helped the team bring together data from many disparate sources, both “big” data and hand-curated information, accessible directly from R. Elaine will describe why this technology stack has worked well for their use case.
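
A compressed sketch of the R side of such a stack; the flexdashboard layout itself lives in an R Markdown file, and the BigQuery project, dataset, table, and query below are placeholders rather than TransLoc’s actual setup:

```r
# Sketch of one dashboard chunk: query Google BigQuery through its DBI
# driver and chart the result with highcharter. Project, dataset, table,
# and query are placeholders.
library(DBI)
library(bigrquery)
library(highcharter)

con <- dbConnect(bigquery(), project = "my-gcp-project", dataset = "ops")

rides <- dbGetQuery(con, "
  SELECT ride_date, COUNT(*) AS trips
  FROM rides
  GROUP BY ride_date
  ORDER BY ride_date
")

hchart(rides, "line", hcaes(x = ride_date, y = trips))
```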

Ian Lyttle
Schneider Electric

With the advent of Shiny Modules, developers can create and support apps with more components and more complexity. One of the limiting factors is that a tabsetPanel gives the UI only one “dimension” of layers. This was the motivation to develop a second, independent “dimension” of layers in an “accordion-sidebar” framework. This is one of the function families provided in the bsplus package.

Another aspect of the bsplus package is that it lets you compose HTML using pipes. The functions are designed to help you access Bootstrap JavaScript components using only HTML markup. They cover collapsible panels, accordions, tooltips, popovers, and modals.
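
To give a flavour of the pipe-based composition (the accordion and tooltip helpers below follow bsplus’s naming scheme, but check the exact signatures against the package documentation):

```r
# bsplus sketch: Bootstrap components composed with pipes. The helpers
# below follow bsplus's naming scheme (bs_accordion(), bs_append(),
# bs_embed_tooltip()); check signatures against the package docs.
library(shiny)
library(bsplus)

# An accordion with two collapsible panels.
acc <- bs_accordion(id = "talk_accordion") %>%
  bs_append(title = "Abstract", content = "Lightning talks, five minutes each.") %>%
  bs_append(title = "Speaker", content = "Ian Lyttle, Schneider Electric")

# A button that carries a Bootstrap tooltip, attached through HTML only.
btn <- actionButton("go", "Run") %>%
  bs_embed_tooltip(title = "Click to run the analysis")
```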

Daniel Levy
MyHeritage.com

In this talk Daniel will go into the way MyHeritage.com, and specifically its science team, uses R. They like R for the convenience of its data manipulation and analysis, but sometimes have to turn to other languages to speed things up (Julia), run software pipelines (Bash), or simply experiment with open source code that can be in practically any language. Ultimately, though, MyHeritage.com always ends things with Shiny. Daniel will give some highlights of how they move in and out of R to keep things as comfortable as possible for the science team, while giving the rest of the company visible results through Shiny.

Da Zheng
Johns Hopkins University

In the era of big data, R is rapidly becoming one of the most popular tools for data analysis, but the R framework is relatively slow and unable to scale to large datasets. The usual approach to speeding up an R implementation is to write the algorithms in C or FORTRAN and provide an R wrapper, and there have been many efforts to parallelize R and scale it to large datasets.

In this talk, Da will show how their R package, FlashR, extends the R framework with automatic parallelization and out-of-core execution, as well as extreme efficiency for big data. The goal is to hide the complexity of parallelization and out-of-core execution completely from users, and to parallelize existing R code and scale it to large datasets automatically. FlashR is implemented as an R package and is released as open source (http://flashx.io/).
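
A rough sketch of the programming model as described; the fm.-prefixed constructor name below is an assumption rather than verified FlashR API, and the point is only that ordinary R matrix syntax is reused:

```r
# Hedged sketch of the FlashR programming model: create FlashR matrices,
# then use ordinary R matrix syntax, which the package evaluates in
# parallel and out of core. The constructor name below is an assumption,
# not verified FlashR API; see http://flashx.io/ for the real interface.
library(FlashR)

x <- fm.runif.matrix(1e6, 10)   # a large FlashR matrix (assumed constructor)
gram <- t(x) %*% x              # standard R syntax, executed by FlashR
col_totals <- colSums(x)
```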