    darshna
    Sep 28, 2017
      ·  Edited: Sep 29, 2017

    EARL Highlights!


    Around two weeks ago the Elastacloud data science team attended the EARL conference in London. The first day consisted of several workshops (Spark and R with sparklyr, web scraping and text analysis in R, writing R functions for fun and profit, introduction to Shiny, working with GitHub, working with the MicrosoftML package). Most of us attended Spark and R with sparklyr in the morning and working with the MicrosoftML package in the afternoon. The next two days consisted of simultaneous streams of talks on a wide variety of R applications. Some of my personal highlights included: the keynote talk by Jenny Bryan on workflows, Dr Joy McKenny’s talk on using R to monitor sewer network performance for the water industry, Simon Field’s talk on how to develop data scientist super powers, and last but by no means least the sparklyr workshop. Some highlights and code snippets of the sparklyr workshop are illustrated below.


    R computation is single-threaded and memory-bound. sparklyr is an R interface to Apache Spark. As the examples below show, sparklyr translates readable dplyr code into Spark SQL queries, so that one can explore and wrangle large volumes of data, and it exposes Spark's MLlib to help the user develop machine learning models.
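
To make the dplyr-to-SQL translation and the MLlib interface concrete, here is a minimal sketch rather than the workshop's own material. It assumes a local Spark installation (for example one set up with `spark_install()`) and uses the built-in `mtcars` data set as a stand-in for real cluster data; the table name and the choice of model are purely illustrative.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally, e.g. via spark_install()
sc <- spark_connect(master = "local")

# Copy a small built-in data set into Spark (illustrative only;
# real data would normally already live in the cluster)
mtcars_tbl <- copy_to(sc, mtcars, "mtcars", overwrite = TRUE)

# Ordinary dplyr verbs; they are translated lazily and nothing runs
# in Spark until the result is requested
mpg_by_cyl <- mtcars_tbl %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE), n = n())

# Print the Spark SQL generated from the dplyr pipeline
show_query(mpg_by_cyl)

# Execute the query in Spark and bring the small summary back into R
result <- collect(mpg_by_cyl)

# Fit a linear regression with Spark MLlib through sparklyr
fit <- ml_linear_regression(mtcars_tbl, mpg ~ wt + cyl)
summary(fit)

spark_disconnect(sc)
```

The design point is that only the small summary crosses from Spark into R with `collect()`; the filtering and aggregation run inside Spark, which is what sidesteps R's memory limits.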




    One of the latest features of sparklyr 0.6 allows the user to run arbitrary R code on all nodes, as illustrated below.
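
The function that provides this in sparklyr is `spark_apply()`. The sketch below is a minimal illustration under a few assumptions, not the example from the workshop: it connects to a local Spark instance, builds a toy table with `sdf_len()`, and applies a hypothetical squaring function to each partition. On a real cluster, R must also be installed on every worker node.

```r
library(sparklyr)

# Assumption: Spark is installed locally, e.g. via spark_install()
sc <- spark_connect(master = "local")

# Toy Spark DataFrame with one column, id = 1..10, split into 2 partitions
numbers_tbl <- sdf_len(sc, 10, repartition = 2)

# spark_apply() passes each partition to the function as an ordinary
# R data frame; the function must return a data frame
squared_tbl <- spark_apply(numbers_tbl, function(df) {
  data.frame(id = df$id, squared = df$id^2)
})

# Bring the result back into the local R session
result <- dplyr::collect(squared_tbl)

spark_disconnect(sc)
```

Because the function runs once per partition, it only ever sees the rows of that partition, so anything that needs the whole data set (a global mean, say) has to be computed in Spark first.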


    We had a great time bonding, learning and seeing what great things the R community has produced in various industries.



