Webinar: Parallelize R Code Using Apache® Spark™ on August 15th, 2017

R is the latest language added to Apache Spark, and the SparkR API differs slightly from PySpark. SparkR's evolving interface to Apache Spark offers a wide range of APIs and capabilities to data scientists and statisticians. Since the release of Spark 2.0, the R API has officially supported executing user code on distributed data, primarily through a family of apply() functions.

In this Data Science Central webinar, we will:
• Provide an overview of this new functionality in SparkR
• Show how to use this API by making a few changes to regular code that uses apply()
• Focus on how to use this API correctly to parallelize existing R packages
• Consider performance and examine correctness when using the apply() family of functions in SparkR

Speaker:
Hossein Falaki, Software Engineer — Databricks Inc.

Hosted by: Bill Vorhies, Editorial Director — Data Science Central
Title: Parallelize R Code Using Apache® Spark™
Date: Tuesday, August 15th, 2017
Time: 9:00 AM – 10:00 AM PDT

http://newsletter.datasciencecentral.com/click.html?x=a62e&lc=NaT&mc=j&s=sCt&u=F&y=W&

Paul Bunting • Architecture Team Manager and Senior IT Architect

https://softwarearchitect.blog/ • Experienced Team Manager and Senior IT Architect with 25+ years of demonstrated experience developing solutions for the manufacturing and architectural design industries.
