Juliet Hougland
35 days ago
Introduction to Machine Learning on Apache Spark MLlib
Cloudera
Machine Learning
Spark
Speaker: Juliet Hougland, Senior Data Scientist, Cloudera

Spark MLlib is a library for performing machine learning and associated tasks on massive datasets. With MLlib, fitting a machine-learning model to a billion observations can take only a few lines of code while drawing on hundreds of machines. This talk will demonstrate how to use Spark MLlib to fit an ML model that predicts which customers of a telecommunications company are likely to stop using its service. It will cover the use of Spark's DataFrames API for fast data manipulation, as well as ML Pipelines, which make the model development and refinement process easier.