New training on Big Data Processing and Apache Spark

We are offering a new training course in which students will learn the basics of big data processing using Apache Spark, one of the most popular frameworks today. Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

The training covers basic data management with both resilient distributed datasets (RDDs) and DataFrames; real-time data ingestion with Spark Streaming; querying Spark data with Spark SQL; and basic machine learning with ML Pipelines.
