This repository tracks my progress through the Udemy course Apache Spark for Java Developers, created by Richard Chesterwood and Matt Greencroft. During the course I am trying to learn as much as possible about Apache Spark, including:
- what are RDDs and how to use them
- how to use reduce, map, flatMap, filter, ...
- how to work with Tuples in Java
- working with the file system in Apache Spark
- how to use, and how not to use, sorts and coalesce
- how to properly work with joins
- (Exercise) Tag Generator based on occurrences
- (Project) Ranking Video Training Courses
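
To give a feel for the Tag Generator exercise, here is a minimal sketch of the idea: count how often each word occurs and keep the most frequent ones as tags. It is an assumption-laden illustration, not the course solution. It uses plain `java.util.stream` instead of Spark so that it runs without any dependencies, and the class name `TagGenerator`, the method `topTags`, the sample lines, and the "boring words" set are all hypothetical. The stream steps mirror the Spark operations listed above (map, flatMap, filter, a per-key count, then a sort).

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: plain Java streams standing in for Spark RDD operations.
class TagGenerator {

    // Returns the n most frequent words in the given lines, skipping "boring" words.
    static List<String> topTags(List<String> lines, Set<String> boringWords, int n) {
        Map<String, Long> counts = lines.stream()
                // map: keep only letters and whitespace, lower-cased
                .map(line -> line.replaceAll("[^a-zA-Z\\s]", "").toLowerCase())
                // flatMap: one line becomes many words
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                // filter: drop empty strings and boring words
                .filter(word -> !word.isEmpty() && !boringWords.contains(word))
                // count per word (in Spark this would be a pair RDD plus reduceByKey)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // Sort by count (highest first), breaking ties alphabetically.
        Comparator<Map.Entry<String, Long>> byCountDesc =
                Map.Entry.comparingByValue(Comparator.reverseOrder());
        Comparator<Map.Entry<String, Long>> order =
                byCountDesc.thenComparing(Map.Entry.comparingByKey());

        return counts.entrySet().stream()
                .sorted(order)
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Spark makes big data simple",
                "Java and Spark work well together",
                "The Spark API for Java");
        Set<String> boring = Set.of("the", "and", "for");
        System.out.println(topTags(lines, boring, 2)); // prints [spark, java]
    }
}
```

In Spark the same pipeline would run on an RDD loaded from a file and be distributed across partitions; the sketch only shows the shape of the transformation chain.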