This project forked from pushkr/apache-spark-hands-on
Educational notes, hands-on problems with solutions, and interview questions for the Hadoop ecosystem
Shell 16.05%
Python 65.80%
Jupyter Notebook 9.13%
Scala 9.02%
apache-spark-hands-on's Introduction
/Flume
: contains notes and examples of Apache Flume
/Hive
: contains notes and examples of Apache Hive
/MySQL
: code samples to create a database, create a table, and load data in MySQL
/Sqoop
: contains notes and examples of import/export using Sqoop
/spark
: contains notes, documentation, and sample examples of Spark APIs
/exam
: sample CCA-175 exam questions and solutions (in the solution branch)
/problem1
- Complex data structure handling using Hive (exposure to Hive, CREATE TABLE, LOAD, named_struct, struct)
/problem2
- Stock data analysis (exposure to JSON file handling, Spark SQL, map, reduce, filter, join, groupByKey, keyBy, UDFs, etc.)
/problem3
- MovieLens database analysis
/problem4
- Lahman's baseball database analysis
/problem5
- Hortonworks certification sample, 10 tasks in total
/Tweeter
- Twitter data analysis
/problem6
- Retail database sample exercises
My answers to a few PySpark questions on Stack Overflow: Link