This repository contains example code and sample data for the "An Introduction to Real-time Spark" session. Follow the steps below to clone the code and set up your machine.
- Java
- Maven 3
- Netcat
If you are on Linux, make sure the nc command is available. On Windows, install ncat. This tool is required for the socket examples.
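As a quick sanity check before building, the prerequisites above can be verified from a POSIX shell. This is only a minimal sketch: the `have` and `check_prereqs` helpers are hypothetical and not part of this repository, and the script assumes the tools are installed under their usual command names (`java`, `mvn`, and either `nc` or `ncat`). It only checks that the commands are on your PATH, not their versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository):
# succeeds if the given command is available on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Prints one status line per prerequisite.
# Returns non-zero if any prerequisite is missing.
check_prereqs() {
    missing=0
    for tool in java mvn; do
        if have "$tool"; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Either nc or ncat satisfies the netcat requirement.
    if have nc || have ncat; then
        echo "found:   netcat (nc or ncat)"
    else
        echo "missing: netcat (nc or ncat)"
        missing=1
    fi
    return "$missing"
}

check_prereqs || echo "Install the missing tools before building."
```

On Windows this would need a POSIX-style shell such as Git Bash; otherwise, checking each command by hand in the command prompt works just as well.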
git clone https://github.com/phatak-dev/introduction-to-spark-streaming
mvn clean install
On Linux, run nc as follows:
nc -l localhost 50050
Then run the following command from the code directory:
java -cp target/spark-streaming.jar com.madhukaraphatak.sparktraining.streaming.WordCount local[2] localhost 50050
On Windows, first find out your IP address. Then run ncat as follows:
ncat -l <ip-address> 50050
Then run the following command from the code directory:
java -cp target\spark-streaming.jar com.madhukaraphatak.sparktraining.streaming.WordCount local[2] <ip-address> 50050
You can run all the examples from the terminal. If you want to run them from an IDE, follow the steps below.
- IDEA 14
Install the Scala plugin. Once the plugin is loaded, you can import the code as a Maven project.
Please pull the latest code before coming to the session.