Instructions on how to run:
Run: spark-submit parse.py <search_term>
where <search_term> is the term you want to search for.
Make sure the small_pages directory is in your current working directory. To run on the cluster, use parse-big.py instead; it is set to read from the cluster's HDFS.
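The choice between the local and cluster scripts can be sketched as a small pre-flight helper. This is only an illustration: build_command is a hypothetical name and is not part of parse.py or parse-big.py. It assumes the local run needs a small_pages directory in the working directory, and that the cluster run needs no local data because it reads from HDFS.

```python
import os

def build_command(search_term, use_cluster=False, workdir="."):
    """Return the spark-submit argument list for one search (hypothetical helper)."""
    if use_cluster:
        # Assumption: parse-big.py reads from the cluster HDFS, so no local check.
        script = "parse-big.py"
    else:
        script = "parse.py"
        pages = os.path.join(workdir, "small_pages")
        # parse.py expects small_pages next to it in the working directory.
        if not os.path.isdir(pages):
            raise FileNotFoundError(
                "small_pages directory not found in " + os.path.abspath(workdir)
            )
    return ["spark-submit", script, search_term]
```

The returned list can be passed to subprocess.run to launch the job; failing early on a missing small_pages directory avoids waiting for Spark to start before finding out.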
The results are returned in a format suited to programmers. A demo of the web version, where the results will be more human-readable, will be shown at another time.