GithubHelp home page GithubHelp logo

mapr-demos / spark-sklearn-airbnb-predict Goto Github PK

View Code? Open in Web Editor NEW
44.0 11.0 34.0 376 KB

Code example to predict prices of Airbnb vacation rentals, using scikit-learn on Spark with spark-sklearn, on MapR.

Jupyter Notebook 99.23% Python 0.77%

spark-sklearn-airbnb-predict's Introduction

spark-sklearn-airbnb-predict

Code example to predict prices of Airbnb vacation rentals, using scikit-learn on Spark.

The Jupyter notebook in this repo contains examples to run regression estimators on the Inside Airbnb listings dataset from San Francisco. The target variable is the price of the listing. To speed up the hyperparameter search, the notebook shows examples that use the spark-sklearn package to distribute GridSearchCV across nodes in a Spark cluster. This provides a much faster way to search and can lead to better results.

To run the scikit-learn examples (without Spark) the following packages are required:

  • Python 2
  • Pandas
  • NumPy
  • scikit-learn (0.17 or later)

These can be installed on the MapR Sandbox.

To run the scikit-learn examples with Spark, the following packages are required on each machine:

  • All of the above packages
  • Spark (1.5 or later)
  • spark-sklearn -- follow the installation instructions there

You can run this on a MapR cluster by following one of these methods:

Run the script with:

MASTER=yarn-client /opt/mapr/spark/spark-1.5.2/bin/spark-submit --num-executors=4 --executor-cores=8 python_scikit_airbnb.py

(setting num-executors and executor-cores to suit your environment)

The file classify.py in this repo contains an example of classification on the same dataset, using reviews.csv and text analysis.

and of course... have fun!

spark-sklearn-airbnb-predict's People

Contributors

namato avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.