This project forked from alteryx/open_source_demos

Predict what a customer will buy next based on purchase history using automated feature engineering

License: BSD 3-Clause "New" or "Revised" License

Jupyter Notebook 87.85% Python 12.15%

predict-next-purchase's Introduction

Predicting a customer's next purchase using automated feature engineering

As customers use your product, they leave behind a trail of behaviors that indicate how they will act in the future. Through automated feature engineering we can identify the predictive patterns in granular customer behavioral data that can be used to improve the customer's experience and generate additional revenue for your business.

In this tutorial, we show how Featuretools can be used to perform feature engineering on a multi-table dataset of 3 million online grocery orders provided by Instacart, and then train a machine learning model to predict which product a customer will buy next.
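
To make the idea of feature engineering on multi-table data concrete, here is a minimal hand-written sketch in plain Python. It is not the Featuretools API and not the tutorial's code: the two toy tables, the `customer_features` helper, and the feature names (loosely modeled on how aggregation features are usually named) are all assumptions made for illustration. It shows the kind of stacked aggregation that automated feature engineering is meant to generate for you.

```python
from collections import defaultdict
from statistics import mean

# Toy stand-ins for two related tables (hypothetical data, not Instacart's schema).
orders = [
    {"order_id": 1, "user_id": "a"},
    {"order_id": 2, "user_id": "a"},
    {"order_id": 3, "user_id": "b"},
]
order_products = [
    {"order_id": 1, "product": "banana"},
    {"order_id": 1, "product": "milk"},
    {"order_id": 2, "product": "banana"},
    {"order_id": 3, "product": "bread"},
]


def customer_features(orders, order_products):
    """Aggregate child rows up to one feature row per customer."""
    # First level: count the products in each order.
    items_per_order = defaultdict(int)
    for row in order_products:
        items_per_order[row["order_id"]] += 1

    # Group order ids by the customer who placed them.
    orders_by_user = defaultdict(list)
    for row in orders:
        orders_by_user[row["user_id"]].append(row["order_id"])

    # Second level: aggregate the per-order counts up to each customer.
    features = {}
    for user_id, order_ids in orders_by_user.items():
        basket_sizes = [items_per_order[oid] for oid in order_ids]
        features[user_id] = {
            "COUNT(orders)": len(order_ids),
            "MEAN(orders.COUNT(order_products))": mean(basket_sizes),
        }
    return features


features = customer_features(orders, order_products)
```

Writing each such aggregation by hand does not scale to hundreds of features across several tables, which is the gap automated feature engineering aims to fill.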

Note: If you are running this notebook yourself, refer to the README on GitHub for instructions to download the Instacart dataset.

Highlights

  • We automatically generate 150+ features using Deep Feature Synthesis and select the 20 most important features for predictive modeling
  • We build a pipeline that can be reused for numerous prediction problems (you can try this yourself!)
  • We quickly develop a model on a subset of the data and validate on the entire dataset in a scalable manner using Dask.
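
The feature selection step in the first highlight can be sketched as follows. This README does not say how the tutorial ranks features, so the approach here (sorting by a per-feature importance score, such as one reported by a tree-based model) is an assumption, and the `top_k_features` helper, the feature names, and the scores are hypothetical.

```python
def top_k_features(importances, k=20):
    """Return the names of the k features with the highest importance."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical importance scores, for illustration only.
importances = {
    "COUNT(orders)": 0.40,
    "MEAN(orders.COUNT(order_products))": 0.35,
    "NUM_UNIQUE(order_products.product)": 0.20,
    "DAY(first_order_time)": 0.05,
}

selected = top_k_features(importances, k=2)
```

If fewer than `k` features are available, the helper simply returns all of them in ranked order.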

Read the tutorial

Link to notebook: Tutorial

Running the tutorial

  1. Clone the repo
     git clone https://github.com/Featuretools/predict_next_purchase.git
  2. Install the requirements
     pip install -r requirements.txt
  3. Download the data

You can download the data directly from Instacart here.

After downloading the data, save the CSVs to a directory called data in the root of this repository. Then run the following command in your terminal from the root of this repo.

>> python process_data.py
 70%|██████████████████████████▌           | 145/207 [07:43<03:18,  3.20s/it]

Expect this command to take up to 20 minutes to run as it prepares the data for the tutorial notebook.
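
Because the processing step is slow, it may be worth confirming that the CSVs are in the data directory before starting it. The sketch below is not part of this repository: the `missing_csvs` helper is hypothetical, and the file names in `EXPECTED` are assumptions that should be adjusted to match your download.

```python
from pathlib import Path


def missing_csvs(data_dir, expected_names):
    """Return the expected file names that are not present in data_dir."""
    data_path = Path(data_dir)
    return [name for name in expected_names if not (data_path / name).is_file()]


# Assumed file names; adjust to match what your download actually contains.
EXPECTED = ["orders.csv", "products.csv"]

if __name__ == "__main__":
    missing = missing_csvs("data", EXPECTED)
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("All expected CSVs found.")
```

Run it from the root of the repo so that the relative path data resolves to the directory described above.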

Feature Labs

Featuretools was created by the developers at Feature Labs. If building impactful data science pipelines is important to you or your business, please get in touch.

predict-next-purchase's People

Contributors

bschreck, chriskaschner, kmax12, prateekmantha, realxujiang, seth-rothschild
