This project forked from edersoncorbari/movie-rec

Movie Recommendation System Using Spark ML, Akka and Cassandra

Home Page: https://edersoncorbari.github.io/tutorials/building-spark-ml-recommendation-system

Movie Rec

A simple demo of a movie recommendation system for big data, built to scale with Spark ML (machine learning), Cassandra, and Akka.

Synopsis

This is a study project that applies the Spark ML collaborative filtering model. The system exposes a REST API with two endpoints: the first trains the model, and the second returns a list of movie recommendations for a user, identified by user ID.

More detailed information can be found from the sites below:

Architecture

The project architecture uses Akka, Spark, and Cassandra. All three components can run in a distributed way.

Data Model

The keyspace is called movies. The data in Cassandra is modeled as follows:

Organization:

| Collection | Comments |
| --- | --- |
| movies.uitem | Available movies (1682 rows in the dataset used). |
| movies.udata | Movies rated by each user (100000 rows in the dataset used). |
| movies.uresult | Where the data calculated by the model is saved. Empty by default. |

Rest Server End-Points

The end-points available on the Rest Server are:

| Method | End-Point | Comments |
| --- | --- | --- |
| POST | /movie-model-train | Trains the model. |
| GET | /movie-get-recommendation/{ID} | Lists the movies recommended for a user. |
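
As an illustration of how a JVM client might call these two end-points, here is a minimal sketch using the JDK 11+ `java.net.http` client. It is not part of the project: the object name `MovieRecClient` and its helper methods are hypothetical, and the base URL `http://localhost:8080` is assumed from the curl examples later in this README.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

// Hypothetical client helper, not part of the movie-rec project.
object MovieRecClient {
  // Assumption: the server listens on localhost:8080, as in the curl examples.
  val baseUrl = "http://localhost:8080"

  // POST /movie-model-train: asks the server to start training the model.
  def trainRequest(): HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$baseUrl/movie-model-train"))
      .POST(HttpRequest.BodyPublishers.noBody())
      .build()

  // GET /movie-get-recommendation/{ID}: asks for one user's recommendations.
  def recommendationRequest(userId: Int): HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$baseUrl/movie-get-recommendation/$userId"))
      .GET()
      .build()

  // Sends a request and returns the raw response body (JSON text).
  def send(request: HttpRequest): String =
    HttpClient.newHttpClient()
      .send(request, HttpResponse.BodyHandlers.ofString())
      .body()
}
```

With the server running, `MovieRecClient.send(MovieRecClient.trainRequest())` would trigger training, and `MovieRecClient.send(MovieRecClient.recommendationRequest(1))` would fetch the recommendations for user 1.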

Quick start

You need SBT installed on your machine and a Docker container for Cassandra.

1. Get the code

Run the commands below to clone the project:

$ git clone https://github.com/edersoncorbari/movie-rec.git
$ cd movie-rec

2. Running and Configuring Cassandra in Docker

The Cassandra version used was 3.11.4. Later versions will probably work as well.

$ docker pull cassandra:3.11.4

Create and run the container. The application is already configured to use these settings.

$ docker run --name cassandra-movie-rec -p 127.0.0.1:9042:9042 -p 127.0.0.1:9160:9160 -d cassandra:3.11.4

Make sure it's up and running.

$ docker ps | grep cassandra

You can also check it like this:

$ docker exec -it cassandra-movie-rec uname -a

The output should look similar to this (the container ID and kernel version will differ on your machine):

Linux 883a6daf0c2d 5.0.0-23-generic #24~18.04.1-Ubuntu SMP Mon Jul 29 16:12:28 UTC 2019 x86_64 GNU/Linux
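
Another way to check, from the JVM side, is to test whether the CQL port published by the `docker run` command (127.0.0.1:9042) accepts TCP connections. The sketch below is only an illustration: the object name `CassandraPortCheck` and the 2-second timeout are assumptions, and an open port only shows that something is listening, not that Cassandra has finished starting.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

// Hypothetical helper, not part of the movie-rec project.
object CassandraPortCheck {
  // Returns true if a TCP connection to host:port succeeds within timeoutMs.
  def isPortOpen(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
    val socket = new Socket()
    val connected =
      Try(socket.connect(new InetSocketAddress(host, port), timeoutMs)).isSuccess
    Try(socket.close()) // best-effort cleanup; ignore errors on close
    connected
  }

  def main(args: Array[String]): Unit = {
    // 127.0.0.1:9042 is the CQL port mapping used in the docker run command.
    val up = isPortOpen("127.0.0.1", 9042)
    println(if (up) "Port 9042 is reachable" else "Port 9042 is not reachable yet")
  }
}
```

If the port is not reachable right after the container starts, waiting a little and retrying is usually enough, since Cassandra takes some time to begin accepting connections.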

The project directory (movie-rec) contains datasets already prepared for loading into Cassandra. Copy and extract them into the container:

$ cat dataset/ml-100k.tar.gz | docker exec -i cassandra-movie-rec tar zxvf - -C /tmp

Creating the schema and loading the datasets:

$ docker exec -it cassandra-movie-rec cqlsh -f /tmp/ml-100k/schema.cql

3. Verifying the data

Enter the Cassandra console using CQLSH and verify the data:

$ docker exec -it cassandra-movie-rec cqlsh

The syntax is similar to good old SQL:

cqlsh> use movies;
cqlsh:movies> select count(1) from uitem; -- Must be: 1682
cqlsh:movies> select count(1) from udata; -- Must be: 100000
cqlsh:movies> describe uresult;

4. Running the Project

Before running the project, it is important to set the Spark variable:

$ export SPARK_LOCAL_IP="127.0.0.1"

Enter the project root folder and run the command below. On the first run, SBT will download the necessary dependencies.

$ sbt update compile test run

Rock and roll! Akka HTTP is now running with Spark.

Note: You can use the curl command directly, but curljson pretty-prints the JSON response.

Now, in another terminal, run the command to train the model:

$ curljson -XPOST http://localhost:8080/movie-model-train

The answer should be:

{
  "msg": "Training started..."
}

This will start the model training. You can then run the command to see results with recommendations. Example:

$ curljson -XGET http://localhost:8080/movie-get-recommendation/1

Note: The number at the end is the user's ID. You can look up other IDs in Cassandra and test with them.

The answer should be:

{
    "items": [
        {
            "datetime": "Thu Oct 03 15:37:34 BRT 2019",
            "movieId": 613,
            "name": "My Man Godfrey (1936)",
            "rating": 6.485164882121823,
            "userId": 1
        },
        {
            "datetime": "Thu Oct 03 15:37:34 BRT 2019",
            "movieId": 718,
            "name": "In the Bleak Midwinter (1995)",
            "rating": 5.728434247420009,
            "userId": 1
        },
        {
            "datetime": "Thu Oct 03 15:37:34 BRT 2019",
            "movieId": 745,
            "name": "Ruling Class, The (1972)",
            "rating": 6.768538846961009,
            "userId": 1
        },
        {
            "datetime": "Thu Oct 03 15:37:34 BRT 2019",
            "movieId": 1056,
            "name": "Cronos (1992)",
            "rating": 5.812607594988232,
            "userId": 1
        },
        {
            "datetime": "Thu Oct 03 15:37:34 BRT 2019",
            "movieId": 1137,
            "name": "Beautiful Thing (1996)",
            "rating": 7.145126009205107,
            "userId": 1
        },
        {
            "datetime": "Thu Oct 03 15:37:34 BRT 2019",
            "movieId": 1154,
            "name": "Alphaville (1965)",
            "rating": 6.196922528078046,
            "userId": 1
        },
        {
            "datetime": "Thu Oct 03 15:37:34 BRT 2019",
            "movieId": 1205,
            "name": "Secret Agent, The (1996)",
            "rating": 6.041159524014422,
            "userId": 1
        },
        {
            "datetime": "Thu Oct 03 15:37:34 BRT 2019",
            "movieId": 1269,
            "name": "Love in the Afternoon (1957)",
            "rating": 6.529481757021213,
            "userId": 1
        },
        {
            "datetime": "Thu Oct 03 15:37:34 BRT 2019",
            "movieId": 1449,
            "name": "Pather Panchali (1955)",
            "rating": 5.95622158882095,
            "userId": 1
        },
        {
            "datetime": "Thu Oct 03 15:37:34 BRT 2019",
            "movieId": 1475,
            "name": "Bhaji on the Beach (1993)",
            "rating": 5.929892254811888,
            "userId": 1
        }
    ]
}

That's the icing on the cake! Remember that the application is configured to show 10 movie recommendations per user.
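
The items in the sample response are ordered by movieId rather than by predicted rating. A client that wants the strongest recommendations first could model each item and sort locally. The sketch below is an illustration only: the `Recommendation` case class mirrors the JSON field names shown in the response, while `TopN` and its default of 10 are hypothetical names chosen here, not taken from the project source.

```scala
// Field names mirror the JSON response; the class itself is hypothetical.
final case class Recommendation(
  userId: Int,
  movieId: Int,
  name: String,
  rating: Double,
  datetime: String
)

object TopN {
  // Keeps the n predictions with the highest rating, best first.
  def topN(predictions: Seq[Recommendation], n: Int = 10): Seq[Recommendation] =
    predictions.sortBy(-_.rating).take(n)

  def main(args: Array[String]): Unit = {
    val ts = "Thu Oct 03 15:37:34 BRT 2019"
    // Three items copied from the sample response.
    val sample = Seq(
      Recommendation(1, 613, "My Man Godfrey (1936)", 6.485164882121823, ts),
      Recommendation(1, 718, "In the Bleak Midwinter (1995)", 5.728434247420009, ts),
      Recommendation(1, 1137, "Beautiful Thing (1996)", 7.145126009205107, ts)
    )
    // Prints the two highest-rated titles, best first.
    topN(sample, 2).foreach(r => println(f"${r.rating}%.2f ${r.name}"))
  }
}
```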

You can also check the result in the movies.uresult collection from cqlsh.

5. Model Predictions

The model and application training settings are in src/main/resources/application.conf:

model {
  rank = 10
  iterations = 10
  lambda = 0.01
}

The model uses the Alternating Least Squares (ALS) algorithm. These settings control the predictions and should be tuned to how much and what kind of data is available. For more details, see the Spark Collaborative Filtering documentation.
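
As a rough guide to what the three settings mean, the basic regularized matrix-factorization objective that ALS minimizes can be written as below. This is the textbook form, not taken from the project source. The symbols are notation introduced here: $r_{ui}$ is the rating user $u$ gave movie $i$, $K$ is the set of observed (user, movie) pairs, and $x_u$, $y_i$ are the user and movie factor vectors. Spark's implementation differs in details, such as how the regularization term is scaled.

```latex
\min_{X,\,Y} \sum_{(u,i) \in K} \left( r_{ui} - x_u^{\top} y_i \right)^2
  + \lambda \left( \sum_{u} \lVert x_u \rVert^2 + \sum_{i} \lVert y_i \rVert^2 \right),
\qquad x_u,\, y_i \in \mathbb{R}^{\text{rank}}
```

In this reading, rank (10) is the length of each factor vector, lambda (0.01) is the regularization weight $\lambda$, and iterations (10) is the number of alternating passes, each of which fixes one set of factors and solves a least-squares problem for the other. The predicted rating is the dot product $x_u^{\top} y_i$, which is not bounded to the 1 to 5 scale of the MovieLens ratings. This is one reason predicted values above 5 can appear in the recommendation response.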

6. References

The following books were used to develop this demonstration project:

6.1. Scala Machine Learning Projects

Check out Chapter: 4. Model-based Movie Recommendation Engine.

Available:

6.2. Reactive Programming with Scala and Akka

Available:
