
This project forked from benfred/implicit



License: MIT License

Python 95.18% C++ 4.82%


Implicit


Fast Python Collaborative Filtering for Implicit Datasets.

This project provides fast Python implementations of the algorithms described in the paper Collaborative Filtering for Implicit Feedback Datasets and in Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering.
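
As a rough illustration of the first paper's approach, the sketch below computes the closed-form least-squares update for a single user's factor vector with NumPy. It is not this library's Cython code: the function name user_factor, the default values of alpha and reg, and the toy data are all assumptions made for the example.

```python
import numpy as np

def user_factor(Y, r_u, alpha=40.0, reg=0.01):
    """Solve (Y^T C_u Y + reg * I) x_u = Y^T C_u p_u for one user.

    Y   : (n_items, n_factors) item factor matrix
    r_u : (n_items,) raw interaction counts for this user
    """
    c_u = 1.0 + alpha * r_u            # confidence grows with the observed count
    p_u = (r_u > 0).astype(float)      # binary preference: 1 if any interaction
    A = Y.T @ (c_u[:, None] * Y) + reg * np.eye(Y.shape[1])
    b = Y.T @ (c_u * p_u)
    return np.linalg.solve(A, b)

# Toy data (hypothetical): 6 items, 3 latent factors
rng = np.random.default_rng(0)
Y = rng.normal(size=(6, 3))
r_u = np.array([0.0, 3.0, 0.0, 1.0, 0.0, 0.0])
x_u = user_factor(Y, r_u)
```

Alternating least squares repeats this solve for every user with the item factors held fixed, then swaps roles and does the same for every item.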

To install:

pip install implicit

Basic usage:

import implicit

# initialize a model
model = implicit.als.AlternatingLeastSquares(factors=50)

# train the model on a sparse matrix of item/user/confidence weights
model.fit(item_user_data)

# recommend items for a user
recommendations = model.recommend(userid, item_user_data.T)

# find related items
related = model.similar_items(itemid)
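
The snippet above assumes item_user_data already exists. One way to build such a matrix, sketched here with made-up (item, user, play count) triples, is through scipy.sparse. The data and the 3x3 shape are purely illustrative.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical (item_id, user_id, play_count) triples
plays = [(0, 0, 5.0), (0, 1, 1.0), (1, 1, 3.0), (2, 0, 2.0), (2, 2, 4.0)]
items, users, counts = (np.array(column) for column in zip(*plays))

# Rows are items, columns are users, values are the confidence weights
item_user_data = coo_matrix((counts, (items, users)), shape=(3, 3)).tocsr()

# The transpose has one row per user, which is the layout passed to recommend() above
user_item_data = item_user_data.T.tocsr()
```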

The examples folder has a program showing how to use this to compute similar artists on the last.fm dataset.

Articles about Implicit

Several posts have been written talking about using Implicit to build recommendation systems:

There are also a couple posts talking about the algorithms that power this library:

Requirements

This library requires SciPy version 0.16 or later. Running on OSX requires a compiler with OpenMP support, which can be installed with Homebrew: brew install gcc.

Why Use This?

This library came about because I was looking for an efficient Python implementation of this algorithm for a blog post on matrix factorization. The other Python packages were too slow, and integrating with a different language or framework was too cumbersome.

The core of this package is written in Cython, leveraging OpenMP to parallelize computation. Linear algebra is done using the BLAS and LAPACK libraries distributed with SciPy. This leads to extremely fast matrix factorization.

On a simple benchmark, this library is about 1.8 times faster than the multithreaded C++ implementation provided by Quora's QMF Library and at least 60,000 times faster than implicit-mf.

A follow-up post describes further performance improvements based on the Conjugate Gradient method, which boosts performance by a further 3x to over 19x depending on the number of factors used.

This library has been tested with Python 2.7 and 3.5. Running 'tox' runs the unit tests on both versions and verifies that all Python files pass flake8.
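
For readers unfamiliar with the Conjugate Gradient method, the sketch below is a generic textbook version in NumPy, not the optimized routine used inside this library. It approximately solves a symmetric positive-definite system A x = b in a fixed number of cheap steps instead of performing an exact solve. The function name, the steps and tol parameters, and the test matrix are assumptions for illustration.

```python
import numpy as np

def conjugate_gradient(A, b, steps=3, tol=1e-10):
    """Approximately solve A x = b for symmetric positive-definite A."""
    x = np.zeros_like(b)
    r = b - A @ x              # residual
    p = r.copy()               # search direction
    rs_old = r @ r
    for _ in range(steps):
        if rs_old < tol:       # already converged
            break
        Ap = A @ p
        step = rs_old / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Hypothetical well-conditioned system with 4 unknowns
rng = np.random.default_rng(1)
M = rng.normal(size=(8, 4))
A = M.T @ M + np.eye(4)
b = rng.normal(size=4)

x_approx = conjugate_gradient(A, b, steps=2)   # a couple of cheap steps
x_full = conjugate_gradient(A, b, steps=10)    # enough steps to converge here
```

In an alternating least squares setting, a few steps per user or item can be enough because each outer iteration starts from the previous estimate of the factors.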

Optimal Configuration

I'd recommend configuring SciPy to use Intel's MKL matrix libraries. One easy way of doing this is by installing the Anaconda Python distribution.

For systems using OpenBLAS, I highly recommend setting 'export OPENBLAS_NUM_THREADS=1'. This disables OpenBLAS's internal multithreading, which leads to substantial speedups for this package.
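
If exporting the variable in the shell is inconvenient, the same setting can be applied from Python. This is a minimal sketch that assumes OpenBLAS reads the variable when it is first loaded, so the assignment has to run before NumPy, SciPy, or implicit are imported.

```python
import os

# Set this before importing numpy, scipy, or implicit, since OpenBLAS
# reads the variable when the library is first loaded.
os.environ["OPENBLAS_NUM_THREADS"] = "1"

openblas_threads = os.environ["OPENBLAS_NUM_THREADS"]
```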

Released under the MIT License

