GithubHelp home page GithubHelp logo

stathwang / lcars Goto Github PK

View Code? Open in Web Editor NEW
6.0 2.0 2.0 931 KB

Location-Content-Aware Recommender System in Python and R

License: MIT License

Python 54.87% R 45.13%
latent-dirichlet-allocation gibbs-sampling topic-modeling probabilistic-generative-model recommender-system python r

lcars's Introduction

LCARS: A Spatial Item Recommender System

This repository contains my implementation of LCARS (Location-Content-Aware Recommender System) for spatial item recommendation in R and Python. Please refer to the following paper for details.

Paper: LCARS: A Spatial Item Recommender System (Yin et al. 2014)

Foursquare dataset is too large, so I'll have to truncate it before uploading. Only the truncated version of DoubanEvent data is uploaded to the data folder.

Problems

  1. Cold-start users and cities: How can we recommend spatial items to a new user traveling to a new city?
  2. User checkin histories are locally clustered, so the traditional collaborative filtering methods would produce poor recommendations.
  3. Data sparsity: User activites are very limited in distant locations.

Main Ideas

To produce a personalized spatial item recommendation, LCARS combines the querying user's interest and the local preference of the querying city, each of which is a distribution over latent topics where a latent topic is a distribution over spatial items and content words. The model assumes that items and their content words are independently conditioned on the topics. It learns topics that capture both item co-occurrence and content co-occurrence patterns.

Specifically, LCARS learns:

  • latent topic distribution over spatial items and that over content words
  • intrinsic user interest distribution over topics
  • local preference in a given region over topics

LCARS

The model folder contains 2 different folders, R and python, each of which contains the following:

  • lda.R (lda.py): Each user is a document and events (spatial items) visited by a user are the words in the document. The model ignores both location and content words of events and shows similar performance to user-based collaborative filtering. Each user is represented as a distribution over latent topics where a latent topic itself is a distribution over spatial items.
  • ca_lda.R (ca_lda.py): This is the content component of the LCA-LDA model. It only considers the content words of a spatial item and ignores the location.
  • la_lda.R: This is the location component of the LDA-LDA model. It only considers the location of each spatial item, not the content words.
  • lca_lda.R: The final LCARS model that combines ca_lda.R and la_lda.R taking into account both location and content words of each spatial item.

I'll write and upload la_lda.py and lca_lda.py once I find time.

Methods

The paper uses collapsed Gibbs sampling by only keeping track of coin flips and topic assignments for every record of a user checkin history. For the hyperparameter settings, see the paper.

First, split the data into training and test set (3:1) for every user checkin history. Fit the models above on the training set. For each user's checkin history in the test set, compute the score for events as well as all other events in a city not visited by a user.

Rank the events by sorting the scores in decreasing order. For every threshold k (see the paper), if an event in the test set is in top-k, then call that a hit. Average the scores for all test records.

Results

  1. LCA-LDA outperforms all the other models, while LDA show the worst performance. Local preference and item content information are indeed effective.
  2. For the number of latent topics > 25, increases in recommendation performance on this truncated DoubanEvent dataset is rather small. This prompts to set the number of topics to 25 to balance out the model complexity and computational efficiency.
  3. Users mainly participate in social events based on their interests, but from time to time they attend popular local events regardless of their interests. Local preference plays an important role.
  4. Each latent topic captures event co-occurrence under same event category across different locations. LCA-LDA model helps reduce sparsity in user-item ratings matrix.
  5. User interest, local preference, and content words are blended in a flexible, robust way to produce a very interpretable probabilistic model.
  6. Further research needed. Can local preference be split into that of avid travelers and non-travelers, possibly a weighted sum of those two group preferences? Does correlation exist between locations and content words i.e. correlated topic models?

lcars's People

Contributors

stathwang avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

lcars's Issues

Waiting for updating

It is a quite useful code, thanks very much!
However, I'm a python user, the code wrriten by R is a little hard for me to read.
Thus, hoping the author can update the rest of the code in python quickly, wating for it!
It would be better if you could leave an e-mail address to facilitate communication.
Thanks a lot!

Waiting for updating~

It is a quite useful code, thanks very much!
However, I'm a python user, the code wrriten by R is a little hard for me to read.
Thus, hoping the author can update the rest of the code in python quickly, wating for it!
Thanks a lot!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.