Term: Spring 2018
- Team #9
- Project title: Collaborative Filtering
- Team members
- Chen, Ziyu
- Kang, Yuhao
- Lin, Yanjun
- Liu, Fangbing
- Project summary: In this project, we implemented memory-based and model-based algorithms for collaborative filtering. For the memory-based algorithm, we use a best-n estimator to select neighbours and combine several similarity weights: Pearson correlation, mean-square difference, SimRank similarity (EachMovie data only), and variance weighting. For the model-based algorithm, we use a cluster model (Microsoft Web Data only). For evaluation, we use ranked scoring on the Microsoft Web Data, and MAE and ROC_4 on the EachMovie data.
The following shows our results:
MS Data:
Movie Data:
- Presentation slides can be found here: Google Slides
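
As an illustration of the memory-based approach described in the project summary, the following is a minimal Python sketch of Pearson-correlation weights, best-n neighbour selection, and a deviation-from-mean prediction. It is not the project's code (which lives in main.rmd); the function names `pearson`, `best_n_neighbours`, and `predict`, the dict-of-dicts rating layout, and the toy ratings are all assumptions made for this example.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation over the items both users rated.

    u, v: dicts mapping item -> rating (assumed layout).
    """
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return 0.0
    mu_u = sum(u[i] for i in common) / len(common)
    mu_v = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mu_u) * (v[i] - mu_v) for i in common)
    den = sqrt(sum((u[i] - mu_u) ** 2 for i in common)
               * sum((v[i] - mu_v) ** 2 for i in common))
    return num / den if den > 0 else 0.0

def best_n_neighbours(ratings, active, n):
    """Return the n (user, weight) pairs with the largest absolute weight."""
    weights = {other: pearson(ratings[active], r)
               for other, r in ratings.items() if other != active}
    ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:n]

def predict(ratings, active, item, n=2):
    """Active user's mean rating plus the weighted mean deviation of neighbours."""
    mean_a = sum(ratings[active].values()) / len(ratings[active])
    num = den = 0.0
    for other, w in best_n_neighbours(ratings, active, n):
        if item in ratings[other]:
            mean_o = sum(ratings[other].values()) / len(ratings[other])
            num += w * (ratings[other][item] - mean_o)
            den += abs(w)
    return mean_a + num / den if den > 0 else mean_a

# Toy data (hypothetical): user -> {movie -> rating}
ratings = {
    "a": {"m1": 5, "m2": 3, "m3": 4},
    "b": {"m1": 4, "m2": 2, "m3": 3, "m4": 5},
    "c": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}
print(best_n_neighbours(ratings, "a", 1))
print(predict(ratings, "a", "m4"))
```

Selecting neighbours by absolute weight lets strongly negative correlations contribute as well; a variant that keeps only positive weights would be an equally reasonable choice for this sketch.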
Contribution statement: (default)
- Chen, Ziyu: Built the regular SimRank model and the N/P SimRank model. Organized the main.rmd file.
- Kang, Yuhao: Calculated variance weighting. Implemented neighbour selection (best-n estimator) and prediction. Organized the main.rmd file.
- Lin, Yanjun: Processed original data. Built the cluster model (EM algorithm). Created rank score evaluation. Organized the main.rmd file. Prepared the presentation. Combined variance weighting with MSD and correlation, and improved code efficiency. Improved SimRank by separating the data into positive and negative connections.
- Liu, Fangbing: Processed original data. Calculated similarity weights (Pearson correlation and mean-square difference). Created MAE and ROC evaluation. Organized the main.rmd file. Wrote the summary page on GitHub.
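
For readers unfamiliar with the evaluation metrics named above, here is a minimal Python sketch of MAE and of ranked scoring in the half-life utility form. It is an illustration only, not the project's implementation: the function names, the list-of-dicts input layout, and the defaults `d=0.0` (neutral vote) and `alpha=5.0` (half-life) are assumptions for this example.

```python
def mae(predicted, actual):
    """Mean absolute error between two equal-length rating sequences."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def expected_utility(ordered_votes, d=0.0, alpha=5.0):
    """Half-life utility of a ranked list: votes lower in the list count less."""
    # j is 0 for the top-ranked item, so the top item has full weight.
    return sum(max(v - d, 0.0) / 2 ** (j / (alpha - 1))
               for j, v in enumerate(ordered_votes))

def rank_score(predicted, actual, d=0.0, alpha=5.0):
    """Ranked scoring as a percentage of the best achievable utility.

    predicted, actual: lists with one dict (item -> score or vote) per user.
    """
    achieved = best = 0.0
    for pred, act in zip(predicted, actual):
        # Order the held-out items by predicted score, highest first.
        by_pred = sorted(act, key=lambda i: pred.get(i, 0.0), reverse=True)
        achieved += expected_utility([act[i] for i in by_pred], d, alpha)
        # The best possible ordering puts the highest true votes first.
        best += expected_utility(sorted(act.values(), reverse=True), d, alpha)
    return 100.0 * achieved / best if best > 0 else 0.0

# Toy data (hypothetical): one user, three web pages with 0/1 votes.
actual = [{"p1": 1, "p2": 0, "p3": 1}]
good = [{"p1": 0.9, "p2": 0.1, "p3": 0.8}]
poor = [{"p1": 0.1, "p2": 0.9, "p3": 0.8}]
print(rank_score(good, actual), rank_score(poor, actual))
```

A ranking that places every visited page ahead of the unvisited ones reaches the maximum score, while misordered predictions score lower.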
Following suggestions by RICH FITZJOHN (@richfitz), this folder is organized as follows.
proj/
├── lib/
├── data/
├── doc/
├── figs/
└── output/
Please see each subfolder for a README file.