Lerot: an Online Learning to Rank Framework

This project is designed to run experiments on online learning to rank methods for information retrieval. Implementations of predecessors of Lerot can be found at http://ilps.science.uva.nl/resources/online-learning-framework . A paper describing Lerot can be found at http://www.anneschuth.nl/wp-content/uploads/2013/09/cikm-livinglab-2013-lerot.pdf . Below is a short summary of its prerequisites, how to run an experiment, and possible extensions.

  • Python (2.6, 2.7)
  • PyYaml
  • Numpy
  • Scipy
  • Celery (only for distributed runs)
  • Gurobi (only for OptimizedInterleave)

All prerequisites (except for Celery and Gurobi) are included in the academic distribution of Enthought Python, e.g., version 7.1.

Install the prerequisites plus Lerot as follows:

$ git clone https://bitbucket.org/ilps/lerot.git
$ cd lerot
$ pip install -r requirements.txt

To install Lerot system-wide, run:

$ python setup.py install
To run an experiment:

  1. prepare data in SVMlight format, e.g., download MQ2007 (see the section on data below):

    $ mkdir data
    $ wget http://research.microsoft.com/en-us/um/beijing/projects/letor/LETOR4.0/Data/MQ2007.rar -O data/MQ2007.rar
    $ (cd data && unrar x MQ2007.rar)
    
  2. prepare a configuration file in YAML format, e.g., starting from the template below; store it as config/experiment.yml (or simply use config/config.yml instead)

    training_queries: data/MQ2007/Fold1/train.txt
    test_queries: data/MQ2007/Fold1/test.txt
    feature_count: 46              # MQ2007 has 46 features per query-document pair
    num_runs: 1                    # number of independent repetitions of the experiment
    num_queries: 10                # number of query impressions (user interactions) per run
    query_sampling_method: random  # how training queries are sampled for each impression
    output_dir: outdir
    output_prefix: Fold1
    user_model: environment.CascadeUserModel  # simulated user; clicks follow a cascade click model
    user_model_args:
        --p_click 0:0.0,1:0.5,2:1.0
        --p_stop 0:0.0,1:0.0,2:0.0
    system: retrieval_system.ListwiseLearningSystem
    system_args:
        --init_weights random
        --sample_weights sample_unit_sphere
        --comparison comparison.ProbabilisticInterleave
        --delta 0.1
        --alpha 0.01
        --ranker ranker.ProbabilisticRankingFunction
        --ranker_arg 3
        --ranker_tie random
    evaluation:
        - evaluation.NdcgEval
    
  3. run the experiment using python:

    $ learning-experiment.py -f config/experiment.yml
    
  4. summarize experiment outcomes:

    $ summarize-learning-experiment.py --fold_dirs outdir
    

    Arbitrarily many fold directories can be listed per experiment. Results are aggregated over runs and folds. The output is a simple text file that can be further processed using, e.g., gnuplot; a minimal plotting sketch is shown after these steps. The columns are: mean_offline_perf, stddev_offline_perf, mean_online_perf, stddev_online_perf.
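As an example, the summary file could be plotted with a few lines of Python. This is only a sketch: it assumes the file is whitespace-separated with exactly the four columns listed above, one row per summarized point, and that matplotlib (which is not among Lerot's prerequisites) is installed; the file and label names are made up for illustration.

    import sys

    import matplotlib.pyplot as plt
    import numpy as np

    # Load the output of summarize-learning-experiment.py; adjust the column
    # indices below if the file contains additional columns.
    data = np.atleast_2d(np.loadtxt(sys.argv[1]))
    x = np.arange(data.shape[0])
    mean_offline, std_offline = data[:, 0], data[:, 1]
    mean_online, std_online = data[:, 2], data[:, 3]

    plt.errorbar(x, mean_offline, yerr=std_offline, label="offline performance")
    plt.errorbar(x, mean_online, yerr=std_online, label="online performance")
    plt.xlabel("summarized data points")
    plt.ylabel("NDCG-based performance")
    plt.legend()
    plt.savefig("learning_curve.png")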

Lerot accepts data formatted in the SVMlight format (see http://svmlight.joachims.org/). Learning to rank data sets in this format include the LETOR collections, such as MQ2007 used in the example above.

Note that Lerot reads both plain text and gzipped (.gz) files.
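For reference, each line in an SVMlight-formatted ranking file holds a relevance label, a query identifier, and feature:value pairs, optionally followed by a comment. The two lines below are made up for illustration; the ellipsis stands for the remaining feature columns (MQ2007 has 46 features per line):

    2 qid:10 1:0.031310 2:0.666667 3:0.500000 ... 46:0.007042  # highly relevant document for query 10
    0 qid:10 1:0.078682 2:0.166667 3:0.250000 ... 46:0.000000  # non-relevant document for query 10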

The code can easily be extended with new learning and/or feedback mechanisms for future experiments. The most obvious points for extension are:

  1. comparison - extend ComparisonMethod to add new interleaving or inference methods; existing methods include balanced interleave, team draft, and probabilistic interleave (a minimal sketch is given after this list).
  2. retrieval_system - extend OnlineLearningSystem to add a new mechanism for learning from click feedback. New implementations need to provide a ranked list for a given query, and their ranking solutions should have the form of a vector.
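As an illustration, a new comparison method might look roughly like the sketch below. The method names interleave and infer_outcome, their signatures, and the ranker's get_ranking() accessor are assumptions made for this example; check the existing classes in the comparison package for the actual interface (including any required base class and constructor arguments) before implementing one.

    # my_comparison.py -- hypothetical sketch of a new comparison method.
    # The interface assumed here (interleave / infer_outcome and the ranker's
    # get_ranking() call) may differ from Lerot's actual ComparisonMethod contract.
    import random


    class RandomAssignmentInterleave:
        """Toy method: merge two rankings and credit clicks to the source ranker."""

        def interleave(self, ranker_a, ranker_b, query, length):
            # Rankings for the query; get_ranking() is a hypothetical accessor.
            list_a = ranker_a.get_ranking(query)
            list_b = ranker_b.get_ranking(query)
            interleaved, assignments = [], []
            i = j = 0
            while len(interleaved) < length and (i < len(list_a) or j < len(list_b)):
                # Randomly pick which ranker contributes the next document.
                if i < len(list_a) and (j >= len(list_b) or random.random() < 0.5):
                    doc, team, i = list_a[i], 0, i + 1
                else:
                    doc, team, j = list_b[j], 1, j + 1
                if doc not in interleaved:
                    interleaved.append(doc)
                    assignments.append(team)
            return interleaved, assignments

        def infer_outcome(self, interleaved, assignments, clicks, query):
            # clicks is assumed to be a binary vector aligned with the interleaved list.
            credit_a = sum(c for c, team in zip(clicks, assignments) if team == 0)
            credit_b = sum(c for c, team in zip(clicks, assignments) if team == 1)
            # Negative outcome: ranker A wins; positive: ranker B wins; zero: tie.
            return credit_b - credit_a

A method along these lines could then be selected through the --comparison argument in the configuration above, provided it is placed in the comparison package and follows its conventions.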

If you use Lerot to produce results for your scientific publication, please refer to this paper:

@inproceedings{schuth_lerot_2013,
title = {Lerot: an Online Learning to Rank Framework},
author = {A. Schuth and K. Hofmann and S. Whiteson and M. de Rijke},
url = {http://www.anneschuth.nl/wp-content/uploads/2013/09/cikm-livinglab-2013-lerot.pdf},
year = {2013},
booktitle = {Living Labs for Information Retrieval Evaluation workshop at CIKM'13}
}

Lerot has been used to produce results in numerous publications, including the following:

    1. K. Hofmann, A. Schuth, S. Whiteson, M. de Rijke (2013): Reusing Historical Interaction Data for Faster Online Learning to Rank for IR. In: WSDM'13.
    2. A. Chuklin, A. Schuth, K. Hofmann, P. Serdyukov, M. de Rijke (2013): Evaluating Aggregated Search Using Interleaving. In: CIKM'13.
    3. A. Schuth, F. Sietsma, S. Whiteson, M. de Rijke (2014): Optimizing Base Rankers Using Clicks: A Case Study using BM25. In: ECIR'14.
    4. K. Hofmann, A. Schuth, A. Bellogin, M. de Rijke (2014): Effects of Position Bias on Click-Based Recommender Evaluation. In: ECIR'14.
    5. A. Chuklin, K. Zhou, A. Schuth, F. Sietsma, M. de Rijke (2014): Evaluating Intuitiveness of Vertical-Aware Click Models. In: SIGIR'14.
    6. A. Schuth, F. Sietsma, S. Whiteson, D. Lefortier, M. de Rijke (2014): Multileaved Comparisons for Fast Online Evaluation. In: CIKM'14.
    7. A. Chuklin, A. Schuth, K. Zhou, M. de Rijke (2015): A Comparative Analysis of Interleaving Methods for Aggregated Search. In: ACM Transactions on Information Systems.
    8. M. Zoghi, S. Whiteson, M. de Rijke (2015): A Method for Large-Scale Online Ranker Evaluation. In: WSDM'15.
    9. Y. Chen, K. Hofmann (2015): Online Learning to Rank: Absolute vs. Relative. In: WWW'15.
    10. A. Schuth et al. (2015): Probabilistic Multileave for Online Retrieval Evaluation. In: SIGIR'15.
    11. A. Schuth, H. Oosterhuis, S. Whiteson, M. de Rijke (2016): Multileave Gradient Descent for Fast Online Learning to Rank. In: WSDM'16.
    12. H. Oosterhuis, A. Schuth, M. de Rijke (2016): Probabilistic Multileave Gradient Descent. In: ECIR'16.

If your paper is missing from this list, please let us know.

A paper describing Lerot was published in the Living Labs for Information Retrieval Evaluation workshop at CIKM'13: A. Schuth, K. Hofmann, S. Whiteson, M. de Rijke (2013): Lerot: an Online Learning to Rank Framework.

The following people have contributed to Lerot:

  • Katja Hofmann (Microsoft Research)
  • Anne Schuth (Blendle)
  • Harrie Oosterhuis (University of Amsterdam)
  • Jos van der Velde
  • Lars Buitinck
  • Aleksandr Chuklin
  • Floor Sietsma
  • Spyros Michaelides
  • Robert-Jan Bruintjes (University of Amsterdam)
  • David Woudenberg (University of Amsterdam)
  • Carla Groenland (University of Amsterdam)
  • Masrour Zoghi (University of Amsterdam)
  • Nikos Voskarides (University of Amsterdam)
  • Artem Grotov (University of Amsterdam)
  • Yiwei Chen
  • Saúl Vargas
  • Rolf Jagerman (University of Amsterdam)
  • Michiel van der Meer
  • Martin van Harmelen
  • Verna Dankers
  • Maarten Boon
  • Thomas Groot

If your name is missing from this list, please let us know.

The development of Lerot is partially supported by the EU FP7 project LiMoSINe (http://www.limosine-project.eu).

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see http://www.gnu.org/licenses/.
