Lerot: an Online Learning to Rank Framework

This project is designed to run experiments on online learning to rank methods for information retrieval. Implementations of predecessors of Lerot can be found at http://ilps.science.uva.nl/resources/online-learning-framework . A paper describing Lerot can be found at http://www.anneschuth.nl/wp-content/uploads/2013/09/cikm-livinglab-2013-lerot.pdf . Below is a short summary of its prerequisites, how to run an experiment, and possible extensions.

  • Python (2.6, 2.7)
  • PyYaml
  • Numpy
  • Scipy
  • Celery (only for distributed runs)
  • Gurobi (only for OptimizedInterleave)

All prerequisites (except for Celery and Gurobi) are included in the academic distribution of Enthought Python, e.g., version 7.1.

Install the prerequisites plus Lerot as follows:

$ git clone https://bitbucket.org/ilps/lerot.git
$ cd lerot
$ pip install -r requirements.txt

To install Lerot system-wide, run:

$ python setup.py install
To run an experiment:

  1. prepare data in SVMlight format, e.g., download MQ2007 (see the section on data below):

    $ mkdir data
    $ wget http://research.microsoft.com/en-us/um/beijing/projects/letor/LETOR4.0/Data/MQ2007.rar -O data/MQ2007.rar
    $ (cd data && unrar x MQ2007.rar)
    
  2. prepare a configuration file in YAML format, e.g., starting from the template below; store it as config/experiment.yml (or simply use config/config.yml instead)

    training_queries: data/MQ2007/Fold1/train.txt
    test_queries: data/MQ2007/Fold1/test.txt
    feature_count: 46              # MQ2007 has 46 features per query-document pair
    num_runs: 1                    # number of independent repetitions of the experiment
    num_queries: 10                # number of query impressions (user interactions) per run
    query_sampling_method: random  # how training queries are sampled for each impression
    output_dir: outdir
    output_prefix: Fold1
    user_model: environment.CascadeUserModel  # simulated user; clicks follow a cascade click model
    user_model_args:
        --p_click 0:0.0,1:0.5,2:1.0
        --p_stop 0:0.0,1:0.0,2:0.0
    system: retrieval_system.ListwiseLearningSystem
    system_args:
        --init_weights random
        --sample_weights sample_unit_sphere
        --comparison comparison.ProbabilisticInterleave
        --delta 0.1
        --alpha 0.01
        --ranker ranker.ProbabilisticRankingFunction
        --ranker_arg 3
        --ranker_tie random
    evaluation:
        - evaluation.NdcgEval
    
  3. run the experiment using python:

    $ learning-experiment.py -f config/experiment.yml
    
  4. summarize experiment outcomes:

    $ summarize-learning-experiment.py --fold_dirs outdir
    

    Arbitrarily many fold directories can be listed per experiment. Results are aggregated over runs and folds. The output is a simple text file that can be further processed using, e.g., gnuplot; a minimal plotting sketch is shown after these steps. The columns are: mean_offline_perf, stddev_offline_perf, mean_online_perf, stddev_online_perf.
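As an example, the summary file could be plotted with a few lines of Python. This is only a sketch: it assumes the file is whitespace-separated with exactly the four columns listed above, one row per summarized point, and that matplotlib (which is not among Lerot's prerequisites) is installed; the file and label names are made up for illustration.

    import sys

    import matplotlib.pyplot as plt
    import numpy as np

    # Load the output of summarize-learning-experiment.py; adjust the column
    # indices below if the file contains additional columns.
    data = np.atleast_2d(np.loadtxt(sys.argv[1]))
    x = np.arange(data.shape[0])
    mean_offline, std_offline = data[:, 0], data[:, 1]
    mean_online, std_online = data[:, 2], data[:, 3]

    plt.errorbar(x, mean_offline, yerr=std_offline, label="offline performance")
    plt.errorbar(x, mean_online, yerr=std_online, label="online performance")
    plt.xlabel("summarized data points")
    plt.ylabel("NDCG-based performance")
    plt.legend()
    plt.savefig("learning_curve.png")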

Lerot accepts data formatted in the SVMlight format (see http://svmlight.joachims.org/). Learning to rank data sets in this format include the LETOR collections, such as MQ2007 used in the example above.

Note that Lerot reads both plain text and gzipped (.gz) files.
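For reference, each line in an SVMlight-formatted ranking file holds a relevance label, a query identifier, and feature:value pairs, optionally followed by a comment. The two lines below are made up for illustration; the ellipsis stands for the remaining feature columns (MQ2007 has 46 features per line):

    2 qid:10 1:0.031310 2:0.666667 3:0.500000 ... 46:0.007042  # highly relevant document for query 10
    0 qid:10 1:0.078682 2:0.166667 3:0.250000 ... 46:0.000000  # non-relevant document for query 10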

The code can easily be extended with new learning and/or feedback mechanisms for future experiments. The most obvious points for extension are:

  1. comparison - extend ComparisonMethod to add new interleaving or inference methods; existing methods include balanced interleave, team draft, and probabilistic interleave (a minimal sketch is given after this list).
  2. retrieval_system - extend OnlineLearningSystem to add a new mechanism for learning from click feedback. New implementations need to provide a ranked list for a given query, and their ranking solutions should have the form of a vector.
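As an illustration, a new comparison method might look roughly like the sketch below. The method names interleave and infer_outcome, their signatures, and the ranker's get_ranking() accessor are assumptions made for this example; check the existing classes in the comparison package for the actual interface (including any required base class and constructor arguments) before implementing one.

    # my_comparison.py -- hypothetical sketch of a new comparison method.
    # The interface assumed here (interleave / infer_outcome and the ranker's
    # get_ranking() call) may differ from Lerot's actual ComparisonMethod contract.
    import random


    class RandomAssignmentInterleave:
        """Toy method: merge two rankings and credit clicks to the source ranker."""

        def interleave(self, ranker_a, ranker_b, query, length):
            # Rankings for the query; get_ranking() is a hypothetical accessor.
            list_a = ranker_a.get_ranking(query)
            list_b = ranker_b.get_ranking(query)
            interleaved, assignments = [], []
            i = j = 0
            while len(interleaved) < length and (i < len(list_a) or j < len(list_b)):
                # Randomly pick which ranker contributes the next document.
                if i < len(list_a) and (j >= len(list_b) or random.random() < 0.5):
                    doc, team, i = list_a[i], 0, i + 1
                else:
                    doc, team, j = list_b[j], 1, j + 1
                if doc not in interleaved:
                    interleaved.append(doc)
                    assignments.append(team)
            return interleaved, assignments

        def infer_outcome(self, interleaved, assignments, clicks, query):
            # clicks is assumed to be a binary vector aligned with the interleaved list.
            credit_a = sum(c for c, team in zip(clicks, assignments) if team == 0)
            credit_b = sum(c for c, team in zip(clicks, assignments) if team == 1)
            # Negative outcome: ranker A wins; positive: ranker B wins; zero: tie.
            return credit_b - credit_a

A method along these lines could then be selected through the --comparison argument in the configuration above, provided it is placed in the comparison package and follows its conventions.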

If you use Lerot to produce results for your scientific publication, please refer to this paper:

@inproceedings{schuth_lerot_2013,
title = {Lerot: an Online Learning to Rank Framework},
author = {A. Schuth and K. Hofmann and S. Whiteson and M. de Rijke},
url = {http://www.anneschuth.nl/wp-content/uploads/2013/09/cikm-livinglab-2013-lerot.pdf},
year = {2013},
booktitle = {Living Labs for Information Retrieval Evaluation workshop at CIKM'13}
}

Lerot has been used to produce results in numerous publications, including the following:

    1. K. Hofmann, A. Schuth, S. Whiteson, M. de Rijke (2013): Reusing Historical Interaction Data for Faster Online Learning to Rank for IR. In: WSDM'13.
    2. A. Chuklin, A. Schuth, K. Hofmann, P. Serdyukov, M. de Rijke (2013): Evaluating Aggregated Search Using Interleaving. In: CIKM'13.
    3. A. Schuth, F. Sietsma, S. Whiteson, M. de Rijke (2014): Optimizing Base Rankers Using Clicks: A Case Study using BM25. In: ECIR'14.
    4. K. Hofmann, A. Schuth, A. Bellogin, M. de Rijke (2014): Effects of Position Bias on Click-Based Recommender Evaluation. In: ECIR'14.
    5. A. Chuklin, K. Zhou, A. Schuth, F. Sietsma, M. de Rijke (2014): Evaluating Intuitiveness of Vertical-Aware Click Models. In: SIGIR'14.
    6. A. Schuth, F. Sietsma, S. Whiteson, D. Lefortier, M. de Rijke (2014): Multileaved Comparisons for Fast Online Evaluation. In: CIKM'14.
    7. A. Chuklin, A. Schuth, K. Zhou, M. de Rijke (2015): A Comparative Analysis of Interleaving Methods for Aggregated Search. In: ACM Transactions on Information Systems.
    8. M. Zoghi, S. Whiteson, M. de Rijke (2015): A Method for Large-Scale Online Ranker Evaluation. In: WSDM'15.
    9. Y. Chen, K. Hofmann (2015): Online Learning to Rank: Absolute vs. Relative. In: WWW'15.
    10. A. Schuth et al. (2015): Probabilistic Multileave for Online Retrieval Evaluation. In: SIGIR'15.
    11. A. Schuth, H. Oosterhuis, S. Whiteson, M. de Rijke (2016): Multileave Gradient Descent for Fast Online Learning to Rank. In: WSDM'16.
    12. H. Oosterhuis, A. Schuth, M. de Rijke (2016): Probabilistic Multileave Gradient Descent. In: ECIR'16.

If your paper is missing from this list, please let us know.

A paper describing Lerot was published in the Living Labs for Information Retrieval Evaluation workshop at CIKM'13: A. Schuth, K. Hofmann, S. Whiteson, M. de Rijke (2013): Lerot: an Online Learning to Rank Framework.

The following people have contributed to Lerot:

  • Katja Hofmann (Microsoft Research)
  • Anne Schuth (Blendle)
  • Harrie Oosterhuis (University of Amsterdam)
  • Jos van der Velde
  • Lars Buitinck
  • Aleksandr Chuklin
  • Floor Sietsma
  • Spyros Michaelides
  • Robert-Jan Bruintjes (University of Amsterdam)
  • David Woudenberg (University of Amsterdam)
  • Carla Groenland (University of Amsterdam)
  • Masrour Zoghi (University of Amsterdam)
  • Nikos Voskarides (University of Amsterdam)
  • Artem Grotov (University of Amsterdam)
  • Yiwei Chen
  • Saúl Vargas
  • Rolf Jagerman (University of Amsterdam)
  • Michiel van der Meer
  • Martin van Harmelen
  • Verna Dankers
  • Maarten Boon
  • Thomas Groot

If your name is missing from this list, please let us know.

The development of Lerot is partially supported by the EU FP7 project LiMoSINe (http://www.limosine-project.eu).

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see http://www.gnu.org/licenses/.
