
ir's Introduction

Tyler Kemme
CS ID: tpkemme

This information retrieval application contains a PageRankInvertedIndex, an inverted index created using the PageRank algorithm.

We also implemented pseudo-relevance feedback in the current working project.  In pseudo-relevance feedback, after the system has found retrievals for a query, you take the first n retrievals and feed them back into a new search as if they had been marked relevant.  The idea is that the first n documents are probably the most relevant ones, so adding them to the query should pull in more relevant results.  The number of documents you feed back changes the results you get.  For instance, if you use only one document, the new results will most likely resemble that specific document rather than your original query.  On the other hand, if you feed back many documents, the results may end up no more relevant than they were originally.  Generally, pseudo-relevance feedback works best when only a few documents are fed back and those documents are not near-duplicates of one another.

My algorithm for pseudo-relevance feedback was relatively simple to implement.  After the vsr presents the first round of retrievals, it checks whether the pseudofeedback flag is set.  If it is, it iterates through the retrievals and adds the first n documents to the list of good documents.  It then obtains a new set of retrievals using the modified query and presents them immediately after the original results.
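
The query-modification step described above can be sketched roughly as follows.  This is a minimal, self-contained illustration and not the project's actual code: the class name PseudoFeedbackSketch, the method expandQuery, the alpha and beta weights, and the use of plain term-to-weight maps as vectors are all assumptions made for the example.  It treats the first n ranked documents as "good" documents and adds their term weights to the original query, in the style of Rocchio-like relevance feedback without negative examples.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; all names here are hypothetical, not the project's API. */
public class PseudoFeedbackSketch {

    /**
     * Builds a revised query vector by treating the first n ranked documents
     * as if the user had marked them relevant.
     *
     * @param query      original query as a term-to-weight map
     * @param rankedDocs document vectors, best match first
     * @param n          number of top documents to feed back (assumed parameter)
     * @param alpha      weight on the original query (assumed parameter)
     * @param beta       weight on each fed-back document (assumed parameter)
     */
    public static Map<String, Double> expandQuery(
            Map<String, Double> query,
            List<Map<String, Double>> rankedDocs,
            int n,
            double alpha,
            double beta) {
        Map<String, Double> revised = new HashMap<>();
        // Start from the original query, scaled by alpha.
        for (Map.Entry<String, Double> e : query.entrySet()) {
            revised.put(e.getKey(), alpha * e.getValue());
        }
        // Add the term weights of the first n documents, scaled by beta.
        int limit = Math.min(n, rankedDocs.size());
        for (int i = 0; i < limit; i++) {
            for (Map.Entry<String, Double> e : rankedDocs.get(i).entrySet()) {
                revised.merge(e.getKey(), beta * e.getValue(), Double::sum);
            }
        }
        return revised;
    }

    public static void main(String[] args) {
        Map<String, Double> query = Map.of("pagerank", 1.0);
        List<Map<String, Double>> ranked = List.of(
                Map.of("pagerank", 0.5, "link", 0.5),
                Map.of("pagerank", 0.25, "graph", 0.75),
                Map.of("cooking", 1.0));
        // Feed back only the first two documents; the third is ignored.
        System.out.println(expandQuery(query, ranked, 2, 1.0, 0.5));
    }
}
```

With n set to 1, the revised query is dominated by the terms of a single document, which matches the observation above that the new results tend to resemble that document rather than the original query.  Lowering beta relative to alpha is one way to keep the original query terms in control.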


+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

This code supplies "miniature" pedagogical Java implementations of
information retrieval, spidering, and other IR and text-processing
software.  It is being released for educational and research purposes only under
the GNU General Public License (see http://www.gnu.org/copyleft/gpl.html).

It was developed for an introductory course on "Intelligent Information
Retrieval and Web Search".  See:

http://www.cs.utexas.edu/users/mooney/ir-course/ 

for more information and introductory documentation (especially see the Project
assignment descriptions).

Copyleft: Raymond J. Mooney, 2001
  
