zohaibterminator / boolean-retrieval-model

Assignment 1 of Information Retrieval Course

License: MIT License

Python 100.00%
boolean-retrieval-model custom-tkinter information-retrieval inverted-indexing positional-indexing tkinter python

Boolean Retrieval Model

This project implements a basic information retrieval system capable of processing boolean and proximity queries. The system utilizes inverted and positional indexes to efficiently retrieve relevant document IDs based on user queries.

Features

  • Supports boolean queries including AND, OR, and NOT operations.
  • Implements proximity queries to find terms within a specified distance of each other.
  • Utilizes inverted and positional indexes for efficient query processing.
  • Provides a simple GUI for user interaction.
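
To make the two index types listed above concrete, here is a minimal sketch of how they could be built. It is not the code from index_creation.py: the names tokenize and build_indexes, the regex tokenizer (a stand-in for NLTK), and the sample documents are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens.

    Hypothetical stand-in for the NLTK tokenizer the project uses.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def build_indexes(docs, stopwords=frozenset()):
    """Build an inverted index and a positional index from {doc_id: text}."""
    inverted = defaultdict(set)      # term -> set of doc IDs containing it
    positional = defaultdict(dict)   # term -> {doc_id: [token positions]}
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            if term in stopwords:
                continue  # stopwords are skipped but still consume a position
            inverted[term].add(doc_id)
            positional[term].setdefault(doc_id, []).append(pos)
    return dict(inverted), dict(positional)


# Illustrative documents, not taken from the Research Paper directory.
docs = {
    1: "Neural networks learn features",
    2: "Boolean retrieval uses an inverted index",
    3: "Neural retrieval models",
}
inverted, positional = build_indexes(docs, stopwords={"an"})
```

The inverted index answers "which documents contain this term", which is all that boolean queries need. The positional index also records where each term occurs, which is what proximity queries rely on.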

Getting Started

To run the information retrieval system, follow these steps:

  • Ensure you have Python 3.12 installed.
  • Install the NLTK and CustomTkinter libraries using pip install nltk customtkinter. tkinter is part of the Python standard library and is not installed with pip.
  • Make sure Stopword-List.txt and the Research Paper directory containing all the documents are in your current working directory.
  • Run the files in an IDE or from a terminal.
  • Download the tokenizer by running nltk.download('punkt') in a Python session after import nltk.
  • Run the index_creation.py script first using python index_creation.py to create and save the indexes.
  • Then run query_processing.py using python query_processing.py to process queries.
  • Use the tkinter GUI to enter queries and press the 'Process Query' button to retrieve the matching document IDs.
  • Press the 'Exit' button to exit the program.
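
The steps above split the work into an indexing run and a query run, so the indexes have to be stored on disk in between. The README does not say which storage format the scripts use. The sketch below assumes Python's pickle module, and the helper names save_index and load_index are hypothetical.

```python
import pickle
from pathlib import Path


def save_index(index, path):
    """Serialize an index (for example a dict of sets) to a file."""
    with Path(path).open("wb") as fh:
        pickle.dump(index, fh)


def load_index(path):
    """Load an index previously written by save_index."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)
```

pickle keeps Python sets and integer keys intact, which JSON would not. Only load pickle files you created yourself, since unpickling untrusted data can execute arbitrary code.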

Usage

Boolean Queries

Boolean queries can be constructed using AND, OR, and NOT operations. Simply enter your query in the provided text box and click the "Process Query" button to retrieve relevant documents.
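
As an illustration of how such a query could be answered from an inverted index, the following sketch evaluates a flat query from left to right using set operations. It makes several assumptions that may not match query_processing.py: operators are written in upper case, there is no operator precedence and no parentheses, two adjacent terms are combined with AND, and the function name evaluate_boolean is hypothetical.

```python
def evaluate_boolean(query, inverted, all_docs):
    """Evaluate a flat boolean query such as 'a AND NOT b OR c'.

    Assumption for this sketch: strictly left to right, no precedence.
    """
    result = None    # running set of matching doc IDs
    op = None        # pending binary operator
    negate = False   # whether the next term is preceded by NOT
    for tok in query.split():
        if tok in ("AND", "OR"):
            op = tok
        elif tok == "NOT":
            negate = not negate
        else:
            postings = set(inverted.get(tok.lower(), set()))
            if negate:
                postings = all_docs - postings  # complement over the collection
                negate = False
            if result is None:
                result = postings
            elif op == "OR":
                result = result | postings
            else:
                result = result & postings  # AND, also used when no operator is given
            op = None
    return sorted(result or set())


# Illustrative index, not built from the real document collection.
sample_index = {"neural": {1, 3}, "retrieval": {2, 3}, "boolean": {2}}
sample_docs = {1, 2, 3}
matches = evaluate_boolean("retrieval AND NOT boolean", sample_index, sample_docs)
```

NOT is handled as the complement of a term's postings against the set of all document IDs, so the evaluator needs to know the full collection and not only the index.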

Proximity Queries

Proximity queries allow users to find terms within a specified distance of each other. Enter your query in the format 'term1 term2 /distance' and click "Process Query" to retrieve relevant documents.
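
A proximity query in this format could be handled with a positional index roughly as follows. This is a sketch under stated assumptions, not the project's implementation. It treats "within distance k" as an absolute difference of at most k between token positions, in either order, and the names parse_proximity and proximity_search are hypothetical.

```python
import re


def parse_proximity(query):
    """Split 'term1 term2 /k' into (term1, term2, k), or return None."""
    match = re.fullmatch(r"\s*(\S+)\s+(\S+)\s*/\s*(\d+)\s*", query)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower(), int(match.group(3))


def proximity_search(query, positional):
    """Return sorted doc IDs where both terms occur at most k positions apart.

    positional maps term -> {doc_id: [token positions]}.
    """
    parsed = parse_proximity(query)
    if parsed is None:
        raise ValueError("expected a query of the form 'term1 term2 /distance'")
    term1, term2, k = parsed
    docs1 = positional.get(term1, {})
    docs2 = positional.get(term2, {})
    hits = []
    # Only documents that contain both terms can match.
    for doc_id in docs1.keys() & docs2.keys():
        if any(abs(p1 - p2) <= k
               for p1 in docs1[doc_id]
               for p2 in docs2[doc_id]):
            hits.append(doc_id)
    return sorted(hits)


# Illustrative positional index, not built from the real documents.
sample_positions = {
    "neural": {1: [0], 3: [0, 7]},
    "networks": {1: [1]},
    "models": {3: [2], 4: [1]},
}
nearby = proximity_search("neural models /2", sample_positions)
```

Comparing every pair of positions is quadratic per document. That is fine for a small collection, while a merge over the two sorted position lists would scale better.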

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • This project was inspired by information retrieval concepts and algorithms.
  • Special thanks to the developers of NLTK for providing essential natural language processing tools.
  • Special thanks to Tom Schimansky for developing CustomTkinter that was used for making the GUI.
