
Introduction to Vector Databases and Multi-Modal Semantic Search

Thanks for taking this course. The code shown in the course is available here. To run the application, please follow the steps below.

Introduction

Vector databases are indispensable for applications requiring similarity search, including recommendation systems, content-based image retrieval, and personalized search. Leveraging efficient indexing and searching techniques, vector databases enhance the speed and accuracy of retrieving unstructured data already represented as vectors.
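At its core, a similarity search ranks stored vectors by their closeness to a query vector. As a minimal sketch (a brute-force scan, not the indexed search a real vector database performs), this can be done with NumPy using cosine similarity:

```python
import numpy as np

def cosine_top_k(query: np.ndarray, vectors: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k stored vectors most similar to the query."""
    # Normalize so the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q
    # Indices of the k highest scores, best match first.
    return np.argsort(-scores)[:k]

# Toy 4-dimensional "embeddings" standing in for a real index.
db = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
])
top = cosine_top_k(np.array([1.0, 0.05, 0.0, 0.0]), db, k=2)
print(top)
```

Real vector databases replace this linear scan with approximate indexes (e.g. HNSW or IVF) so the ranking stays fast at millions of rows.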

Embeddings

Embedding models play a crucial role in semantic search, allowing us to represent text, image, audio, or video data numerically. Sparse models produce long, mostly-zero vectors (one dimension per vocabulary term), while dense models compress information into comparatively short fixed-length vectors. We utilize the CLIP (Contrastive Language-Image Pretraining) model for multimodal search.
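To make the sparse/dense distinction concrete, here is a toy comparison (illustrative only, not the encoders used in this course): a bag-of-words vector is as long as the vocabulary and mostly zero, while a dense vector mixes information across every dimension (here faked with a random projection):

```python
import numpy as np

# A tiny hypothetical vocabulary for illustration.
vocabulary = ["red", "dress", "blue", "shirt", "cotton", "silk"]

def sparse_bow(text: str) -> np.ndarray:
    """Sparse representation: one dimension per vocabulary word, mostly zeros."""
    tokens = text.lower().split()
    return np.array([float(tokens.count(w)) for w in vocabulary])

vec = sparse_bow("red cotton dress")
print(vec)                    # three of six dimensions are nonzero

# A stand-in for a learned dense encoder: project the sparse vector down.
rng = np.random.default_rng(0)
projection = rng.normal(size=(3, len(vocabulary)))
dense = projection @ vec      # every dimension now carries mixed information
print(dense.shape)
```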

CLIP Model

Developed by OpenAI in 2021, CLIP is a Contrastive Language-Image Pretraining model trained on over 400 million image-text pairs. It pairs an image encoder with a text encoder that project into a shared embedding space, and the model is comparatively lightweight (about 600 MB). While used for search operations in this demonstration, CLIP can also classify images, similar to other pre-trained models such as ResNet.
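CLIP classifies in a zero-shot fashion: it embeds the image and several candidate text labels into the shared space, then takes a softmax over the scaled similarities. The mechanics can be sketched with placeholder vectors (real CLIP embeddings are 512-dimensional; these toy 3-dimensional ones are purely illustrative):

```python
import numpy as np

def zero_shot_scores(image_vec, text_vecs, temperature=100.0):
    """Softmax over cosine similarities between one image and candidate labels."""
    img = image_vec / np.linalg.norm(image_vec)
    txt = text_vecs / np.linalg.norm(text_vecs, axis=1, keepdims=True)
    logits = temperature * (txt @ img)   # CLIP scales similarities by a learned temperature
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

# Toy vectors standing in for CLIP embeddings of an image and three label prompts.
image = np.array([0.9, 0.1, 0.0])
labels = np.array([
    [1.0, 0.0, 0.0],   # "a photo of a dress"
    [0.0, 1.0, 0.0],   # "a photo of a shirt"
    [0.0, 0.0, 1.0],   # "a photo of shoes"
])
probs = zero_shot_scores(image, labels)
print(probs.argmax())
```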

Dataset

The dataset used in this project is sourced from a Hugging Face image dataset. It comprises roughly 44,000 rows of fashion-product images together with metadata such as title, color, and size. The text fields were embedded with the sentence-transformers/all-MiniLM-L6-v1 model, while the images were embedded with the CLIP model.

Feel free to explore the provided Jupyter notebook to delve into the implementation details of multimodal search using any vector database and CLIP embeddings.

Dependencies

To run this project, follow these steps:

  1. Clone the Repository:

    git clone https://github.com/UmarIgan/CourseApp.git
  2. Create Virtual Environment:

    python -m venv venv
  3. Install Dependencies:

    source venv/bin/activate  # On Windows, use 'venv\Scripts\activate'
    pip install -r requirements.txt
  4. Run the Application:

    python main.py
  5. Wait for the Models to Load:

    • After the application starts, wait for the image and text embedding models to download and load. This may take a few minutes.
  6. Wait for the Dataset Download:

    • The application then downloads the required dataset; this will also take some time.
  7. Access the Application:

    • Once the application is running, navigate to the provided localhost URL (usually http://127.0.0.1:5000/) in your web browser.

    • The application page will allow you to search for images by text or image. For text searches, the sentence-transformers/all-MiniLM-L6-v1 model is used, and for image searches, the CLIP model is employed.
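The routing the search page performs can be sketched as follows; `embed_text` and `embed_image` are hypothetical stubs standing in for the two real models (MiniLM for text, CLIP for images):

```python
import numpy as np

def embed_text(query: str) -> np.ndarray:
    """Hypothetical stand-in for the MiniLM text encoder."""
    rng = np.random.default_rng(abs(hash(query)) % (2**32))
    return rng.normal(size=8)

def embed_image(image_bytes: bytes) -> np.ndarray:
    """Hypothetical stand-in for the CLIP image encoder."""
    rng = np.random.default_rng(len(image_bytes))
    return rng.normal(size=8)

def embed_query(query) -> np.ndarray:
    """Route a query to the matching encoder, as the search page does."""
    if isinstance(query, bytes):       # an uploaded image arrives as raw bytes
        return embed_image(query)
    return embed_text(query)           # otherwise treat the query as text
```

In the real application both encoders write into the same vector index, so either kind of query retrieves the same catalog of products.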
