zperzendetta / meme_search

This project forked from neonwatty/meme_search

Index your memes by their content and text, making them easily retrievable for your meme warfare pleasures.

meme_search's Introduction

Meme Search app, walkthrough, and demo

Use Python and AI to index your memes by their content and text, making them easily retrievable for your meme warfare pleasures.

Introduction

This repository contains code, a walkthrough notebook (meme_search_walkthrough.ipynb), and a streamlit demo app for indexing, searching, and retrieving your memes via semantic search over their content and text.

All processing - from image-to-text extraction, to vector embedding, to search - is performed locally.

Pipeline overview

This meme search pipeline is built using the following open source components:

  • moondream: a tiny, kickass vision language model used for image captioning / extracting image text
  • all-MiniLM-L6-v2: a very popular text embedding model
  • faiss: a fast and efficient vector db
  • sqlite: the greatest database of all time, used for data indexing
  • streamlit: for serving up the app
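To make the flow between these components concrete, here is a minimal, self-contained sketch of how captions, embeddings, and a lookup table fit together. It is not the repo's code: `toy_embed` is a bag-of-words stand-in for all-MiniLM-L6-v2, the brute-force cosine ranking stands in for faiss, and the captions, paths, and table layout are hypothetical. Only sqlite is used for real here.

```python
import math
import sqlite3
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical captions, as a vision-language model might produce them.
captions = {
    "data/input/cat.jpg": "a grumpy cat with the text no",
    "data/input/dog.jpg": "a dog sitting in a burning room saying this is fine",
}

# sqlite maps each vector's row id back to its image path (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memes (id INTEGER PRIMARY KEY, path TEXT, caption TEXT)")
vectors = {}
for path, caption in captions.items():
    cur = conn.execute(
        "INSERT INTO memes (path, caption) VALUES (?, ?)", (path, caption)
    )
    vectors[cur.lastrowid] = toy_embed(caption)


def search(query, k=1):
    """Return the image paths of the k captions most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(vectors, key=lambda i: cosine(q, vectors[i]), reverse=True)
    rows = [
        conn.execute("SELECT path FROM memes WHERE id = ?", (i,)).fetchone()
        for i in ranked[:k]
    ]
    return [row[0] for row in rows]


print(search("this is fine dog"))
```

In the real pipeline the embedding step would capture meaning rather than exact word overlap, which is what makes the search semantic.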

Installation instructions

To build this tool for your own memes, clone the repo and install the requirements file:

pip install -r requirements.txt

Note: the pinned versions in the requirements file are needed to avoid a nasty segmentation fault involving sentence-transformers, as of 6/5/2024.

Alternatively, you can install everything you need with docker, using the compose file found in the repo. The command to install the requirements and start the server is:

docker compose up

Start the streamlit server

After indexing your memes, start the streamlit app to semantically search for and retrieve them:

python -m streamlit run meme_search/app.py

To start the app via docker compose, use:

docker compose up

Note: you can drag and drop any retrieved meme directly from the streamlit app into any messaging app of your choice.

Index your own memes

Place any images / memes you would like indexed for the search app in this repo's subdirectory:

data/input/

You can clear out the default test images in this location first, or leave them.
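Before indexing, it can help to confirm which files in that folder look like images. The following is a small sketch for that check only; the `list_images` helper and the extension set are assumptions for illustration, and the repo's indexer may accept a different set of formats.

```python
from pathlib import Path

# Assumed set of extensions; the repo's indexer may accept a different set.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def list_images(folder):
    """Return image files directly inside `folder`, sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


if __name__ == "__main__":
    # Run from the repo root so the relative path resolves.
    for image in list_images("data/input"):
        print(image)
```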

Next, at your terminal, run the following command:

python meme_search/utilities/create.py

or, if running the server via docker, use:

docker exec meme_search python meme_search/utilities/create.py

You will see printouts at the terminal indicating success of the 3 main stages for making your memes searchable. These steps are

  1. extract: get text descriptions of each image, including ocr of any text on the image, using the kickass tiny vision-llm moondream

  2. embed: window and embed each image's text description using a popular embedding model - sentence-transformers/all-MiniLM-L6-v2

  3. index: store the embeddings in faiss, an open source, local vector database, and store the references connecting the embeddings to their images in the greatest little db of all time - sqlite
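The "window" part of the embed stage can be pictured as splitting each description into overlapping chunks of words, so that each chunk gets its own embedding. A minimal sketch follows; the `window_text` helper, the window size, and the stride are hypothetical choices for illustration, not the values used in create.py.

```python
def window_text(text, size=5, stride=3):
    """Split text into overlapping windows of `size` words, stepping by `stride`."""
    words = text.split()
    if len(words) <= size:
        # Short descriptions fit in a single window (or none, if empty).
        return [" ".join(words)] if words else []
    windows = []
    for start in range(0, len(words), stride):
        windows.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            # This window already reaches the last word; stop here.
            break
    return windows


print(window_text("a b c d e f g h"))
# → ['a b c d e', 'd e f g h']
```

Overlapping windows mean a phrase that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing more vectors per image.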

meme_search's People

Contributors

neonwatty, thijsvanloef
