align-macridvae's Introduction

This repository contains the implementation of Align MacridVAE, a multimodal recommender that suggests items to users based on their preferences. To learn more, see the paper presented at ECIR 2024.

The project is implemented in PyTorch. It is a shallow Variational Autoencoder with a pre-training step that aligns image and text representations.

Requirements

On the software side, you will need the following tools:

  • Python 3.10 or later
  • CUDA 12.1 or later, if you plan to train this on a GPU

Regarding hardware, this project is meant to run on NVIDIA GPUs, such as those found in personal laptops or in datacenters. It can also run on the CPU, but it will be much slower. We tested it on V100, A100 series and RTX 20 series GPUs. The model is relatively small and simple, and we don't load larger models like CLIP, BERT or ViT during training or inference: items are preprocessed before running through the model to simplify training.
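As a quick sanity check of the requirements above, the following minimal sketch reports whether the interpreter is recent enough and whether a CUDA device is visible. The function name describe_environment is hypothetical and not part of this repository; it only uses the standard library plus torch.cuda.is_available() when PyTorch happens to be installed.

```python
import importlib.util
import sys


def describe_environment() -> dict:
    """Summarize the local setup (hypothetical helper, not part of the repo)."""
    info = {
        # The README asks for Python 3.10 or later.
        "python_ok": sys.version_info >= (3, 10),
        # Detect PyTorch without importing it, so this works before installation.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # Fall back to the CPU unless a CUDA device is actually visible.
        "device": "cpu",
    }
    if info["torch_installed"]:
        import torch

        if torch.cuda.is_available():
            info["device"] = "cuda"
    return info


if __name__ == "__main__":
    print(describe_environment())
```

If the reported device is "cpu", training should still work, only more slowly.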

Installation

First, install the dependencies listed in requirements.txt:

pip install -r requirements.txt

Next, fetch the datasets. They are hosted on Kaggle and can be downloaded through the web UI or with the command-line tools. For example, if you have already set up your Kaggle credentials:

# Optional, you can download the dataset through the website
kaggle datasets download ignacioavas/alignmacrid-vae

unzip alignmacrid-vae.zip -d RecomData/
rm alignmacrid-vae.zip
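On systems without the unzip tool, the extraction step can be approximated with Python's standard library. This is a minimal sketch: the function name extract_dataset and its parameters are assumptions made for illustration, and it mirrors only the two shell commands above (extract into RecomData/, then delete the archive).

```python
import zipfile
from pathlib import Path


def extract_dataset(archive: str, target_dir: str = "RecomData",
                    delete_archive: bool = True) -> list[str]:
    """Extract a downloaded archive and optionally delete it (hypothetical helper)."""
    archive_path = Path(archive)
    target = Path(target_dir)
    # Create the target directory if needed, like `unzip -d` does.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target)
        names = zf.namelist()
    if delete_archive:
        # Equivalent to the `rm` step once extraction has succeeded.
        archive_path.unlink()
    return names


if __name__ == "__main__":
    if Path("alignmacrid-vae.zip").exists():
        print(extract_dataset("alignmacrid-vae.zip"))
```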

The dataset contains data from subcategories of the Amazon dataset, MovieLens 25M and Bookcrossing. These datasets were prepared by adding images, filtering out missing items, and then passing the textual and visual representations through encoders like BERT, CLIP or ViT. You can learn more by reading the README.md in the dataset root directory. The preprocessing code for building the datasets is available at Align-MacridVAE-data.

Running

Once the datasets are downloaded, you can train a model by running the main.py script with the train argument. For example, to train on the Amazon Musical Instruments dataset encoded with CLIP for both the visual and textual modalities, run the following command:

python main.py train --data Musical_Instruments-clip_clip

The training code will create an output under the run/ directory, named after the dataset and the model parameters, for example: Musical_Instruments-clip_clip-AlignMacridVAE-50E-100B-0.001L-0.0001W-0.5D-0.2b-7k-200d-0.1t-98765s. The model.pkl file contains the trained model.
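When comparing several runs, it can help to split such a name back into its parts. The sketch below is a hypothetical helper (parse_run_name is not part of this repository) that assumes the trailing tokens always have the form of a number followed by a single letter, as in the example name. It deliberately keeps the letter suffixes as opaque keys rather than guessing which hyperparameter each one stands for.

```python
import re

# A trailing token such as "50E" or "0.001L": a number followed by one letter.
TOKEN = re.compile(r"^(\d+(?:\.\d+)?)([A-Za-z])$")


def parse_run_name(name: str) -> dict:
    """Split a run name into its prefix and suffix-keyed values (hypothetical helper)."""
    parts = name.split("-")
    params = {}
    # Walk from the right so hyphens inside the dataset name are left untouched.
    while parts:
        match = TOKEN.match(parts[-1])
        if match is None:
            break
        value, suffix = match.groups()
        # Suffixes are case-sensitive: "D" and "d" are distinct keys.
        params[suffix] = float(value) if "." in value else int(value)
        parts.pop()
    return {"prefix": "-".join(parts), "params": params}


if __name__ == "__main__":
    example = ("Musical_Instruments-clip_clip-AlignMacridVAE-50E-100B-0.001L-"
               "0.0001W-0.5D-0.2b-7k-200d-0.1t-98765s")
    print(parse_run_name(example))
```

Run python main.py --help to find out what each parameter means; this sketch does not encode that mapping.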

Run python main.py --help to see all available parameters.

Evaluating

To evaluate a trained model, pass the test mode instead. It will try to load the model from the run directory, provided it has already been trained. For example, to evaluate the same model as above, run the following command:

python main.py test --data Musical_Instruments-clip_clip
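To chain training and evaluation for one dataset, the two commands can be wrapped in a small script. This is a minimal sketch under stated assumptions: build_command and train_then_test are hypothetical names, the script is expected to run from the repository root where main.py lives, and only the train and test modes and the --data flag shown in this README are used.

```python
import subprocess
import sys


def build_command(mode: str, data: str) -> list[str]:
    """Build the main.py invocation for one mode (hypothetical helper)."""
    # Only the two modes documented in this README are accepted here.
    if mode not in {"train", "test"}:
        raise ValueError(f"unsupported mode: {mode!r}")
    # Use the current interpreter so the active virtual environment is respected.
    return [sys.executable, "main.py", mode, "--data", data]


def train_then_test(data: str) -> None:
    """Train a model and then evaluate it, stopping at the first failure."""
    for mode in ("train", "test"):
        subprocess.run(build_command(mode, data), check=True)


if __name__ == "__main__":
    train_then_test("Musical_Instruments-clip_clip")
```

Because check=True raises on a non-zero exit code, evaluation is skipped if training fails.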

align-macridvae's People

Contributors

igui, zhouyw16