COLING 2022: An Information Minimization Contrastive Learning Model for Unsupervised Sentence Embeddings Learning.

License: Apache License 2.0


An Information Minimization Based Contrastive Learning Model for Unsupervised Sentence Embeddings Learning.

This repository contains the code and pre-trained models for our paper An Information Minimization Based Contrastive Learning Model for Unsupervised Sentence Embeddings Learning.
Updates

  • 10/03: Some information will be updated later.
  • 9/16: We release our code.
  • 8/27: Our paper has been accepted to COLING 2022!
Directory

  • Overview
  • Train
    • Requirements
    • Training
  • Evaluation
  • Language Models
  • Bugs or Questions
  • Citation

    Overview

We propose InforMin-CL, a contrastive learning model that discards redundant information during the pre-training phase. InforMin-CL keeps important information and forgets redundant information through contrast and reconstruction operations. The figure below illustrates our model.

(Figure: overview of the InforMin-CL model)
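As a rough sketch (not the authors' exact implementation), the contrast and reconstruction operations can be illustrated with an InfoNCE-style contrastive term plus a reconstruction term over two views of the same sentences; the function names, temperature value, and equal weighting below are all assumptions:

```python
import numpy as np

def info_nce(z1, z2, temperature=0.05):
    """Toy InfoNCE contrastive loss over two batches of sentence embeddings,
    where matching rows are positive pairs. Illustration only; the temperature
    is an assumption, not the paper's setting."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature  # (batch, batch) cosine-similarity matrix
    # cross-entropy with the targets on the diagonal
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def reconstruction_loss(z, z_reconstructed):
    """Toy mean-squared reconstruction error, standing in for the paper's
    reconstruction operation."""
    return np.mean((z - z_reconstructed) ** 2)

rng = np.random.default_rng(0)
z1 = rng.normal(size=(4, 8))
z2 = z1 + 0.01 * rng.normal(size=(4, 8))  # two "views" of the same sentences
total = info_nce(z1, z2) + reconstruction_loss(z1, z2)
```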

    Train

In the following section, we describe how to train an InforMin-CL model using our code.

    Requirements

First, install PyTorch by following the instructions on the official website. To faithfully reproduce our results, please use the 1.7.1 version matching your platform/CUDA version. PyTorch versions higher than 1.7.1 should also work. For example, if you use Linux and CUDA 11, install PyTorch with the following command,

    conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch

If you instead use CUDA < 11 or a CPU-only machine, install PyTorch with the following command,

    pip install torch==1.7.1

    Then run the following script to install the remaining dependencies,

    pip install -r requirements.txt
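Since any PyTorch version at or above 1.7.1 should work, a quick way to compare an installed version string (e.g. `torch.__version__`) against that minimum is sketched below; the helper name is ours:

```python
def version_at_least(version, minimum="1.7.1"):
    """Return True if `version` (e.g. torch.__version__) meets the minimum
    required by this repository. Simple numeric comparison that ignores
    local build suffixes such as '+cu110'."""
    def parse(v):
        return tuple(int(p) for p in v.split("+")[0].split("."))
    return parse(version) >= parse(minimum)
```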

    Training

python train.py \
    --model_name_or_path bert-base-uncased

    Evaluation

Our evaluation code for sentence embeddings is based on a modified version of SentEval. It evaluates sentence embeddings on unsupervised tasks (semantic textual similarity, STS) and supervised tasks. For unsupervised tasks, our evaluation uses the "all" setting and reports Spearman's correlation.
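For reference, Spearman's correlation is the Pearson correlation of the rank-transformed scores; a minimal NumPy sketch (assuming no tied ranks; the function name is ours):

```python
import numpy as np

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks.
    Assumes no ties, which suffices for illustration."""
    rx = np.argsort(np.argsort(x)).astype(float)  # rank of each element
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))
```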

    Before evaluation, please download the evaluation datasets by running
    cd SentEval/data/downstream/
bash download_dataset.sh

Then, back in the root directory, you can evaluate any transformers-based pre-trained model using our evaluation code. For example,

python evaluation.py \
    --model_name_or_path informin-cl-bert-base-uncased \
    --pooler cls \
    --text_set sts \
    --mode test

    Language Models

The language models whose performance is reported in the paper are available in the Hugging Face Model Repository. To load a model in Python, just replace the model name. With these models you should be able to reproduce the benchmark results reported in the paper.
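A minimal loading sketch under these assumptions: the hub name `informin-cl-bert-base-uncased` is copied from the evaluation example above and may not match the exact published checkpoint name, and `cls_pool` is our illustration of the `--pooler cls` option, not the repository's code:

```python
import numpy as np

def load_sentence_encoder(model_name="informin-cl-bert-base-uncased"):
    """Load a checkpoint from the Hugging Face Hub by name; to use another
    model, just replace `model_name`. The default name here is an assumption
    taken from the evaluation example in this README."""
    from transformers import AutoModel, AutoTokenizer  # lazy import
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    return tokenizer, model

def cls_pool(last_hidden_state):
    """CLS pooling: use the first token's hidden state as the sentence
    embedding, mirroring the `--pooler cls` evaluation option."""
    return last_hidden_state[:, 0, :]

# Shape check with a dummy (batch, seq_len, hidden) tensor:
dummy = np.zeros((2, 3, 768))
assert cls_pool(dummy).shape == (2, 768)
```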

    Bugs or Questions?

If you have any questions about the code or the paper, feel free to contact Shaobin ([email protected]). If you encounter any problems when using the code, or want to report a bug, you can open an issue. Please describe the problem in detail so we can give you a hand!

    Citation

    Please cite our paper if you use InforMin-CL in your work:

@inproceedings{chen2022informin-cl,
    title={An Information Minimization Contrastive Learning Model for Unsupervised Sentence Embeddings Learning},
    author={Chen, Shaobin and Zhou, Jie and Sun, Yuling and He, Liang},
    booktitle={International Conference on Computational Linguistics (COLING)},
    year={2022}
}
    


    informin-cl's Issues

    Some questions about reconstruction loss

    Hi authors,
    When discarding irrelevant information using the minimization of information entropy, it seems that the implementation of this part has been commented out. Should we uncomment the following line when running the code?

    #loss += cls.model_args.inverse_predictive_weight * inverse_predictive_loss(z1_norm, z2_norm)
