thu-bpm / rapl

Code and data for EMNLP 2023 paper "RAPL: A Relation-Aware Prototype Learning Approach for Few-Shot Document-Level Relation Extraction"

Home Page: https://aclanthology.org/2023.emnlp-main.316/

License: MIT License

Shell 12.30% Python 87.70%

rapl's Introduction

RAPL: Relation-Aware Prototype Learning for Few-Shot DocRE

This repository contains the data, code and trained models for the paper "RAPL: A Relation-Aware Prototype Learning Approach for Few-Shot Document-Level Relation Extraction".

Overview

In this work, we present a relation-aware prototype learning method (RAPL) for few-shot document-level relation extraction. We reframe the construction of relation prototypes at the instance level and further propose a relation-weighted contrastive learning method to jointly refine the relation prototypes. We also design a task-specific NOTA prototype generation strategy to better capture the NOTA semantics in each task.

You can find more details of this work in our paper.

Setup

Install dependencies

To run the code, install the following dependency packages (the version used in our experiments is given in parentheses):

  • apex (0.1)
  • numpy (1.19.4)
  • opt_einsum (3.3.0)
  • torch (1.9.0+cu111)
  • tqdm (4.64.0)
  • transformers (3.4.0)
  • wandb (0.12.21)
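
A quick way to see how a local environment differs from the versions listed above is sketched below. It is not part of the repository: the helper name `compare_versions` is made up for illustration, and it assumes each package is installed under the distribution name shown in the list (apex in particular is often built from source and may not report a version this way).

```python
from importlib import metadata

# Versions the list above reports for the experiments.
TESTED_VERSIONS = {
    "apex": "0.1",
    "numpy": "1.19.4",
    "opt_einsum": "3.3.0",
    "torch": "1.9.0+cu111",
    "tqdm": "4.64.0",
    "transformers": "3.4.0",
    "wandb": "0.12.21",
}


def compare_versions(tested, lookup=metadata.version):
    """Return {package: (tested_version, installed_version_or_None)} for each mismatch.

    `lookup` is injectable so the function can be exercised without
    touching the real environment (hypothetical helper, not repository code).
    """
    mismatches = {}
    for name, wanted in tested.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed under this name
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions(TESTED_VERSIONS).items():
        print(f"{name}: tested with {wanted}, found {installed or 'not installed'}")
```

A mismatch is not necessarily a problem; the list only records what the experiments were run with.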

Trained models

We release sample trained models for each task setting on Tsinghua Cloud. To reproduce the results in the paper, download the corresponding models and place them in the checkpoints directory.

Datasets

Our experiments are based on two benchmarks, FREDo and ReFREDo. All relevant data files are located in the dataset directory.

FREDo

FREDo is a few-shot document-level relation extraction benchmark consisting of two main tasks (in-domain / cross-domain) with a 1-Doc and a 3-Doc subtask each. The relevant data files include:

  • dataset/[train_docred, dev_docred, test_docred, test_scierc].json contain all annotated documents used for training and testing.
  • dataset/[test_docred_1_doc_indices, test_docred_3_doc_indices, test_scierc_1_doc_indices, test_scierc_3_doc_indices].json contain sampled episodes (only the indices of the documents and which relations are to be annotated/extracted).

ReFREDo

ReFREDo is a revised version of FREDo, which replaces the training, development and in-domain test document corpus with Re-DocRED, leading to more complete annotations. The relevant data files include:

  • dataset/[train_redocred, dev_redocred, test_redocred, test_scierc].json contain all annotated documents used for training and testing.
  • dataset/[test_redocred_1_doc_indices, test_redocred_3_doc_indices, test_scierc_1_doc_indices, test_scierc_3_doc_indices].json contain sampled episodes (only the indices of the documents and which relations are to be annotated/extracted).
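
Before running an experiment, it can help to confirm that the files listed above are actually present. The sketch below only builds the file paths named in the two lists and reports which ones are missing; the function names and the `root` parameter are illustrative assumptions, and nothing is assumed about the contents of the JSON files.

```python
from pathlib import Path

# File stems copied from the FREDo and ReFREDo lists above.
DOCUMENT_FILES = {
    "FREDo": ["train_docred", "dev_docred", "test_docred", "test_scierc"],
    "ReFREDo": ["train_redocred", "dev_redocred", "test_redocred", "test_scierc"],
}
EPISODE_FILES = {
    "FREDo": [
        "test_docred_1_doc_indices",
        "test_docred_3_doc_indices",
        "test_scierc_1_doc_indices",
        "test_scierc_3_doc_indices",
    ],
    "ReFREDo": [
        "test_redocred_1_doc_indices",
        "test_redocred_3_doc_indices",
        "test_scierc_1_doc_indices",
        "test_scierc_3_doc_indices",
    ],
}


def expected_files(benchmark, root="dataset"):
    """Return the paths of all data files listed for `benchmark`."""
    stems = DOCUMENT_FILES[benchmark] + EPISODE_FILES[benchmark]
    return [Path(root) / f"{stem}.json" for stem in stems]


def missing_files(benchmark, root="dataset"):
    """Return the listed files that do not exist under `root`."""
    return [path for path in expected_files(benchmark, root) if not path.is_file()]


if __name__ == "__main__":
    for benchmark in DOCUMENT_FILES:
        missing = missing_files(benchmark)
        status = "complete" if not missing else f"missing {[p.name for p in missing]}"
        print(f"{benchmark}: {status}")
```

Run from the repository root, this prints one status line per benchmark.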

Quick Start

The scripts directory provides example scripts for running experiments under each task setting. For example, the following command runs the trained model on the in-domain 3-Doc test tasks of the ReFREDo benchmark:

sh scripts/refredo_indomain_3doc.sh

To train a model instead, comment out the --load_checkpoint argument in the script and set the --num_epochs argument to 25. The following command displays the details of each argument:

python src/main.py -h

Citation

Please kindly cite our paper if you use the data, code or models of RAPL in your work:

@inproceedings{meng-etal-2023-rapl,
    title = "{RAPL}: A Relation-Aware Prototype Learning Approach for Few-Shot Document-Level Relation Extraction",
    author = "Meng, Shiao  and
      Hu, Xuming  and
      Liu, Aiwei  and
      Li, Shuang  and
      Ma, Fukun  and
      Yang, Yawen  and
      Wen, Lijie",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    year = "2023",
    url = "https://aclanthology.org/2023.emnlp-main.316",
    doi = "10.18653/v1/2023.emnlp-main.316",
    pages = "5208--5226"
}

rapl's Issues

RuntimeError: Error(s) in loading state_dict for Encoder:

Hi Mr. Meng,

While running the model with the command provided in scripts/fredo_crossdomain_1doc.sh, I got the following errors:
(screenshot, 2024-01-06 172542)

I guess there is something that needs to be modified in the models you shared on Tsinghua Cloud. I am looking forward to your update. Thank you!

PytorchStreamReader failed reading zip archive: failed finding central directory

Hi,

When running the program, I noticed a discrepancy between the file size displayed for your checkpoint on Tsinghua Cloud and the size actually downloaded (the website displayed 885.6 MB, but the downloaded file was 844.6 MB). Initially I didn't pay much attention to it, but when I tried to execute the program, the error shown in the figure below appeared.

(figure: RAPL run failed)

After searching on the internet, I found that this error might be caused by an incomplete file. I therefore guess that the file on Tsinghua Cloud is incomplete, or that the problem is related to the browser I'm using. If possible, could you please share the file size you get when downloading and the browser you're using? Looking forward to your update.

Thank you.

Yu Chen
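
The incomplete-download guess above can be checked locally without loading the model. Since PyTorch 1.6, torch.save writes a zip archive by default, and the zip central directory sits at the end of the file, so a truncated download typically fails exactly this check. The sketch below uses only the standard library; `inspect_checkpoint` and the file path are placeholders, and the exact expected byte count is unknown here because the website shows only a rounded size.

```python
import os
import zipfile


def inspect_checkpoint(path, expected_bytes=None):
    """Report the size of `path` and whether it is a readable zip archive.

    `expected_bytes` is optional; pass it only if the exact size of the
    original file is known (hypothetical helper, not repository code).
    """
    size = os.path.getsize(path)
    report = {
        "size_bytes": size,
        # True only if the end-of-central-directory record can be found.
        "is_zip": zipfile.is_zipfile(path),
    }
    if expected_bytes is not None:
        report["size_matches"] = size == expected_bytes
    return report


if __name__ == "__main__":
    checkpoint = "checkpoints/model.pt"  # placeholder path, adjust to the downloaded file
    if os.path.exists(checkpoint):
        print(inspect_checkpoint(checkpoint))
```

If `is_zip` is false for a checkpoint saved in the zip-based format, downloading the file again (possibly with a different browser or a command-line downloader) is the first thing to try.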
