lasuie's Introduction

LasUIE: Latent Adaptive Structure-aware LM for Universal Information Extraction


The PyTorch implementation of the NeurIPS 2022 paper Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model.


🎉 Visit the project page here: LasUIE (http://haofei.vip/LasUIE-page/)


Quick Links


1. Methodology

1.1 Modeling Universal Information Extraction (UIE)

UIE has been proposed to unify all information extraction tasks in the NLP community, converting the structure prediction of IE tasks into sequence prediction via generative LMs.

All IE jobs essentially revolve around predicting two key elements: <mention spans> and/or their <semantic relations>. In this project, we thus reduce all IE tasks to three prototypes: span extraction, pair extraction and hyper-pair extraction:

  • I) Span Extraction, e.g.,

    • named entity recognition (NER)
    • aspect-based sentiment analysis (ABSA)
    • aspect-term extraction (ATE)
  • II) Pair Extraction, e.g.,

    • relation extraction (RE)
    • aspect-opinion pair extraction (AOP)
    • aspect-based sentiment triplet extraction (ASTE)
  • III) Hyper-pair Extraction, e.g.,

    • event extraction (EE)
    • semantic role labeling (SRL)
    • opinion role labeling (ORL)

Under this scheme, mention spans are described with <Span> terms and the corresponding <Span Attribute> labels; semantic relations are straightforwardly denoted with <Relation> labels.

All IE structures are then cast into a sequential representation: the Linearized Hierarchical Expression (LHE); a worked illustration follows the list below. For example,

  • in span extraction:

    • { ( Span1 , Attr1 ) , ... , ( Spani , Attri ) , ... }
  • in pair extraction:

    • { ... , ( Spani , Attri [ Relk ] Spanj , Attrj ) , ... }
  • in hyper-pair extraction:

    • { ... , ( Spani , Attri [ Relk ] Spanj , Attrj [ Relm ] Spann , Attrn , ... ) , ... }
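As a worked illustration (the sentence and the label names here are invented purely for exposition and are not taken from the repository's label sets), the LHE targets for an input like "Anna joined Google in Zurich" could look like:

  • span extraction (e.g., NER): { ( Anna , person ) , ( Google , organization ) , ( Zurich , location ) }

  • pair extraction (e.g., RE): { ( Anna , person [ work-for ] Google , organization ) }

  • hyper-pair extraction (e.g., EE/SRL): { ( joined , employment [ agent ] Anna , person [ place ] Zurich , location ) }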

1.2 UIE with Structure-aware Generative Language Model

As cast above, UIE faces two key challenges common to all IE tasks:

  • Boundary Identification of each span term (for UIE element I: span extraction);

  • Long-range Dependence between different span terms (for UIE element II: relation extraction).

We thus propose addressing these two challenges by modeling both the syntactic dependency structure and the constituency structure: the constituency syntax mostly benefits the first challenge, while the dependency structure aids the second. To implement this idea, we propose learning a Latent Adaptive Structure-aware generative language model for UIE, aka LasUIE.

LasUIE has a three-stage learning procedure:

  • Stage-I: unsupervised generic pre-training:

    • generally using an off-the-shelf, well-trained generative LM (GLM), e.g., BART or T5 (see the loading sketch after this list).
  • Stage-II: unsupervised structure-aware post-training:

    • a newly introduced procedure in this project, inserted between the pre-training and fine-tuning stages for structure learning.
  • Stage-III: supervised task-oriented structure fine-tuning:

    • a newly introduced procedure in this project, carried out along with the task-specific fine-tuning.
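For reference, the following minimal sketch shows what the Stage-I backbone amounts to: loading an off-the-shelf generative LM with Hugging Face Transformers. It is illustrative only; the model name and prompt are placeholders, and LasUIE itself wraps and extends the backbone (see engine/t5_modeling.py).

    # Minimal sketch: load an off-the-shelf generative LM (here T5) as a Stage-I backbone.
    # Illustrative only -- not the LasUIE model; the repository extends T5 in engine/t5_modeling.py.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-base")             # "t5-base" is a placeholder choice
    model = T5ForConditionalGeneration.from_pretrained("t5-base")

    inputs = tokenizer("Anna joined Google in Zurich", return_tensors="pt")
    outputs = model.generate(**inputs, max_length=64)              # plain seq2seq generation
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))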

1.2.1 Unsupervised structure-aware post-training

A Heterogeneous Structure Inductor (HSI) module is used to enrich the backbone GLM with sufficient structural knowledge in an unsupervised manner, reinforcing its awareness of linguistic syntax.

1.2.2 Supervised task-oriented structure fine-tuning

The syntactic attributes within the GLM are further adjusted (fine-tuned) with a stochastic policy-gradient algorithm that directly takes end-task performance as feedback, such that the learned structural features best match the needs of the end task. A generic sketch of this idea follows.
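The following is a minimal, generic sketch of the policy-gradient (REINFORCE) idea described above, not the repository's implementation; the reward function, the number of structural choices, and the baseline scheme are all hypothetical placeholders.

    # Generic REINFORCE sketch: tune discrete structural choices from non-differentiable end-task feedback.
    # Hypothetical, self-contained example -- not the code used in this repository.
    import torch

    logits = torch.zeros(8, requires_grad=True)            # learnable scores over 8 structural choices
    optimizer = torch.optim.Adam([logits], lr=1e-3)
    baseline = 0.0                                         # running reward baseline (variance reduction)

    def end_task_reward(choice: int) -> float:
        """Placeholder: return an end-task score (e.g., extraction F1) for this structural choice."""
        return choice / 8.0                                # dummy reward, for illustration only

    for step in range(100):
        dist = torch.distributions.Categorical(logits=logits)
        choice = dist.sample()                             # sample a discrete structural decision
        reward = end_task_reward(choice.item())            # feedback from end-task performance
        baseline = 0.9 * baseline + 0.1 * reward
        loss = -(reward - baseline) * dist.log_prob(choice)   # REINFORCE estimator
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()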


2. Code Usage

2.1 Requirement Installation

  • Step 1: install the base environment

    conda create -n lasuie python=3.8
  • Step 2: install PyTorch

     # CUDA 10.2
     conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=10.2 -c pytorch
    
     # CUDA 11.3
     conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
  • Step 3: install other requirements

    pip install -r requirements.txt
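Optionally, run a quick sanity check that the environment sees the expected PyTorch build and the GPU (assuming the CUDA install from Step 2):

     # Quick environment check after installation.
     import torch
     print(torch.__version__)             # expected: 1.10.0
     print(torch.cuda.is_available())     # True if the CUDA build matches your driver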

2.2 Code Structure

│---------------------------------------------------
├─config                           // configuration folder
│    ├─config.json                 // config for generic fine-tuning
│    └─config_struct_tune.json     // config for structural fine-tuning
│
├─data                             // data folder
│  ├─hyperpair                     // dataset for hyperpair extraction 
│  │  └─orl                        // task name
│  │      └─mpqa                   // dataset name
│  │          ├─labels.json        // template labels for hyperpair extraction 
│  │          ├─dev.json           // template dev set for hyperpair extraction 
│  │          ├─test.json          // template test set for hyperpair extraction 
│  │          └─train.json         // template train set for hyperpair extraction 
│  │  
│  ├─pair                           // dataset for pair extraction 
│  │  └─re  
│  │      └─nyt  
│  │          └─...
│  │  
│  ├─span                          // dataset for span extraction  
│  │  └─ner  
│  │      └─conll03  
│  │           └─...
│  │  
│  └─post-training                 // corpora for post-training of the GLM
│      ├─books-corpus  
│      └─wikipedia-en  
│---------------------------------------------------
├─checkpoint                       // saving model checkpoints
│    └─...
├─logs                             // saving experiment logs
│    └─...
├─test_output                      // saving testing/inference outputs
│    └─...
├─figures                          
├─requirements.txt                 
├─README.md             
├─LICENSE  
│---------------------------------------------------
├─engine                           // core codes here 
│    ├─constants.py
│    ├─cus_argument.py
│    ├─data_utils.py
│    ├─evaluating.py
│    ├─module.py
│    ├─t5_modeling.py
│    └─utils.py
│
├─run_struct_post_train.py          // entry of second phase of structural post-training
├─run_finetune.py                   // entry of third phase of generic fine-tuning
├─run_finetune_with_struct_tune.py  // entry of third phase of structural fine-tuning
├─run_inference.py                  // entry of the inference phase
└---------------------------------------------------

2.3 Running Pipeline

The general pipeline goes as:

Step 1         run_struct_post_train.py  
                          ↓ 
Step 2            run_finetune.py (first train, then eval)
                          ↓
Step 3       run_finetune_with_struct_tune.py
                          ↓          
Step 4             run_inference.py

2.3.1 Structure-aware post-training
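This phase is driven by run_struct_post_train.py (see the pipeline and code structure above). Presumably, the post-training corpora go under data/post-training/ (books-corpus and wikipedia-en), and post-training is then launched with (check the script's arguments before running):

    python run_struct_post_train.py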

2.3.2 Supervised fine-tuning

A. task-oriented fine-tuning

  • Choose ModelType.UIE or ModelType.LASUIE (in engine/constants.py) as the model type. The ModelType.LASUIE model is much more time-consuming than ModelType.UIE.

  • Configure all the arguments correctly in run_finetune.py#init_args() and the config.json file.

  • Start fine-tuning:

    python run_finetune.py

B. structure fine-tuning

  • Choose ModelType.LASUIE_STRUCT_TUNING (in engine/constants.py) as the backbone model.

  • Configure config_struct_tune.json.

  • Start structural fine-tuning:

    python run_finetune_with_struct_tune.py
  • Note: running run_finetune_with_struct_tune.py is time-consuming.

    • Structural fine-tuning is optional; you can use the generic fine-tuning (run_finetune.py) instead.
    • Recommended GPU requirement: >2 A100 (80G) GPUs.
  • Note: make sure B (structural fine-tuning) happens after A (generic fine-tuning), because starting structural tuning from scratch leads to non-convergence.

2.3.3 Inference

  • Configure the model_checkpoint argument correctly, pointing to the well-trained model.

  • Start inference:

    python run_inference.py
  • The prediction outputs will be converted to UIE structures and saved in the test_output folder.

2.4 Dataset & Evaluating

2.4.1 Dataset

  • Prepare your own data in the template format of data/hyperpair, data/pair or data/span.

  • Configure config/config.json and config/config_struct_tune.json before running the scripts.

2.4.2 Evaluating

  • During training (tuning), the monitoring metric is ROUGE, since this is a text generation process.

    • Only enable F1 metric monitoring once the model produces stable predictions.
  • The evaluation is based on exact match of spans and triplets; feel free to customize the evaluation metrics in engine/evaluating.py. A minimal sketch of exact-match F1 follows this list.
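The following is a minimal sketch of the exact-match micro-F1 idea, with invented example tuples; the repository's actual evaluation lives in engine/evaluating.py and may differ in detail.

    # Exact-match micro F1 over predicted vs. gold tuples (spans, pairs, or triplets).
    # Illustrative sketch with made-up data; see engine/evaluating.py for the real evaluation code.
    def exact_match_f1(pred, gold):
        """pred, gold: lists of hashable items, e.g. (span, attr) or (span, rel, span) tuples."""
        pred_set, gold_set = set(pred), set(gold)
        tp = len(pred_set & gold_set)
        precision = tp / len(pred_set) if pred_set else 0.0
        recall = tp / len(gold_set) if gold_set else 0.0
        return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # Example: two of three gold triplets are recovered exactly -> F1 = 0.8
    gold = [("Anna", "work-for", "Google"), ("Anna", "live-in", "Zurich"), ("Google", "based-in", "Zurich")]
    pred = [("Anna", "work-for", "Google"), ("Google", "based-in", "Zurich")]
    print(exact_match_f1(pred, gold))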


3. MISC

3.1 Citation

If you use this work or code, please kindly cite:

@inproceedings{fei2022lasuie,
  author = {Fei, Hao and Wu, Shengqiong and Li, Jingye and Li, Bobo and Li, Fei and Qin, Libo and Zhang, Meishan and Zhang, Min and Chua, Tat-Seng},
  booktitle = {Advances in Neural Information Processing Systems},
  title = {LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model},
  url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/63943ee9fe347f3d95892cf87d9a42e6-Paper-Conference.pdf},
  pages = {15460--15475},
  year = {2022}
}

3.2 Acknowledgement

This code partially draws on the following projects and papers: UIE, StructFormer, and Hugging Face T5.

3.3 License

The code is released under the Apache License 2.0, for non-commercial use only. Any commercial use requires formal permission from the authors in advance.

3.4 Contact

For any questions or issues, please contact @Hao Fei and @Shengqiong Wu.

lasuie's Issues

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

When I ran the NER task on multiple GPUs, the program crashed and logged the error message in the title. I tried to fix the bug and found the call to grad_fn at line 567 of run_struct_post_train.py, but when I add state.params.requires_grad_(True) or loss.requires_grad_(True), the program still doesn't work.
Hope you can give me some guidance, thanks!
The error messages are shown in the attached screenshot.

Corpus issue

When building the corpus, I tried to download Wikipedia from the link given in the paper, but it took me to the page shown in the attached screenshot; after clicking download I was redirected to GitHub, and the downloaded files contained only a single txt file. When running, I get FileNotFoundError: Unable to find '/home/machenghao/LasUIE-master/data/post-training/wikipedia-en/dev.txt', so the corpus seems to be incomplete. Is this a problem with my download, and how can I obtain the complete corpus? Hope you can help, many thanks!

Hello, I have met this error. Am I missing something about the pretrained model?

    Traceback (most recent call last):
      File "/home/LasUIE/run_finetune.py", line 289, in
        model_wrapper = ModelWrapper(**config)
      File "/home/LasUIE/run_finetune.py", line 46, in init
        self.model = get_model(self.model_type, self.lm_location)
      File "/home/LasUIE/run_finetune.py", line 35, in get_model
        return model_dict[model_type].from_pretrained(lm_location)
      File "/home/anaconda3/envs/lasuie/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2795, in from_pretrained
        ) = cls._load_pretrained_model(
      File "/home/anaconda3/envs/lasuie/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3008, in _load_pretrained_model
        raise ValueError(
    ValueError: The state dictionary of the model you are trying to load is corrupted. Are you sure it was properly saved?
