CL-Bert

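Coursework for a computational linguistics class.

The baseline method and the dataset split used in this project come from the ChID_Baseline project.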

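Dataset

The project uses the original ChID dataset and adds external idiom dictionary data from chinese-xinhua. The cleaned dictionary dataset is provided as the data file (after downloading, place it in the repository root).

Project data download link (valid until 2024-12-31 23:59). Download it and extract it into the project folder.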

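Data processing

  • Every sample with more than one idiom blank is split into multiple samples
  • Augmentation methods such as synonym replacement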
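
The splitting step above can be sketched as follows. This is a minimal illustration, not the project's own preprocessing code. It assumes the ChID-style layout in which each blank in `content` is marked with `#idiom#` and `groundTruth` and `candidates` hold one entry per blank. The function name `split_multi_blank` and the `[MASK]` / `[UNK]` fill-ins for the target blank and the other blanks are hypothetical choices.

```python
from typing import Dict, List

BLANK = "#idiom#"  # blank marker, assumed to follow the ChID convention


def split_multi_blank(sample: Dict, target: str = "[MASK]", other: str = "[UNK]") -> List[Dict]:
    """Turn one sample with k blanks into k single-blank samples.

    In the i-th output, blank i is marked with `target` and every other
    blank is filled with `other` (both fill-ins are assumptions).
    """
    parts = sample["content"].split(BLANK)
    n_blanks = len(parts) - 1
    results = []
    for i in range(n_blanks):
        pieces = [parts[0]]
        for j in range(n_blanks):
            pieces.append(target if j == i else other)
            pieces.append(parts[j + 1])
        candidates = sample["candidates"][i]
        results.append({
            "content": "".join(pieces),
            "candidates": candidates,
            # label = position of the correct idiom among this blank's candidates
            "label": candidates.index(sample["groundTruth"][i]),
        })
    return results


if __name__ == "__main__":
    toy = {
        "content": "A#idiom#B#idiom#C",
        "groundTruth": ["x2", "y1"],
        "candidates": [["x1", "x2"], ["y1", "y2"]],
    }
    for item in split_multi_blank(toy):
        print(item)
```

A sample with a single blank passes through unchanged apart from the marker substitution, so the same function can be applied to the whole dataset.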

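Data usage

The data is used in two ways

  • Idiom classification:

    • The 3848 idioms in the dataset are numbered to form the idiom vocabulary
    • Each sample contains:
      • senetence_token
      • sentence_mask
      • idiom_mask
      • idiom_candidate_index
      • label
  • Idiom explanation:

    • Every idiom is concatenated with its explanation in the form idiom + ':' + explanation
    • Each sample contains:
      • senetence_token
      • sentence_mask
      • idiom_mask
      • idiom_candidate_pattern_token
      • idiom_candidate_pattern_mask
      • label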
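
The two candidate representations above can be sketched as follows: vocabulary indices for idiom classification, and `idiom + ':' + explanation` strings for idiom explanation. This is a minimal sketch that stops before tokenization. The helper names and the toy explanations are hypothetical and are not taken from the project or from chinese-xinhua.

```python
from typing import Dict, Iterable, List


def build_idiom_vocab(idioms: Iterable[str]) -> Dict[str, int]:
    """Number each distinct idiom in first-seen order."""
    vocab: Dict[str, int] = {}
    for idiom in idioms:
        vocab.setdefault(idiom, len(vocab))
    return vocab


def candidate_indices(candidates: List[str], vocab: Dict[str, int]) -> List[int]:
    """Idiom classification: map each candidate idiom to its vocabulary index."""
    return [vocab[c] for c in candidates]


def candidate_patterns(candidates: List[str], explanations: Dict[str, str]) -> List[str]:
    """Idiom explanation: build idiom + ':' + explanation for each candidate."""
    return [c + ":" + explanations.get(c, "") for c in candidates]


if __name__ == "__main__":
    # Toy data for illustration only.
    idioms = ["画蛇添足", "守株待兔", "画蛇添足"]
    explanations = {"画蛇添足": "toy explanation A", "守株待兔": "toy explanation B"}
    vocab = build_idiom_vocab(idioms)
    print(candidate_indices(["守株待兔", "画蛇添足"], vocab))
    print(candidate_patterns(["守株待兔", "画蛇添足"], explanations))
```

The pattern strings would then be tokenized to produce idiom_candidate_pattern_token and idiom_candidate_pattern_mask. That step depends on the tokenizer the project uses and is left out here.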

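Model

This project builds on a RoBERTa model pre-trained on Chinese corpora. Depending on how the data is used, the models fall into two classes: classification models and contrastive models.

Classification model

  • model/baseline.py, model/ClassifyBert.py
    • Input: senetence_token, sentence_mask, idiom_mask, idiom_candidate_index, [label]
    • Output: predict
    • Loss function: CrossEntropyLoss
    • Metric: Accuracy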
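
To make the loss and the metric concrete, here is a framework-free sketch of cross-entropy over one blank's candidates, plus accuracy. It assumes the classification head produces one score per idiom in the vocabulary and that idiom_candidate_index selects the scores of the candidates. The README does not describe the head, so treat this as an illustration of the objective rather than the code in model/ClassifyBert.py. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def candidate_cross_entropy(logits: Sequence[float], candidate_index: Sequence[int], label: int) -> float:
    """Cross-entropy of the correct candidate under a softmax over the candidates only."""
    scores = [logits[i] for i in candidate_index]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[label]


def predict(logits: Sequence[float], candidate_index: Sequence[int]) -> int:
    """Return the position of the highest-scoring candidate."""
    scores = [logits[i] for i in candidate_index]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(predictions: List[int], labels: List[int]) -> float:
    """Fraction of blanks whose predicted candidate equals the label."""
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)


if __name__ == "__main__":
    vocab_logits = [0.0, 2.0, 1.0, -1.0]  # toy scores over a 4-idiom vocabulary
    candidates = [1, 3]                   # vocabulary ids of the candidates
    print(candidate_cross_entropy(vocab_logits, candidates, label=0))
    print(predict(vocab_logits, candidates))
```

In a PyTorch implementation the same quantity is what `torch.nn.CrossEntropyLoss` computes when it is given the candidate scores as logits and the label as the target class.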

classify model

Contrastive model

  • model/DualBert.py, model/ContrastiveBert.py
    • Input: senetence_token, sentence_mask, idiom_mask, idiom_candidate_pattern_token, idiom_candidate_pattern_mask, [label]
    • Output: predict
    • Loss function: InfoNCELoss
    • Metric: Accuracy
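
The InfoNCE objective named above can be sketched without a deep learning framework. The sketch assumes that the sentence side and each candidate pattern have already been encoded into fixed-size vectors, that similarity is cosine similarity (one of the sim_mode options used in the training commands), and that the incorrect candidates serve as negatives. The temperature value and the function names are assumptions, and this is not the project's InfoNCELoss implementation.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def info_nce(context: Sequence[float], candidates: Sequence[Sequence[float]],
             label: int, temperature: float = 0.1) -> float:
    """-log softmax of the positive candidate's similarity against all candidates."""
    sims = [cosine_similarity(context, c) / temperature for c in candidates]
    m = max(sims)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in sims))
    return log_z - sims[label]


if __name__ == "__main__":
    context_vec = [1.0, 0.0]
    candidate_vecs = [[1.0, 0.0], [0.0, 1.0]]
    print(info_nce(context_vec, candidate_vecs, label=0, temperature=1.0))
```

A lower temperature sharpens the softmax, so the loss penalizes a near-miss negative more strongly. The prediction is the candidate with the highest similarity.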

contrastive model

Combined model

The combined model learns both tasks within a single model.

combination model

Training

random initialized head

word classification

python train.py --model_type baseline --batch_size 24 --task_type IE --epoch 10 --warm_up_proportion 0.05

idiom classification

python train.py --model_type classify --batch_size 24 --task_type IC --epoch 10 --warm_up_proportion 0.05

idiom + cosine

python train.py --model_type dual --batch_size 24 --task_type IE --sim_mode cosine_similarity --epoch 10 --warm_up_proportion 0.05

[CLS] + cosine

python train.py --model_type dual --batch_size 24 --task_type IE --sim_mode cosine_similarity --idiom_use_cls --epoch 10 --warm_up_proportion 0.05

mask + cosine

python train.py --model_type dual --batch_size 24 --task_type IE --sim_mode cosine_similarity --idiom_use_mask --epoch 10 --warm_up_proportion 0.05

idiom + cross-attention

python train.py --model_type dual --batch_size 24 --task_type IE --sim_mode cross_attention --epoch 10 --warm_up_proportion 0.05

mask + cross-attention

python train.py --model_type dual --batch_size 24 --task_type IE --sim_mode cross_attention --idiom_use_mask --epoch 10 --warm_up_proportion 0.05
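
For orientation, the flags that appear in this README's training commands could be declared with `argparse` roughly as below. This is a hypothetical reconstruction from the commands alone, not the parser in train.py. The types, defaults, and help texts are assumptions, and train.py may accept further values or options that are not shown here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare only the flags that appear in the README's training commands."""
    parser = argparse.ArgumentParser(description="CL-Bert training (illustrative sketch)")
    parser.add_argument("--model_type", type=str, required=True,
                        help="values seen in the README: baseline, classify, dual")
    parser.add_argument("--task_type", type=str, required=True,
                        help="values seen in the README: IE, IC")
    parser.add_argument("--batch_size", type=int, default=24)           # default is an assumption
    parser.add_argument("--epoch", type=int, default=10)                # default is an assumption
    parser.add_argument("--warm_up_proportion", type=float, default=0.05)
    parser.add_argument("--sim_mode", type=str, default=None,
                        help="values seen in the README: cosine_similarity, cross_attention")
    parser.add_argument("--idiom_use_cls", action="store_true")
    parser.add_argument("--idiom_use_mask", action="store_true")
    parser.add_argument("--use_pretrained_generation", action="store_true")
    return parser


if __name__ == "__main__":
    command = ("--model_type dual --batch_size 24 --task_type IE "
               "--sim_mode cosine_similarity --idiom_use_cls "
               "--epoch 10 --warm_up_proportion 0.05")
    print(build_parser().parse_args(command.split()))
```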

pre-trained classification head

word classification

python train.py --model_type baseline --batch_size 24 --task_type IE --epoch 10 --warm_up_proportion 0.05 --use_pretrained_generation

idiom + cosine

python train.py --model_type dual --batch_size 24 --task_type IE --sim_mode cosine_similarity --epoch 10 --warm_up_proportion 0.05 --use_pretrained_generation

[CLS] + cosine

python train.py --model_type dual --batch_size 24 --task_type IE --sim_mode cosine_similarity --idiom_use_cls --epoch 10 --warm_up_proportion 0.05 --use_pretrained_generation

mask + cosine

python train.py --model_type dual --batch_size 24 --task_type IE --sim_mode cosine_similarity --idiom_use_mask --epoch 10 --warm_up_proportion 0.05 --use_pretrained_generation

idiom + cross-attention

python train.py --model_type dual --batch_size 24 --task_type IE --sim_mode cross_attention --epoch 10 --warm_up_proportion 0.05 --use_pretrained_generation

mask + cross-attention

python train.py --model_type dual --batch_size 24 --task_type IE --sim_mode cross_attention --idiom_use_mask --epoch 10 --warm_up_proportion 0.05 --use_pretrained_generation
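
Every command above passes `--warm_up_proportion 0.05`. The README does not say which schedule train.py applies. One common reading is a linear warm-up over the first 5% of optimizer steps followed by a linear decay to zero, which the sketch below computes as a multiplier on the base learning rate. The function name and the decay shape are assumptions.

```python
def lr_scale(step: int, total_steps: int, warm_up_proportion: float = 0.05) -> float:
    """Learning-rate multiplier: linear warm-up, then linear decay to zero."""
    warmup_steps = round(total_steps * warm_up_proportion)
    if step < warmup_steps:
        # ramp from 0 up to 1 across the warm-up steps
        return step / max(1, warmup_steps)
    # decay from 1 down to 0 across the remaining steps
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    total = 100
    for step in (0, 2, 5, 50, 100):
        print(step, lr_scale(step, total))
```

With 100 total steps the multiplier reaches 1.0 at step 5 and returns to 0.0 at step 100. The total step count would normally be the number of batches per epoch times the number of epochs.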

Experimental results

img.png

Contributors

lightet, xx-q
