alibaba / easynlp

EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit

License: Apache License 2.0

Python 57.65% Shell 0.20% Jupyter Notebook 42.15%
transformers bert nlp pretrained-models deep-learning pytorch fewshot-learning knowledge-distillation knowledge-pretraining text-image-retrieval

easynlp's Introduction



EasyNLP is a Comprehensive and Easy-to-use NLP Toolkit


EasyNLP Introduction in Chinese

EasyNLP is an easy-to-use NLP development and application toolkit in PyTorch, first released inside Alibaba in 2021. It is built with scalable distributed training strategies and supports a comprehensive suite of NLP algorithms for various NLP applications. EasyNLP integrates knowledge distillation and few-shot learning for landing large pre-trained models, together with various popular multi-modality pre-trained models. It provides a unified framework of model training, inference, and deployment for real-world applications. It has powered more than 10 BUs and more than 20 business scenarios within the Alibaba group. It is seamlessly integrated with Platform of AI (PAI) products, including PAI-DSW for development, PAI-DLC for cloud-native training, PAI-EAS for serving, and PAI-Designer for zero-code model training.

Main Features

  • Easy to use and highly customizable: In addition to providing easy-to-use and concise commands to call cutting-edge models, it also provides custom modules such as AppZoo and ModelZoo to make it easy to build NLP applications. It is equipped with the PAI PyTorch distributed training framework TorchAccelerator to speed up distributed training.
  • Compatible with open-source libraries: EasyNLP has APIs to support the training of models from Huggingface/Transformers with the PAI distributed framework. It also supports the pre-trained models in EasyTransfer ModelZoo.
  • Knowledge-injected pre-training: The PAI team has conducted extensive research on knowledge-injected pre-training and built a knowledge-injected model that won first place in the CCF knowledge pre-training competition. EasyNLP integrates these cutting-edge knowledge pre-trained models, including DKPLM and KGBERT.
  • Landing large pre-trained models: EasyNLP provides few-shot learning capabilities, allowing users to finetune large models with only a few samples to achieve good results. At the same time, it provides knowledge distillation functions to quickly distill large models into small and efficient models that are easy to deploy online.
  • Multi-modality pre-trained models: EasyNLP is not limited to NLP. It also supports various popular multi-modality pre-trained models for vision-language tasks that require visual knowledge. For example, it is equipped with CLIP-style models for text-image matching and DALLE-style models for text-to-image generation.

Technical Articles

We have a series of technical articles on the functionalities of EasyNLP.

Installation

You can set up from source:

$ git clone https://github.com/alibaba/EasyNLP.git
$ cd EasyNLP
$ python setup.py install

This repo is tested with Python 3.6 and PyTorch >= 1.8.
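
Alternatively, the library is also published on PyPI as pai-easynlp (the package name mentioned in the Issues section below), so installing via pip may work as well:

$ pip install pai-easynlp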

Quick Start

Now let's show how to use just a few lines of code to build a text classification model based on BERT.

from easynlp.appzoo import ClassificationDataset
from easynlp.appzoo import get_application_model, get_application_evaluator
from easynlp.core import Trainer
from easynlp.utils import initialize_easynlp, get_args
from easynlp.utils.global_vars import parse_user_defined_parameters
from easynlp.utils import get_pretrain_model_path

initialize_easynlp()
args = get_args()
user_defined_parameters = parse_user_defined_parameters(args.user_defined_parameters)
pretrained_model_name_or_path = get_pretrain_model_path(user_defined_parameters.get('pretrain_model_name_or_path', None))

train_dataset = ClassificationDataset(
    pretrained_model_name_or_path=pretrained_model_name_or_path,
    data_file=args.tables.split(",")[0],
    max_seq_length=args.sequence_length,
    input_schema=args.input_schema,
    first_sequence=args.first_sequence,
    second_sequence=args.second_sequence,
    label_name=args.label_name,
    label_enumerate_values=args.label_enumerate_values,
    user_defined_parameters=user_defined_parameters,
    is_training=True)

valid_dataset = ClassificationDataset(
    pretrained_model_name_or_path=pretrained_model_name_or_path,
    data_file=args.tables.split(",")[-1],
    max_seq_length=args.sequence_length,
    input_schema=args.input_schema,
    first_sequence=args.first_sequence,
    second_sequence=args.second_sequence,
    label_name=args.label_name,
    label_enumerate_values=args.label_enumerate_values,
    user_defined_parameters=user_defined_parameters,
    is_training=False)

model = get_application_model(app_name=args.app_name,
    pretrained_model_name_or_path=pretrained_model_name_or_path,
    num_labels=len(valid_dataset.label_enumerate_values),
    user_defined_parameters=user_defined_parameters)

trainer = Trainer(
    model=model,
    train_dataset=train_dataset,
    user_defined_parameters=user_defined_parameters,
    evaluator=get_application_evaluator(
        app_name=args.app_name,
        valid_dataset=valid_dataset,
        user_defined_parameters=user_defined_parameters,
        eval_batch_size=args.micro_batch_size))

trainer.train()

The complete example can be found here.
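
Note that initialize_easynlp() parses its configuration from the command line (for example, it asserts that --tables is provided, as seen in the Issues section below), so the script above should be launched with the same flags as the easynlp CLI. A hypothetical invocation, mirroring the AppZoo command that follows:

$ python main.py \
   --mode=train \
   --tables=train.tsv,dev.tsv \
   --input_schema=label:str:1,sid1:str:1,sid2:str:1,sent1:str:1,sent2:str:1 \
   --first_sequence=sent1 \
   --label_name=label \
   --label_enumerate_values=0,1 \
   --checkpoint_dir=./classification_model \
   --sequence_length=128 \
   --app_name=text_classify \
   --user_defined_parameters='pretrain_model_name_or_path=bert-small-uncased'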

You can also use AppZoo Command Line Tools to quickly train an App model. Take text classification on the SST-2 dataset as an example. First download train.tsv and dev.tsv, then start training:

$ easynlp \
   --mode=train \
   --worker_gpu=1 \
   --tables=train.tsv,dev.tsv \
   --input_schema=label:str:1,sid1:str:1,sid2:str:1,sent1:str:1,sent2:str:1 \
   --first_sequence=sent1 \
   --label_name=label \
   --label_enumerate_values=0,1 \
   --checkpoint_dir=./classification_model \
   --epoch_num=1  \
   --sequence_length=128 \
   --app_name=text_classify \
   --user_defined_parameters='pretrain_model_name_or_path=bert-small-uncased'

And then predict:

$ easynlp \
  --mode=predict \
  --tables=dev.tsv \
  --outputs=dev.pred.tsv \
  --input_schema=label:str:1,sid1:str:1,sid2:str:1,sent1:str:1,sent2:str:1 \
  --output_schema=predictions,probabilities,logits,output \
  --append_cols=label \
  --first_sequence=sent1 \
  --checkpoint_path=./classification_model \
  --app_name=text_classify

To learn more about the usage of AppZoo, please refer to our documentation.

ModelZoo

EasyNLP currently provides the following models in ModelZoo:

  1. PAI-BERT-zh (from Alibaba PAI): pre-trained BERT models with a large Chinese corpus.
  2. DKPLM (from Alibaba PAI): released with the paper DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for Natural Language Understanding by Taolin Zhang, Chengyu Wang, Nan Hu, Minghui Qiu, Chengguang Tang, Xiaofeng He and Jun Huang.
  3. KGBERT (from Alibaba Damo Academy & PAI): pre-trained BERT models with knowledge graph embeddings injected.
  4. BERT (from Google): released with the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
  5. RoBERTa (from Facebook): released with the paper RoBERTa: A Robustly Optimized BERT Pretraining Approach by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer and Veselin Stoyanov.
  6. Chinese RoBERTa (from HFL): the Chinese version of RoBERTa.
  7. MacBERT (from HFL): released with the paper Revisiting Pre-trained Models for Chinese Natural Language Processing by Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang and Guoping Hu.
  8. WOBERT (from ZhuiyiTechnology): the word-based BERT for the Chinese language.
  9. FashionBERT (from Alibaba PAI & ICBU): in progress.
  10. GEEP (from Alibaba PAI): in progress.
  11. Mengzi (from Langboat): released with the paper Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese by Zhuosheng Zhang, Hanqing Zhang, Keming Chen, Yuhang Guo, Jingyun Hua, Yulong Wang and Ming Zhou.
  12. Erlangshen (from IDEA): released from the repo.

Please refer to this readme for the usage of these models in EasyNLP. Meanwhile, EasyNLP supports loading pretrained models from Huggingface/Transformers; please refer to this tutorial for details.
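
For example, the AppZoo commands later on this page select a Huggingface checkpoint simply by passing its model ID as the pretraining path, e.g. by adding the flag below to a training command (the model ID here is illustrative):

--user_defined_parameters='pretrain_model_name_or_path=hfl/chinese-roberta-wwm-ext'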

EasyNLP Goes Multi-modal

EasyNLP also supports various popular multi-modality pre-trained models to support vision-language tasks that require visual knowledge. For example, it is equipped with CLIP-style models for text-image matching and DALLE-style models for text-to-image generation.

  1. Text-image Matching
  2. Text-to-image Generation
  3. Image-to-text Generation

Landing Large Pre-trained Models

EasyNLP provides few-shot learning and knowledge distillation to help land large pre-trained models.

  1. PET (from LMU Munich and Sulzer GmbH): released with the paper Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference by Timo Schick and Hinrich Schutze. We have made some slight modifications to make the algorithm suitable for the Chinese language.
  2. P-Tuning (from Tsinghua University, Beijing Academy of AI, MIT and Recurrent AI, Ltd.): released with the paper GPT Understands, Too by Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang and Jie Tang. We have made some slight modifications to make the algorithm suitable for the Chinese language.
  3. CP-Tuning (from Alibaba PAI): released with the paper Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning by Ziyun Xu, Chengyu Wang, Minghui Qiu, Fuli Luo, Runxin Xu, Songfang Huang and Jun Huang.
  4. Vanilla KD (from Alibaba PAI): distilling the logits of large BERT-style models to smaller ones (a generic sketch follows this list).
  5. Meta KD (from Alibaba PAI): released with the paper Meta-KD: A Meta Knowledge Distillation Framework for Language Model Compression across Domains by Haojie Pan, Chengyu Wang, Minghui Qiu, Yichang Zhang, Yaliang Li and Jun Huang.
  6. Data Augmentation (from Alibaba PAI): augmenting the data based on the MLM head of pre-trained language models.
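
For item 4 above, here is a minimal, generic sketch of vanilla logit distillation (temperature-scaled KL divergence between teacher and student logits); it illustrates the textbook technique and is not EasyNLP's actual implementation, so the function name and default temperature are illustrative:

import torch.nn.functional as F

def vanilla_kd_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions with the temperature, then match the
    # student's log-probabilities to the teacher's probabilities.
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # Scale by t**2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)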

EasyNLP provides a simple toolkit to benchmark CLUE datasets. You can use the following command to benchmark a CLUE dataset:

# Format: bash run_clue.sh device_id train/predict dataset
# e.g.: 
bash run_clue.sh 0 train csl
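
Similarly, following the format comment above, evaluating the trained checkpoint would be:

bash run_clue.sh 0 predict csl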

We've tested Chinese BERT and RoBERTa models on these datasets; the results on the dev sets are:

(1) bert-base-chinese:

Task  AFQMC   CMNLI   CSL     IFLYTEK  OCNLI   TNEWS   WSC
P     72.17%  75.74%  80.93%  60.22%   78.31%  57.52%  75.33%
F1    52.96%  75.74%  81.71%  60.22%   78.30%  57.52%  80.82%

(2) chinese-roberta-wwm-ext:

Task  AFQMC   CMNLI   CSL     IFLYTEK  OCNLI   TNEWS   WSC
P     73.10%  80.75%  80.07%  60.98%   80.75%  57.93%  86.84%
F1    56.04%  80.75%  81.50%  60.98%   80.75%  57.93%  89.58%

Here is the detailed CLUE benchmark example.

Tutorials

License

This project is licensed under the Apache License (Version 2.0). This toolkit also contains some code modified from other repos under other open-source licenses. See the NOTICE file for more information.

ChangeLog

  • EasyNLP v0.0.3 was released in 01/04/2022. Please refer to tag_v0.0.3 for more details and history.

Contact Us

Scan the following QR codes to join the Dingtalk discussion group. The group discussions are mostly in Chinese, but English is also welcome.

Reference

We have an arXiv paper that you can cite for the EasyNLP library:

@article{easynlp,
  doi = {10.48550/ARXIV.2205.00258},  
  url = {https://arxiv.org/abs/2205.00258},  
  author = {Wang, Chengyu and Qiu, Minghui and Zhang, Taolin and Liu, Tingting and Li, Lei and Wang, Jianing and Wang, Ming and Huang, Jun and Lin, Wei},
  title = {EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language Processing},
  publisher = {arXiv},  
  year = {2022}
}

easynlp's People

Contributors

0xjeffro, alibaba-oss, artiprocher, bingyan-liu, ceciliacenchen, chen9154, chywang, co63oc, deplay, edmundyanj, jerryli1981, jhuang1207, jpwang, kiritod, kk990926, lindawxd, longpai, lwmlyy, lzysaltedfish, minghui, nicholascao, olive331, tangmoming, ttliu-kiwi, wjn1996, yanfangli85, zhuxiangru, ztl-35


easynlp's Issues

./run_fewshot_cpt.sh ERROR

TypeError: type object got multiple values for keyword argument 'user_defined_parameters'

Using Contrastive Few shot Learner, using random label words only as place-holders
Traceback (most recent call last):
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/easynlp/appzoo/api.py", line 459, in
default_main_fn()
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/easynlp/appzoo/api.py", line 423, in default_main_fn
evaluator = get_application_evaluator(
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/easynlp/appzoo/api.py", line 303, in get_application_evaluator
return evaluator[key](valid_dataset, user_defined_parameters=user_defined_parameters, **kwargs)
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/easynlp/fewshot_learning/fewshot_evaluator.py", line 131, in init
anchor_dataset = FewshotBaseDataset(
TypeError: type object got multiple values for keyword argument 'user_defined_parameters'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 209685) of binary: /home/wpf/anaconda3/envs/fewshot/bin/python
Traceback (most recent call last):
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in
main()
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/torch/distributed/run.py", line 689, in run
elastic_launch(
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 116, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:


/home/wpf/anaconda3/envs/fewshot/lib/python3.8/site-packages/easynlp/appzoo/api.py FAILED

Root Cause:
[0]:
time: 2022-04-28_19:18:11
rank: 0 (local_rank: 0)
exitcode: 1 (pid: 209685)
error_file: <N/A>
msg: "Process failed with exitcode 1"

BUGs about amp

When I train models with AMP, I find that the model cannot converge.
[screenshot]

I found some bugs in /easynlp/core/trainer.py, as shown below:
[screenshot]

My analysis: the code in the red box clears the gradients, but when AMP is used, this code is never executed, which causes the problem.

I have now resolved it and will submit a pull request later.

We recommend using the optimizer with AMP in one of the following four settings (a sketch of the last setting follows the list):

  • bertadam: self._optimizer.step() + self._optimizer.zero_grad()
  • bertadam+amp: self._scaler.step(self._optimizer) + self._scaler.update() + self._optimizer.zero_grad()
  • adamw: torch.nn.utils.clip_grad_norm_() + self._optimizer.step() + self._lr_scheduler.step() + self._optimizer.zero_grad()
  • adamw+amp: torch.nn.utils.clip_grad_norm_() + self._scaler.step(self._optimizer) + self._scaler.update() + self._lr_scheduler.step() + self._optimizer.zero_grad()
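
As a reference for the adamw+amp setting, here is a minimal sketch of one training step with torch.cuda.amp (assuming model, optimizer, lr_scheduler, and dataloader are already defined, and that the model returns an object with a .loss attribute; this is illustrative, not EasyNLP's actual trainer code):

import torch

scaler = torch.cuda.amp.GradScaler()
for batch in dataloader:
    with torch.cuda.amp.autocast():  # forward pass in mixed precision
        loss = model(**batch).loss
    scaler.scale(loss).backward()    # scale the loss to avoid fp16 underflow
    scaler.unscale_(optimizer)       # unscale gradients before clipping
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    scaler.step(optimizer)           # skips the step if gradients contain inf/NaN
    scaler.update()                  # adjust the scale factor for the next step
    lr_scheduler.step()
    optimizer.zero_grad()            # the gradient-clearing step the issue reports as skipped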

your API does not work

It seems that your API does not work when I try to use AutoTokenizer.from_pretrained(); I get the error message "raise Exception(f'{pretrained_model_name_or_path} is not a filer or folder.')".
I also tried the code included in your GitHub repository and received the same error message.

Failed to run example script for sequence_classification_multilabel: Target size (torch.Size([32])) must be the same as input size (torch.Size([32, 27]))

I copied the training command from here and modified it a little.

python -m torch.distributed.launch $DISTRIBUTED_ARGS main.py \
  --mode train \
  --tables=train.csv,dev.csv \
  --input_schema=content_seq:str:1,label:str:1 \
  --first_sequence=content_seq \
  --label_name=label \
  --label_enumerate_values=母婴,三农,科学,美文,科技,时尚,房产,美食,艺术,职场,健康,财经,国际,家居,娱乐,文化,教育,游戏,读书,动漫,体育,旅游,汽车,搞笑,健身,宠物,育儿 \
  --checkpoint_dir=./multi_label_classification_model \
  --learning_rate=3e-5 \
  --epoch_num=1 \
  --random_seed=42 \
  --save_checkpoint_steps=100 \
  --sequence_length=128 \
  --train_batch_size=32 \
  --app_name=text_classify \
  --user_defined_parameters='pretrain_model_name_or_path=hfl/chinese-roberta-wwm-ext multi_label=True'

And here's the error stack.
[screenshot]

It seems the target has the wrong size (torch.Size([32])); it should be torch.Size([32, 27]) under the multi-label setting.

sequence_generation not implemented

I copied the training command from here and modified it a little.

easynlp \
  --app_name=sequence_generation \
  --mode train \
  --worker_gpu=1 \
  --tables=./cn_train.tsv,./cn_dev.tsv \
  --input_schema=title_tokens:str:1,content_tokens:str:1 \
  --first_sequence=content_tokens \
  --second_sequence=title_tokens \
  --label_name=title_tokens \
  --checkpoint_dir=./finetuned_zh_model \
  --micro_batch_size=8 \
  --sequence_length=512 \
  --save_checkpoint_steps=150 \
  --export_tf_checkpoint_type none \
  --user_defined_parameters 'pretrain_model_name_or_path=alibaba-pai/mt5-title-generation-zh copy=false max_encoder_length=512 min_decoder_length=12 max_decoder_length=32 no_repeat_ngram_size=2 num_beams=5 num_return_sequences=5'

And here's the error stack.
[screenshot]

I checked /home/pai/lib/python3.6/site-packages/easynlp-0.0.5-py3.6.egg/easynlp/appzoo/api.py and found that the SequenceGeneration module does not seem to be implemented.
[screenshot]

datasets support

  1. Add a dataset page and support downloading Chinese datasets.
  2. Support SOTA Chinese models.

No module named 'easy_predict'

When running main.py, an error is reported that the easy_predict module cannot be found:

/usr/bin/python3 /Users/gavin/Downloads/PycharmProjects/EasyNLP/EasyNLP/examples/quick_start/main.py
No module named 'easy_predict'
Traceback (most recent call last):
File "/Users/gavin/Downloads/PycharmProjects/EasyNLP/EasyNLP/examples/quick_start/main.py", line 5, in
args = initialize_easynlp()
File "/Users/gavin/Downloads/PycharmProjects/EasyNLP/EasyNLP/easynlp/utils/initializer.py", line 34, in initialize_easynlp
set_global_variables(extra_args_provider=extra_args_provider,
File "/Users/gavin/Downloads/PycharmProjects/EasyNLP/EasyNLP/easynlp/utils/global_vars.py", line 144, in set_global_variables
args = _parse_args(extra_args_provider=extra_args_provider,
File "/Users/gavin/Downloads/PycharmProjects/EasyNLP/EasyNLP/easynlp/utils/global_vars.py", line 204, in _parse_args
_GLOBAL_ARGS = parse_args(extra_args_provider=extra_args_provider,
File "/Users/gavin/Downloads/PycharmProjects/EasyNLP/EasyNLP/easynlp/utils/arguments.py", line 70, in parse_args
assert args.tables is not None
AssertionError

Environment:
$ python -V
Python 3.8.9

$ pip list
Package Version


absl-py 1.0.0
beautifulsoup4 4.11.1
bs4 0.0.1
cachetools 5.0.0
certifi 2021.10.8
charset-normalizer 2.0.12
click 8.1.2
cycler 0.11.0
filelock 3.6.0
fonttools 4.33.2
google-auth 2.6.6
google-auth-oauthlib 0.4.6
grpcio 1.45.0
idna 3.3
importlib-metadata 4.5.0
jieba 0.42.1
joblib 1.1.0
kiwisolver 1.4.2
lxml 4.8.0
Markdown 3.3.6
matplotlib 3.5.1
nltk 3.7
numpy 1.22.3
oauthlib 3.2.0
opencv-python 4.5.5.64
packaging 21.3
pandas 1.4.2
pandas-datareader 0.10.0
Pillow 9.1.0
pip 22.0.4
protobuf 3.20.1
pyasn1 0.4.8
pyasn1-modules 0.2.8
pyparsing 3.0.8
python-dateutil 2.8.2
pytz 2022.1
regex 2022.4.24
requests 2.27.1
requests-oauthlib 1.3.1
rouge 1.0.1
rsa 4.8
sacremoses 0.0.49
scikit-learn 1.0.2
scipy 1.8.0
sentencepiece 0.1.96
setuptools 62.1.0
six 1.16.0
soupsieve 2.3.2.post1
tensorboard 2.9.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorboardX 2.5
threadpoolctl 3.1.0
tokenizers 0.10.1
torch 1.9.0
torchvision 0.12.0
tqdm 4.64.0
typing_extensions 4.2.0
urllib3 1.26.9
Werkzeug 2.1.2
wheel 0.33.1
zipp 3.8.0

Process finished with exit code 1

add web demo/model to Huggingface

Hi, would you be interested in adding EasyNLP to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community. Models/datasets/spaces (web demos) can be added to a user account or organization, similar to GitHub.

Example from other organizations:
Keras: https://huggingface.co/keras-io
Microsoft: https://huggingface.co/microsoft
Facebook: https://huggingface.co/facebook

Example spaces with repos:
github: https://github.com/salesforce/BLIP
Spaces: https://huggingface.co/spaces/salesforce/BLIP

github: https://github.com/facebookresearch/omnivore
Spaces: https://huggingface.co/spaces/akhaliq/omnivore

and here are guides for adding spaces/models/datasets to your org

How to add a Space: https://huggingface.co/blog/gradio-spaces
How to add models: https://huggingface.co/docs/hub/adding-a-model
How to upload a dataset: https://huggingface.co/docs/datasets/upload_dataset.html

Please let us know if you would be interested and if you have any questions, we can also help with the technical implementation.

RuntimeError: device type error

I copied the training command from the text match tutorial and modified it a little.

python -m torch.distributed.launch $DISTRIBUTED_ARGS main.py \
  --mode train \
  --worker_gpu=1 \
  --tables=train.csv,dev.csv \
  --input_schema=example_id:str:1,sent1:str:1,sent2:str:1,label:str:1,cate:str:1,score:str:1 \
  --first_sequence=sent1 \
  --second_sequence=sent2 \
  --label_name=label \
  --label_enumerate_values=0,1 \
  --checkpoint_dir=./text_match_two_tower_model_dir \
  --learning_rate=3e-5 \
  --epoch_num=1 \
  --random_seed=42 \
  --save_checkpoint_steps=100 \
  --sequence_length=128 \
  --train_batch_size=32 \
  --app_name=text_match \
  --user_defined_parameters='pretrain_model_name_or_path=hfl/chinese-roberta-wwm-ext two_tower=True loss_type=hinge_loss margin=0.45 gamma=32 embedding_size=256'

(The code in knowledge_language_understanding produces the same error.)

And here's the error stack in text_match.
[screenshot]

export CUDA_VISIBLE_DEVICES in the script has no effect

export CUDA_VISIBLE_DEVICES in the script has no effect, and os.environ["CUDA_VISIBLE_DEVICES"] = "1" in the main function does not work either; whatever value is set, the job always runs on GPU 0 by default.
I'm not sure whether this is a bug.

AttributeError when decoding ODPS records

Original Traceback (most recent call last):
File "/home/pai/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/home/pai/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/pai/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/apsara/TempRoot/Odps/new_retail_algo_20220418080449486g35fjmu2_0dcc86d9_5d4e_414d_a8d4_e33635b8ffde_AlgoTask_0_0/[email protected]#0/workspace/easynlp/appzoo/dataset.py", line 180, in getitem
row = "\t".join([t.decode("utf-8") for t in row[0]])
File "/apsara/TempRoot/Odps/new_retail_algo_20220418080449486g35fjmu2_0dcc86d9_5d4e_414d_a8d4_e33635b8ffde_AlgoTask_0_0/[email protected]#0/workspace/easynlp/appzoo/dataset.py", line 180, in
row = "\t".join([t.decode("utf-8") for t in row[0]])
AttributeError: 'int' object has no attribute 'decode'

_common_io.UserException: table/table_buffer.cpp(93): UserException: Read table time out!

Following the reference script quick_start_user_defined/run_user_defined_pai.sh, with ODPS tables as input and the model output to OSS.
If evaluator = None in main.py is replaced with
evaluator = get_application_evaluator(app_name=args.app_name, valid_dataset=valid_dataset, user_defined_parameters=user_defined_parameters, eval_batch_size=args.micro_batch_size)
the error _common_io.UserException: table/table_buffer.cpp(93): UserException: Read table time out! occurs. After running it many times, the error always appears after roughly 20 minutes of normal training.

If evaluator=None is kept, training proceeds normally.

Failed to run example script for CP-Tuning: type object got multiple values for keyword argument 'user_defined_parameters'

I'm using pai-easynlp 0.0.3 installed by pip.

I copied the training command from here and modified it a little.

echo '=========[ Fewshot Training: CP-Tuning on Text Classification ]========='
easynlp \
    --app_name=text_classify \
    --mode=train \
    --worker_count=1 \
    --worker_gpu=1 \
    --tables=data/fewshot_train.tsv,data/fewshot_dev.tsv \
    --input_schema=sid:str:1,sent1:str:1,sent2:str:1,label:str:1 \
    --first_sequence=sent1 \
    --second_sequence=sent2 \
    --label_name=label \
    --label_enumerate_values=0,1 \
    --checkpoint_dir=./fewshot_model/ \
    --learning_rate=1e-5 \
    --epoch_num=1 \
    --random_seed=42 \
    --save_checkpoint_steps=100 \
    --sequence_length=512 \
    --micro_batch_size=8 \
    --user_defined_parameters="
        pretrain_model_name_or_path=/path/to/pretrained/model
        enable_fewshot=True
        type=cpt_fewshot
        pattern=sent1,label,用,sent2,概括。
    "

And here's the error stack.
[screenshot]
