didi / chinesenlp



Datasets and SOTA results for every field of Chinese NLP

Home Page: https://chinesenlp.xyz


nlp chinese-nlp machine-translation chinese-word-segmentation entity-linking question-answering nlp-tasks

chinesenlp's Introduction

Chinese NLP

Shared tasks, datasets and state-of-the-art results for Chinese Natural Language Processing (NLP)

Table of Tasks

Besides Chinese NLP

To track more progress in Natural Language Processing (NLP) in English and other languages, you can check NLP-progress, which includes the datasets and the current state-of-the-art for the most common NLP tasks.

What's new!

We have a new shared task at this year's WMT titled "Triangular MT: Using English to improve Russian-to-Chinese machine translation". We encourage you to participate.

More details on the shared task are available at http://www.statmt.org/wmt21/triangular-mt-task.html

Please pass on the pointer to any of your colleagues and friends who might be interested in participating.

Important Dates

Apr 5, 2021: Release of training and development resources
Apr 5, 2021: Release of the baseline system
Jul 12, 2021: Release of test data
Jul 19, 2021: Official submissions due by web upload
Jul 20, 2021: Release of the official results
Aug 5, 2021: System description paper due
Sep 5, 2021: Review feedback
Sep 15, 2021: Camera-ready papers due
Nov 10-11, 2021: Workshop

Contribute

Want to contribute? Please follow the Instructions

Contact

Suggestions? Changes? Please send email to [email protected]

Note

This project was initiated and is actively maintained by the DiDi NLP team under DiDi AI Labs.


chinesenlp's Issues

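Hello, could you share the original LCSTS2.0 dataset?

Hello, could you share the original LCSTS2.0 dataset? I filled out the application as the official instructions describe but never received a reply, and the CSDN link is dead as well. Many thanks. [email protected]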

Is there anything on machine reading comprehension?

As the title says; please add it.

Source code

How can I access the source code for Relation Extraction? I could not find it in this repository.

Have you collected any material on aspect-based sentiment analysis?

Is there a demo web page to try out?

As the title says.

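A question about co-reference_resolution.md

Is the input and output of your coreference resolution implementation based on the paper "Deep Reinforcement Learning for Mention-Ranking Coreference Models"? Could you provide the source code?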

NLG

Why is there nothing on machine-generated creative text, such as AI poetry and AI writing?

Are there any Chinese multi-label datasets?

About Chinese summarization

I saw the information you published here, and I think integrating abstractive summarization into the library is a great idea.
https://chinesenlp.xyz/docs/text_summarization.html

A year ago, during my master's studies, I researched Chinese summarization (the paper is linked at the end of this post).

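At the time, I found a flaw in the LCSTS dataset: a large part of the training set is duplicated in the test set.
I contacted the dataset's author, who later released LCSTS2.0.
Duplicates still remained, however, so we proposed LCSTS2.0-clean.
We also published a hybrid-word-character method. It reaches a ROUGE score of nearly 60 on the original version of LCSTS; it does not do as well on LCSTS2.0-clean, but it still outperforms the other models.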

This shows two things:

The LCSTS2.0 dataset must be used so that models can be compared fairly.
The hybrid-word-character method is indeed very useful.
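
The kind of train/test overlap described in this post can be checked with a few lines of Python. The sketch below is only an illustration and is not the procedure used to build LCSTS2.0-clean: it assumes each example is a (source text, summary) pair, treats two source texts as duplicates when they match exactly after whitespace is removed, and uses made-up toy data and hypothetical helper names (`normalize`, `find_overlap`, `remove_overlap`).

```python
def normalize(text: str) -> str:
    """Remove all whitespace so trivially different copies compare equal."""
    return "".join(text.split())


def find_overlap(train, test):
    """Return the test pairs whose normalized source text also occurs in train."""
    seen = {normalize(src) for src, _ in train}
    return [(src, summ) for src, summ in test if normalize(src) in seen]


def remove_overlap(train, test):
    """Return the train pairs whose normalized source text does not occur in test."""
    held_out = {normalize(src) for src, _ in test}
    return [(src, summ) for src, summ in train if normalize(src) not in held_out]


# Toy (source, summary) pairs, invented for illustration only.
train = [("今天 天气 很好", "天气好"), ("股市大涨", "股市上涨")]
test = [("今天天气很好", "好天气"), ("新产品发布", "发布")]

leaked = find_overlap(train, test)
clean_train = remove_overlap(train, test)
print(len(leaked), len(clean_train))
```

Exact matching only catches verbatim copies; near-duplicates would need a fuzzier comparison, which is outside the scope of this sketch.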

This is my paper:
https://arxiv.org/abs/1802.09968

It is great to see that DiDi has open-sourced so many projects.
I hope to discuss these topics with everyone here. Thank you.

Could a binary text-similarity classification task be added later?

Is it possible to identify gender?

That is, from the speaker's language style.