didi / chinesenlp

Datasets and SOTA results for every field of Chinese NLP

Home Page: https://chinesenlp.xyz

nlp chinese-nlp machine-translation chinese-word-segmentation entity-linking question-answering nlp-tasks

chinesenlp's Introduction

Chinese NLP

Shared tasks, datasets and state-of-the-art results for Chinese Natural Language Processing (NLP)

Table of Tasks

Besides Chinese NLP

To track more progress in Natural Language Processing (NLP) in English and other languages, you can check NLP-progress, which includes the datasets and the current state-of-the-art for the most common NLP tasks.

What's new!

We have a new shared task at this year's WMT titled "Triangular MT: Using English to improve Russian-to-Chinese machine translation". We would encourage you to participate in this shared task.

More details on the shared task are available at http://www.statmt.org/wmt21/triangular-mt-task.html

Please pass on the pointer to any of your colleagues and friends who might be interested in participating.

Important Dates

  • Apr 5, 2021: Release of training and development resources
  • Apr 5, 2021: Release of the baseline system
  • Jul 12, 2021: Release of test data
  • Jul 19, 2021: Official submissions due by web upload
  • Jul 20, 2021: Release of the official results
  • Aug 5, 2021: System description paper due
  • Sep 5, 2021: Review feedback
  • Sep 15, 2021: Camera-ready papers due
  • Nov 10-11, 2021: Workshop

Contribute

Want to contribute? Please follow the Instructions

Contact

Suggestions? Changes? Please send email to [email protected]

Note

This project was initiated and is actively maintained by the DiDi NLP team under DiDi AI Labs.

chinesenlp's People

Contributors

ajaynagesh, amittai, c2huc2hu, chinesenlpxyz, huang17, huyp182, kevincrawfordknight, lowinli, ryskina, scotfang, shixing, wbtiger

chinesenlp's Issues

Source code

How can I access the source code for Relation Extraction? I could not find it in this repository.

Question about co-reference_resolution.md

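Do the input and output of your coreference resolution implementation follow the paper "Deep Reinforcement Learning for Mention-Ranking Coreference Models"? Would you be able to share the source code?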

NLG

Why are there no machine-generation tasks, such as AI poetry or AI writing?

About Chinese summarization

I saw what you published here, including the integration of abstractive summarization into the library, and I think it is great:
https://chinesenlp.xyz/docs/text_summarization.html

A year ago, during my master's studies, I worked on Chinese summarization and wrote a paper about it.

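At the time, I found a flaw in the LCSTS dataset: a large portion of the training set and the test set overlap.
I contacted the dataset's author, who later released LCSTS2.0.
It still contained duplicates, so we proposed LCSTS2.0-clean.
We also published a hybrid-word-character method, which reaches a ROUGE score of nearly 60 on the original version of LCSTS. It does not score as well on LCSTS2.0-clean, but it still outperforms the other models.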

This shows two things:

  1. Models must be evaluated on LCSTS2.0 for the comparison to be fair.
  2. The hybrid-word-character method is indeed very useful.

This is my paper:
https://arxiv.org/abs/1802.09968

I see that DiDi has open-sourced a lot of projects, which is great.
I hope to discuss this with everyone here. Thank you.
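
The train/test duplication described in this issue can be checked with a short script. The following is a minimal sketch, not the procedure used in the paper: it assumes each example is a hypothetical (source_text, summary) pair and treats two examples as duplicates when their source texts are identical after whitespace is removed. The function names and the toy data are illustrative only, and parsing of the real LCSTS files is not shown.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical example type: (source_text, summary).
Pair = Tuple[str, str]


def normalize(text: str) -> str:
    """Remove all whitespace so trivially different copies compare equal."""
    return "".join(text.split())


def find_overlap(train: Iterable[Pair], test: Iterable[Pair]) -> List[Pair]:
    """Return the test pairs whose source text also appears in the training set."""
    seen: Set[str] = {normalize(src) for src, _ in train}
    return [(src, summ) for src, summ in test if normalize(src) in seen]


def remove_overlap(train: Iterable[Pair], test: Iterable[Pair]) -> List[Pair]:
    """Return the training pairs whose source text does not appear in the test set."""
    held_out: Set[str] = {normalize(src) for src, _ in test}
    return [(src, summ) for src, summ in train if normalize(src) not in held_out]


if __name__ == "__main__":
    # Toy data, assumed for illustration only.
    train_pairs = [("今天 天气 很好", "天气好"), ("股市上涨", "股市")]
    test_pairs = [("今天天气很好", "好天气"), ("新手机发布", "手机")]

    leaked = find_overlap(train_pairs, test_pairs)
    cleaned = remove_overlap(train_pairs, test_pairs)
    print(f"test examples also in train: {len(leaked)} of {len(test_pairs)}")
    print(f"train examples kept after cleaning: {len(cleaned)} of {len(train_pairs)}")
```

Exact matching only catches verbatim copies. Near-duplicates (for example, the same article with a different summary or minor edits) would need a fuzzier comparison, which this sketch does not attempt.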
