ccnudhj / kevinpro-nlp-demo

This project is forked from ricardokevins/kevinpro-nlp-demo

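Personal NLP practice demos, including PyTorch implementations of text classification, a dialogue bot, Transformer, GPT, graph neural networks, adversarial training, summary extraction, knowledge distillation, variational autoencoders, and more.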

Python 97.65% Jupyter Notebook 2.35%

kevinpro-nlp-demo's Introduction

NLP

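Personal NLP practice demos: simple PyTorch implementations of fun NLP algorithms, actively updated and maintained. Some of the code comes from other open-source projects (it will be removed on request if it infringes). Stars, forks, and PRs are welcome.

If you have problems, please open an Issue and I will reply.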

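Main contents (see the README inside each project for details)

  1. Text classification: BiLSTM, Transformer
  2. Summary generation: Pointer Generator Network
  3. Dialogue and translation: Seq2Seq
  4. GNN applied to text classification
  5. Transformer Masked Language Model pretraining
  6. GPT text continuation and GPT solving math problems (borrowed, haha)
  7. Other NLP training tricks in practice, such as adversarial learning
  8. Added two Transformer implementations by two experts, with sources credited in the code (beautifully implemented and very helpful for understanding)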

Other reference projects

  1. BERT relation extraction: Ricardokevins/Bert-In-Relation-Extraction: relation extraction between entities using BERT (github.com)
  2. Text semantic matching: Ricardokevins/Text_Matching: NLP 2020 ZTE Pengyue sentence similarity matching (github.com)
  3. Transformer implementation and other components: Ricardokevins/EasyTransformer: Quick start with strong baseline of Bert and Transformer without pretrain (github.com)

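Update log

2021.9.29

  1. Added a random digit-string recovery demo to Transformer. It is very friendly for newcomers trying to understand the Transformer and needs no external data: it trains on randomly constructed digit strings.
  2. Added the experimental TransfomerVAE. It currently has bugs and is under construction.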
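
To give a feel for how such a demo can work without external data, here is a minimal sketch of building training pairs from random digit strings. It assumes that "recovery" means restoring digits hidden by a placeholder; the `MASK` symbol, the function names, and the corruption scheme are hypothetical and not taken from the repository.

```python
import random

MASK = "_"  # hypothetical placeholder symbol, not taken from the repository


def make_pair(length, mask_prob, rng):
    """Return (corrupted, original) digit strings of equal length."""
    original = [str(rng.randrange(10)) for _ in range(length)]
    # Hide each digit independently with probability mask_prob.
    corrupted = [MASK if rng.random() < mask_prob else d for d in original]
    return "".join(corrupted), "".join(original)


def make_dataset(n, length=10, mask_prob=0.3, seed=0):
    """Build n (source, target) pairs from a seeded generator."""
    rng = random.Random(seed)
    return [make_pair(length, mask_prob, rng) for _ in range(n)]


if __name__ == "__main__":
    for src, tgt in make_dataset(3):
        print(src, "->", tgt)
```

A model would then take the corrupted string as input and be trained to output the original one, so the dataset can be regenerated at any size.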

2021.1.23

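  1. Initial commit: added the sentence classification module, including Transformer, BiLSTM, and BiLSTM+Attn models
  2. Uploaded a basic dataset, with binary sentence classification as the demo example
  3. Added and applied the adversarial learning approach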

2021.5.1

  1. Reorganized and updated a lot of things (details omitted)

2021.6.22

  1. Fixed some organization issues in Text Classification
  2. Added usage instructions for Text Classification

2021.7.2

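  1. Added an MLM pretraining practice
  2. Fixed the excessively large and unnecessary Word Embed in the sentence classification models (out of laziness, only the Transformer one was changed)
  3. Added an option to load pretrained weights in sentence classification
  4. Fixed some bugs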

2021.7.11

  1. Added applications of GNN in NLP
  2. Implemented GNN for text classification
  3. Results are poor; for now the data processing is the suspected cause

2021.7.29

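  1. Added CHI+TFIDF, a traditional machine learning approach, for text classification
  2. Implemented the algorithm and tested its performance
  3. Updated the README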
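
For readers unfamiliar with the CHI+TFIDF combination, the sketch below shows one common way to pair them: chi-square scores pick the most class-discriminative terms, and TF-IDF weights those terms per document. It is an illustrative pure-Python version under stated assumptions (pre-tokenized documents, relative term frequency, unsmoothed `log(N/df)`); the function names are hypothetical and the repository's implementation may differ.

```python
import math
from collections import Counter


def chi_square(docs, labels, term, cls):
    """Chi-square statistic between a term and a class (2x2 document table)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        if label == cls:
            a += has_term          # in class, contains term
            c += not has_term      # in class, lacks term
        else:
            b += has_term          # other class, contains term
            d += not has_term      # other class, lacks term
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def select_terms(docs, labels, k):
    """Keep the k terms with the highest chi-square score over any class."""
    vocab = sorted({t for tokens in docs for t in tokens})
    classes = set(labels)
    score = {t: max(chi_square(docs, labels, t, c) for c in classes) for t in vocab}
    return sorted(vocab, key=lambda t: -score[t])[:k]


def tfidf_vectors(docs, terms):
    """TF-IDF features restricted to the selected terms."""
    n = len(docs)
    df = Counter(t for tokens in docs for t in set(tokens))
    idf = {t: math.log(n / df[t]) for t in terms}
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        vectors.append([counts[t] / total * idf[t] for t in terms])
    return vectors


if __name__ == "__main__":
    docs = [["good", "movie"], ["great", "movie"], ["bad", "movie"], ["awful", "movie"]]
    labels = [1, 1, 0, 0]
    terms = select_terms(docs, labels, k=4)
    print(terms)
    print(tfidf_vectors(docs, terms)[0])
```

The resulting vectors can be fed to any classical classifier. In the toy data, "movie" appears in every document, so it carries no class signal and is dropped by the selection step.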

2021.8.2

  1. Refactored the dialogue bot model into the Seq2Seq folder
  2. Implemented BeamSearch decoding
  3. Fixed a BeamSearch bug in PGN
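
As a rough illustration of beam search decoding, here is a minimal framework-free sketch. It assumes a hypothetical `step_fn(prefix)` that returns a dict of next-token log-probabilities, and it omits length normalization and batching; it is not the code in the Seq2Seq or PGN folders.

```python
import math


def beam_search(step_fn, bos, eos, beam_size=3, max_len=10):
    """Return the (sequence, log-prob) hypothesis with the highest score.

    step_fn(prefix) -> {token: log_prob} is a hypothetical model interface.
    """
    beams = [([bos], 0.0)]
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for seq, score in beams:
            for tok, logp in step_fn(seq).items():
                candidates.append((seq + [tok], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        # Keep the top beam_size; move those ending in eos to finished.
        beams = []
        for seq, score in candidates[:beam_size]:
            if seq[-1] == eos:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was hit
    return max(finished, key=lambda c: c[1])


def toy_step(prefix):
    """Toy next-token distribution where the greedy first choice is suboptimal."""
    table = {
        "<s>": {"a": 0.6, "b": 0.4},
        "a": {"x": 0.5, "y": 0.5},
        "b": {"z": 0.9, "w": 0.1},
    }
    probs = table.get(prefix[-1], {"</s>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


if __name__ == "__main__":
    print(beam_search(toy_step, "<s>", "</s>", beam_size=1))  # greedy decoding
    print(beam_search(toy_step, "<s>", "</s>", beam_size=2))
```

With `beam_size=1` the search is greedy and commits to "a" (total probability 0.3), while `beam_size=2` keeps "b" alive long enough to find the better sequence through "z" (total probability 0.36).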

2021.9.11

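  1. Added GPT for text continuation and for solving math problems (borrowed from karpathy/minGPT: A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training (github.com); the code is very well implemented and very helpful for understanding GPT, so I borrowed it to see whether it can be used for something fun)
  2. Refactored the Pointer Generator Network. Its earlier performance was consistently poor, so I decided to simply rebuild it and go through it again line by line, which feels much more reassuring. Work in progress.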

2021.9.16

  1. Fixed the problem in Pretrain where Mask Tokens were misaligned and positions were inconsistent
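
To make the alignment requirement concrete, the sketch below builds MLM inputs and labels in a single pass so that every masked position in the input carries its original token at the same index in the labels. It is a simplified illustration, not the repository's fix: `MASK_ID` is a hypothetical token id, masked tokens are always replaced by the mask token (no 80/10/10 split as in BERT), and `-100` is used because it is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
import random

MASK_ID = 0     # hypothetical id of the [MASK] token
IGNORE = -100   # label for positions that should not contribute to the loss


def mask_tokens(token_ids, mask_prob=0.15, rng=None):
    """Return (inputs, labels) with masked positions aligned index by index."""
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in token_ids:
        if rng.random() < mask_prob:
            inputs.append(MASK_ID)   # model sees the mask token here
            labels.append(tok)       # and must predict the original token
        else:
            inputs.append(tok)       # token is left unchanged
            labels.append(IGNORE)    # and is skipped by the loss
    return inputs, labels


if __name__ == "__main__":
    ids = list(range(1, 11))
    inputs, labels = mask_tokens(ids, mask_prob=0.3, rng=random.Random(0))
    print(inputs)
    print(labels)
```

Building both lists in the same loop makes it impossible for the mask positions and the label positions to drift apart.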

kevinpro-nlp-demo's People

Contributors

ricardokevins
