GithubHelp home page GithubHelp logo

yehjames / gpt2-ml Goto Github PK

View Code? Open in Web Editor NEW

This project forked from imcaspar/gpt2-ml

0.0 1.0 0.0 928 KB

GPT2 for Multiple Languages, including pretrained models. GPT2 多语言支持, 15亿参数中文预训练模型

License: Apache License 2.0

Python 93.97% Shell 1.68% Jupyter Notebook 2.43% Perl 1.92%

gpt2-ml's Introduction

gpt2-ml

GPT2 for Multiple Languages

Open In Colab GitHub GitHub All Releases contributions welcome GitHub stars Twitter Follow

中文说明 | English

  • Simplified GPT2 train scripts(based on Grover, supporting TPUs)
  • Ported bert tokenizer,multilingual corpus compatible
  • 1.5B GPT2 pretrained Chinese model ( ~15G corpus, 10w steps )
  • Batteries-included Colab demo #
  • 1.5B GPT2 pretrained Chinese model ( ~50G corpus, 100w steps )

Pretrained Model

1.5B GPT2 pretrained Chinese model [Google Drive]

SHA256: 4a6e5124df8db7ac2bdd902e6191b807a6983a7f5d09fb10ce011f9a073b183e

Corpus from THUCNews and nlp_chinese_corpus

Using Cloud TPU Pod v3-256 to train 10w steps

loss

Google Colab

With just 2 clicks (not including Colab auth process), the 1.5B pretrained Chinese model demo is ready to go:

[Colab Notebook]

Train

Disclaimer

The contents in this repository are for academic research purpose, and we do not provide any conclusive remarks.

Citing

@misc{GPT2-ML,
  author = {Zhibo Zhang},
  title = {GPT2-ML: GPT-2 for Multiple Languages},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/imcaspar/gpt2-ml}},
}

Reference

https://github.com/google-research/bert

https://github.com/rowanz/grover

Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC)

gpt2-ml's People

Contributors

erjanmx avatar imcaspar avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.