
cyberzhg / keras-gpt-2

Load GPT-2 checkpoint and generate texts

Home Page: https://pypi.org/project/keras-gpt-2/

License: MIT License

Languages: Python 98.64%, Shell 1.36%
Topics: keras, gpt-2, nlp, language-model

keras-gpt-2's Introduction

Keras GPT-2


[δΈ­ζ–‡|English]

Load pretrained weights and predict with GPT-2.

Install

pip install keras-gpt-2

Demo

import os
from keras_gpt_2 import load_trained_model_from_checkpoint, get_bpe_from_files, generate


model_folder = 'xxx/yyy/117M'
config_path = os.path.join(model_folder, 'hparams.json')
checkpoint_path = os.path.join(model_folder, 'model.ckpt')
encoder_path = os.path.join(model_folder, 'encoder.json')
vocab_path = os.path.join(model_folder, 'vocab.bpe')


print('Load model from checkpoint...')
model = load_trained_model_from_checkpoint(config_path, checkpoint_path)
print('Load BPE from files...')
bpe = get_bpe_from_files(encoder_path, vocab_path)
print('Generate text...')
output = generate(model, bpe, ['From the day forth, my arm'], length=20, top_k=1)

# If you are using the 117M model and top_k equals 1, the result will be:
# "From the day forth, my arm was broken, and I was in a state of pain. I was in a state of pain,"
print(output[0])

keras-gpt-2's People

Contributors

cedspam, cyberzhg



keras-gpt-2's Issues

Nucleus (a.k.a. top-p) sampling

Is your feature request related to a problem? Please describe.
Better text generation, as described in:

@misc{holtzman2020curious,
title={The Curious Case of Neural Text Degeneration},
author={Ari Holtzman and Jan Buys and Li Du and Maxwell Forbes and Yejin Choi},
year={2020},
eprint={1904.09751},
archivePrefix={arXiv},
primaryClass={cs.CL}
}

Describe the solution you'd like
implementation from nshepperd/gpt-2@87fe3d7
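The requested technique can be sketched independently of the linked implementation. Below is a minimal NumPy version of nucleus (top-p) sampling over a 1-D array of next-token logits; the function name and defaults are illustrative, not part of keras-gpt-2:

```python
import numpy as np

def nucleus_sample(logits, top_p=0.9, rng=None):
    """Sample a token id from the smallest set of tokens whose
    cumulative probability exceeds top_p (Holtzman et al., 2020)."""
    if rng is None:
        rng = np.random.default_rng()
    # Softmax over the logits (shifted for numerical stability).
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Sort descending and find the smallest nucleus covering top_p.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1
    nucleus = order[:cutoff]
    # Renormalise within the nucleus and sample.
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))
```

With `top_p=1.0` this degenerates to ordinary sampling; with a very peaked distribution and a small `top_p`, it becomes greedy.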

About special token implementation

Is there a way to add special tokens, like in the BERT model for classification? For example, using a dedicated padding or separator token instead of padding with 0s at the end of the text? The token with ID 0 in the GPT-2 vocabulary means "!", not padding.
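GPT-2's released vocabulary has no dedicated [PAD]/[SEP] tokens; a common workaround is to reuse `<|endoftext|>` (ID 50256 in the released vocabulary) as padding and carry a mask alongside so padded positions can be ignored downstream. A sketch with illustrative helper names, not part of keras-gpt-2:

```python
# Reuse <|endoftext|> as the padding token, since GPT-2 ships no
# dedicated pad token and ID 0 decodes to "!".
END_OF_TEXT = 50256

def pad_batch(sequences, pad_id=END_OF_TEXT):
    """Right-pad token-id sequences to a common length and return
    a parallel 0/1 mask marking the real (non-padding) positions."""
    max_len = max(len(seq) for seq in sequences)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return padded, mask
```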

DistilGPT2

Hi

You guys do fantastic work. I am using your libraries, mostly via Kashgari.

I've been looking around for more models to try out, hoping to reduce model size. Hugging Face has done work on distilling models. I've been trying to load the weights from their model, but couldn't match them up.

import h5py
import numpy as np

def get_weights(weight_file_path):
    """Print the structure of a Keras HDF5 weight file and return
    its weights as {layer: {weight_name: array}}."""
    weights = {}
    with h5py.File(weight_file_path, 'r') as f:
        if len(f.attrs):
            print("{} contains:".format(weight_file_path))
            print("Root attributes:")
            for key, value in f.attrs.items():
                print("  {}: {}".format(key, value))
        for layer, g in f.items():
            print("  {}".format(layer))
            print("    Attributes:")
            for key, value in g.attrs.items():
                print("      {}: {}".format(key, value))
            # Keras stores the ordered weight names in this group attribute.
            names = [n.decode() if isinstance(n, bytes) else n
                     for n in g.attrs.get('weight_names', [])]
            weights[layer] = {name: np.array(g[name]) for name in names}
    return weights

I set up my model like this:

config = {'n_ctx': 1024, 'n_embd': 768, 'n_head': 12, 'n_layer': 6, 'n_vocab': 50257}

model = get_model(
    **config,
    batch_size=None,
    fixed_input_shape=True,
)

... but I really struggle to match up the weights.

The first layers match up, but then I get this size mismatch with the Encode-0-MultiHeadAtt layer:

shape new weights: [(768, 2304), (1, 2304), (768, 768), (1, 768), (768,), (768,), (768, 3072), (1, 3072)]
shape old weights: [(768, 768), (768,), (768, 768), (768,), (768, 768), (768,), (768, 768), (768,)]

Is there something you could suggest to fix this?
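The (768, 2304) tensor looks like a fused QKV projection (Hugging Face's GPT-2 stores query, key, and value in one `c_attn` weight), whereas keras-gpt-2's multi-head attention expects separate (768, 768) matrices. Assuming the fused layout is [Q | K | V] along the last axis, the split could be sketched as:

```python
import numpy as np

def split_qkv(fused_kernel, fused_bias, n_embd=768):
    """Split a fused (n_embd, 3*n_embd) attention kernel and its
    (3*n_embd,) bias into separate (Q, K, V) kernel/bias pairs."""
    assert fused_kernel.shape == (n_embd, 3 * n_embd)
    kernels = np.split(fused_kernel, 3, axis=-1)   # three (n_embd, n_embd)
    biases = np.split(fused_bias.reshape(-1), 3)   # three (n_embd,)
    return list(zip(kernels, biases))
```

The same idea applies to the (768, 3072) feed-forward weight, which is simply the MLP expansion layer and needs no splitting, only transposition checks against the Keras layout.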

Here's the code to their (Distil)GPT2.

Cheers!
Ben.

Could you write a demo on how to run this model on a TPU?

Is your feature request related to a problem? Please describe.
I've been trying to train/predict with GPT-2 on Colab with TPU support, and I've met only hurdles. I see the model loader has a parameter for supporting TPUs, but Colab doesn't like a pure Keras model. I've tried converting it to tf.keras or to a TF Estimator, but neither worked for me.

Training

Is your feature request related to a problem? Please describe.
There is no documentation about training.

Describe the solution you'd like
A simple script/test or doc to train on my dataset.

Describe alternatives you've considered
N/A

Additional context
If I know how to train the model, I will be able to prepare the dataset accordingly.
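keras-gpt-2 documents no training entry point, but since `load_trained_model_from_checkpoint` returns a Keras model, a plausible first step is turning a BPE-encoded token stream into fixed-length (input, target) pairs for next-token prediction. A sketch with hypothetical names, not an official recipe:

```python
def make_lm_pairs(token_ids, seq_len):
    """Slice a long token-id stream into (input, target) training
    pairs where the target is the input shifted left by one token."""
    inputs, targets = [], []
    # Step by seq_len so windows don't overlap; each window needs
    # seq_len + 1 tokens to provide the shifted target.
    for start in range(0, len(token_ids) - seq_len, seq_len):
        window = token_ids[start:start + seq_len + 1]
        inputs.append(window[:-1])
        targets.append(window[1:])
    return inputs, targets
```

The resulting pairs would then feed a standard Keras `fit` loop with a sparse categorical cross-entropy loss over the vocabulary.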

Incompatible dependency warning when used with keras-bert

Describe the Bug

pip install reports an ERROR when keras-bert is used alongside keras-gpt-2:

ERROR: keras-gpt-2 0.10.0 has requirement keras-transformer==0.25.0, but you'll have keras-transformer 0.23.0 which is incompatible.

Version Info

keras-bert==0.57.0
keras-gpt-2==0.10.0
