ak9250 / gpt-2-colab
retrain gpt-2 in colab
Hello,
How much memory is required to run this model? I get out-of-memory errors on a graphics card with 6 GB of memory.
Is that simply not enough memory, or is it a problem with my local setup?
Thanks,
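A rough back-of-the-envelope estimate may help here. The sketch below assumes fp32 values and an Adam-style optimizer that keeps two extra buffers per parameter. The function names are made up for this illustration, the 345M parameter count is taken from the model name used elsewhere in these issues, and activation memory (which grows with batch size and sequence length) is not included.

```python
def inference_memory_gib(n_params, bytes_per_param=4):
    """Memory for the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


def training_memory_gib(n_params, bytes_per_param=4, optimizer_buffers=2):
    """Lower bound for training: weights + gradients + optimizer buffers, in GiB."""
    copies = 1 + 1 + optimizer_buffers  # weights, gradients, optimizer state
    return n_params * bytes_per_param * copies / 2**30


if __name__ == "__main__":
    n_params = 345_000_000  # assumption: roughly the size of the '345M' model
    print(f"weights only: {inference_memory_gib(n_params):.2f} GiB")
    print(f"training lower bound: {training_memory_gib(n_params):.2f} GiB")
```

Under these assumptions a model of about 345M parameters needs more than 5 GiB for training before any activations are counted, so running out of memory on a 6 GB card during fine-tuning would not be surprising.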
As I understand it, GPT and GPT-2 are trained with a language-modeling loss and generate text from that model; they do not involve a GAN.
So which is better: GPT or RelGAN/LeakGAN/SeqGAN/TextGAN?
I am quite confused about this question. Thank you very much.
Since toposort has recently been added to the requirements.txt of the nshepperd project, we can safely remove the line used to install it in the Colab notebook.
Works great, thanks.
I was just confused by the missing % before the cd commands (brand new to Colab).
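For anyone else hitting this: in Colab, a line starting with ! runs in a temporary subshell, so !cd does not change the notebook's working directory, while %cd does. The plain-Python sketch below illustrates the same effect outside a notebook; the function name is made up for this illustration.

```python
import os
import subprocess
import tempfile


def compare_cd_styles(target_dir):
    """Return (cwd after a subshell `cd`, cwd after os.chdir), then restore the cwd."""
    start = os.getcwd()
    try:
        # Like `!cd target_dir`: the directory change happens in a child
        # process and is lost as soon as that process exits.
        subprocess.run(f'cd "{target_dir}"', shell=True, check=True)
        after_subshell = os.getcwd()

        # Like `%cd target_dir`: the current process itself changes directory.
        os.chdir(target_dir)
        after_chdir = os.getcwd()
    finally:
        os.chdir(start)
    return after_subshell, after_chdir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(compare_cd_styles(tmp))
```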
I am new to TensorFlow, but something seems to have changed. I get this when I try to train:
Traceback (most recent call last):
  File "./train.py", line 14, in <module>
    import model, sample, encoder
  File "/content/gpt-2/src/model.py", line 3, in <module>
    from tensorflow.contrib.training import HParams
ModuleNotFoundError: No module named 'tensorflow.contrib'
I even tried %tensorflow_version 1.x
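The tensorflow.contrib module was removed in TensorFlow 2.0, so this error usually means a 2.x runtime is active. A small check like the one below, run before train.py, can confirm which case applies. The helper name is hypothetical, and it only inspects a version string.

```python
def has_tf_contrib(version):
    """Return True if a TensorFlow version string is from the 1.x line, which shipped tf.contrib."""
    major = int(version.split(".")[0])
    return major < 2


if __name__ == "__main__":
    for v in ("1.15.2", "2.4.1"):
        print(v, has_tf_contrib(v))
```

In a notebook it could be called as has_tf_contrib(tf.__version__) after import tensorflow as tf.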
First, I'm not sure whether the model uses an encoder during training.
EOS means end-of-sentence. The encoder and decoder are parts of the transformer network.
Without an encoder, at training time:
target: [E, F, G, H, EOS]
decoder input: [0, E, F, G, H]
Without an encoder, at test time:
decoder input: [0]
With an encoder, at training time:
encoder input: [A, B, C, D]
target: [E, F, G, H, EOS]
decoder input: [0, E, F, G, H]
With an encoder, at test time:
encoder input: [A, B, C, D]
decoder input: [0]
Is my understanding correct?
I know this is beyond the scope of this project, but I hope you can help.
Thank you very much.
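The case without an encoder can be sketched in a few lines. GPT-2 is a decoder-only transformer, so that is the case that applies to it. The code below only illustrates the shifted input/target layout from the post: the start token 0 and the EOS marker follow the post's notation, and next_token is a stand-in for a trained model, not anything from this repository.

```python
START = 0      # start token, as written in the post
EOS = "EOS"    # end-of-sentence marker


def make_training_pair(tokens):
    """Build (decoder_input, target): the input is the target shifted right by one."""
    target = list(tokens) + [EOS]
    decoder_input = [START] + list(tokens)
    return decoder_input, target


def generate(next_token, max_len=20):
    """Test time: start from [START] and feed each prediction back in until EOS."""
    decoder_input = [START]
    output = []
    for _ in range(max_len):
        token = next_token(decoder_input)
        if token == EOS:
            break
        output.append(token)
        decoder_input.append(token)
    return output


if __name__ == "__main__":
    print(make_training_pair(["E", "F", "G", "H"]))

    # Toy "model": looks up the next token from the last one seen.
    table = {START: "E", "E": "F", "F": "G", "G": "H", "H": EOS}
    print(generate(lambda seq: table[seq[-1]]))
```

At each position the model is trained to predict target[i] from decoder_input[0..i], which is why the two lists have the same length and differ by a one-step shift.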
Hi,
I'm quite new to this. I put this on my own server and let it train on my own text data.
Once the average loss got close to 0.0, it started reproducing exact paragraphs from my training data.
Did I overtrain it, or did I set some flags wrong?
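A training loss close to 0.0 together with verbatim reproduction is what memorization (overfitting) typically looks like. One common way to catch it, sketched below under the assumption that part of the text is held out as a validation set, is to stop once the validation loss no longer improves. The function name and the patience value are illustrative and not part of train.py.

```python
def should_stop(val_losses, patience=3):
    """Return True if none of the last `patience` validation losses beat the earlier best."""
    if len(val_losses) <= patience:
        return False  # not enough history to decide yet
    best_before = min(val_losses[:-patience])
    return min(val_losses[-patience:]) >= best_before


if __name__ == "__main__":
    history = [3.0, 2.5, 2.2, 2.3, 2.4, 2.5]  # made-up validation losses
    print(should_stop(history))
```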
I have trained a model on my data. How do I now generate predictions from it by giving it some input text?
I'm getting an error while training in this cell:
!PYTHONPATH=src ./train.py --dataset /content/gpt-2/goblet_book.txt --model_name '345M'
Traceback (most recent call last):
  File "./train.py", line 266, in <module>
    main()
  File "./train.py", line 244, in main
    feed_dict={context: sample_batch()})
  File "./train.py", line 220, in sample_batch
    return [data_sampler.sample(1024) for _ in range(args.batch_size)]
  File "./train.py", line 220, in <listcomp>
    return [data_sampler.sample(1024) for _ in range(args.batch_size)]
  File "/content/gpt-2/src/load_dataset.py", line 74, in sample
    self.chunks
ZeroDivisionError: integer division or modulo by zero
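The last frame fails on an expression involving self.chunks, so one plausible cause (an assumption from the traceback alone, not verified against load_dataset.py) is that no data was loaded, for example because the dataset path is wrong or the file is empty. A quick pre-flight check along these lines can rule that out; the helper name is hypothetical.

```python
import os
import tempfile


def check_dataset(path):
    """Return the file size in bytes, or raise if the dataset is missing or empty."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"dataset not found: {path}")
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError(f"dataset is empty: {path}")
    return size


if __name__ == "__main__":
    # Demo on a throwaway file; in the notebook the path would be the --dataset argument.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("some training text")
    print(check_dataset(f.name))
    os.remove(f.name)
```

The training loop asks the sampler for 1024 tokens at a time (data_sampler.sample(1024)), so a file that exists but contains very little text may also be worth ruling out.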
I can't download model.sh anymore. Is there a reason why?
As the title says, I just want to train a GPT-2 model from scratch on my own dataset, rather than retraining or fine-tuning a base model.
Looking forward to your reply!