as-ideas / headliner
Easy training and deployment of seq2seq models.
Home Page: https://as-ideas.github.io/headliner/
License: Other
Hello, I want to know whether headliner supports beam search during prediction. If it does, how can I use it? The documentation doesn't mention it, and when I checked the source code, it seems you use greedy search during decoding.
Many thanks.
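For context on the question above, the difference between greedy decoding and beam search can be sketched on a toy model. This is not headliner's API: `next_token_probs`, the tiny vocabulary, and the `beam_width` parameter are hypothetical names chosen purely for illustration, and no length normalization is applied.

```python
import math
from typing import Callable, Dict, List, Tuple

Sequence = Tuple[str, ...]


def next_token_probs(prefix: Sequence) -> Dict[str, float]:
    """Hypothetical toy model: next-token probabilities for a given prefix."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.55, "<end>": 0.45},
        ("b",): {"<end>": 0.9, "x": 0.1},
        ("a", "x"): {"<end>": 1.0},
        ("b", "x"): {"<end>": 1.0},
    }
    return table[prefix]


def greedy_search(step_fn: Callable[[Sequence], Dict[str, float]],
                  max_len: int = 10,
                  end_token: str = "<end>") -> Tuple[Sequence, float]:
    """Pick the single most probable token at every step."""
    seq: Sequence = ()
    log_prob = 0.0
    for _ in range(max_len):
        token, p = max(step_fn(seq).items(), key=lambda kv: kv[1])
        seq += (token,)
        log_prob += math.log(p)
        if token == end_token:
            break
    return seq, log_prob


def beam_search(step_fn: Callable[[Sequence], Dict[str, float]],
                beam_width: int = 2,
                max_len: int = 10,
                end_token: str = "<end>") -> Tuple[Sequence, float]:
    """Keep the `beam_width` best partial sequences at every step."""
    beams: List[Tuple[Sequence, float]] = [((), 0.0)]
    finished: List[Tuple[Sequence, float]] = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, p in step_fn(seq).items():
                candidates.append((seq + (token,), score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == end_token:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was reached
    return max(finished, key=lambda c: c[1])


greedy_seq, greedy_score = greedy_search(next_token_probs)
beam_seq, beam_score = beam_search(next_token_probs, beam_width=2)
```

On this toy table, greedy decoding commits to the locally best first token, while a beam of width 2 keeps the alternative alive and ends with a higher overall sequence probability. With `beam_width=1` the two procedures coincide.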
Hi,
Has there been any research on your side comparing BLEU scores against SOTA results for this pre-trained BERT for MT?
Or is it merely meant to illustrate the flexibility of the library, e.g., attaching a decoder and re-training on a custom dataset?
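For readers unfamiliar with the metric mentioned above, BLEU combines clipped n-gram precisions with a brevity penalty. The sketch below is a simplified, single-reference, unsmoothed sentence-level variant written for illustration only. `sentence_bleu` and `ngram_counts` are names chosen here, and published MT comparisons normally rely on corpus-level scores from a standard tool rather than a hand-rolled function like this one.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Simplified single-reference BLEU without smoothing."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty punishes candidates shorter than the reference.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat".split()
score_full = sentence_bleu(reference, reference)
score_short = sentence_bleu("the cat sat on".split(), reference)
```

A candidate that is a correct but truncated prefix of the reference keeps perfect n-gram precision and is penalized only through the brevity term.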
Hi,
Thank you so much for sharing this library. It seems to be very easy to use compared to others.
I tried to import the library to reproduce your example, but I received an error message:
File "/home/ubuntu/.local/lib/python3.5/site-packages/headliner/model/summarizer.py", line 14
self.vectorizer: Union[Vectorizer, None] = None
^
SyntaxError: invalid syntax
I am running Python 3.5.2.
Am I doing something wrong?
Thanks
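The line flagged in the traceback is a variable annotation (`name: Type = value`). That syntax was introduced in Python 3.6 by PEP 526, so a Python 3.5.2 interpreter cannot parse it. A minimal sketch of the difference follows; the `Summarizer` and `Vectorizer` classes here are hypothetical stand-ins, not headliner's actual classes.

```python
import sys
from typing import Union


class Vectorizer:
    """Hypothetical placeholder used only for illustration."""


class Summarizer:
    def __init__(self) -> None:
        # Python 3.6+ only (PEP 526): an annotated assignment.
        self.vectorizer: Union[Vectorizer, None] = None
        # Python 3.5-compatible alternative: a type comment.
        self.preprocessor = None  # type: Union[object, None]


def supports_variable_annotations(version_info=sys.version_info) -> bool:
    """Return True if the interpreter version can parse PEP 526 annotations."""
    return tuple(version_info[:2]) >= (3, 6)
```

Under that reading, running the library on Python 3.6 or later would be the simplest way forward.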
Currently, if a model is retrained, the embedding is switched to trainable=True. Use an additional flag for trainable embeddings that is restored when loading the model.
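One possible shape for this suggestion is to store the flag alongside the model's other hyperparameters and read it back on load. In the sketch below, `ToySummarizer`, `embedding_trainable`, and the `hyperparams.json` file layout are illustrative assumptions, not headliner's actual save format.

```python
import json
import os
import tempfile


class ToySummarizer:
    """Hypothetical stand-in for a model that owns an embedding layer."""

    def __init__(self, embedding_size: int = 64, embedding_trainable: bool = True) -> None:
        self.embedding_size = embedding_size
        self.embedding_trainable = embedding_trainable

    def save(self, out_dir: str) -> None:
        os.makedirs(out_dir, exist_ok=True)
        params = {
            "embedding_size": self.embedding_size,
            "embedding_trainable": self.embedding_trainable,
        }
        with open(os.path.join(out_dir, "hyperparams.json"), "w") as f:
            json.dump(params, f)

    @classmethod
    def load(cls, in_dir: str) -> "ToySummarizer":
        with open(os.path.join(in_dir, "hyperparams.json")) as f:
            params = json.load(f)
        # The stored flag is restored instead of falling back to the default.
        return cls(**params)


with tempfile.TemporaryDirectory() as tmp:
    ToySummarizer(embedding_trainable=False).save(tmp)
    restored = ToySummarizer.load(tmp)
```

With the flag persisted this way, a model saved with frozen embeddings would come back frozen when it is loaded for retraining.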
I am training a transformer model on Kaggle. After training, the models are not getting saved. It was working fine before the addition of the BERT transformer.
Here is the notebook: https://www.kaggle.com/mohitsaini235/chatbot?scriptVersionId=22931537
I have commented out some code so that you can run it quickly and see the results.
I am trying to use the Transformer to predict one sequence from another, but I cannot see how to bypass the tokenizer so that I can send my sequence directly to the Transformer.
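A generic pattern for this kind of request is a pass-through tokenizer that treats the input as already-encoded integer ids. The class below is a hypothetical sketch: `PassThroughTokenizer` and its `encode`/`decode`/`vocab_size` interface are assumptions made for illustration, and whether headliner's vectorizer accepts such an object is not confirmed here.

```python
from typing import List


class PassThroughTokenizer:
    """Hypothetical tokenizer for inputs that are already integer ids.

    Sequences are passed as space-separated ids, e.g. "5 17 3".
    """

    def __init__(self, vocab_size: int) -> None:
        self.vocab_size = vocab_size

    def encode(self, text: str) -> List[int]:
        ids = [int(tok) for tok in text.split()]
        for token_id in ids:
            if not 0 <= token_id < self.vocab_size:
                raise ValueError("id out of range: {}".format(token_id))
        return ids

    def decode(self, ids: List[int]) -> str:
        return " ".join(str(token_id) for token_id in ids)


tokenizer = PassThroughTokenizer(vocab_size=100)
encoded = tokenizer.encode("5 17 3")
decoded = tokenizer.decode(encoded)
```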
Hi,
First of all, thanks for bringing such an easy-to-use sequence-to-sequence NN to open source.
I was thinking of using Headliner, and for testing I started training a summarization model on custom data from AmazonFoodReviews. But it ended up with a loss of "4.662851969401042".
I was using TransformerSummarizer to train the custom model, and the code is below.
summarizer = TransformerSummarizer(num_heads=1, embedding_size=64, max_prediction_len=20)
trainer = Trainer(batch_size=2, steps_per_epoch=100)
trainer.train(summarizer, training_data, num_epochs=100)
summarizer.save('/tmp/summarizer')
The training data was a list of tuples (only 10 samples printed here):
[('have bought several of the vitality canned dog food products and have found them all to be of good quality the product looks more like stew than processed meat and it smells better my labrador is finicky and she appreciates this product better than most', 'good quality dog food'), ('product arrived labeled as jumbo salted peanuts the peanuts were actually small sized unsalted not sure if this was an error or if the vendor intended to represent the product as jumbo', 'not as advertised'), ('this is confection that has been around few centuries it is light pillowy citrus gelatin with nuts in this case filberts and it is cut into tiny squares and then liberally coated with powdered sugar and it is tiny mouthful of heaven not too chewy and very flavorful highly recommend this yummy treat if you are familiar with the story of lewis the lion the witch and the wardrobe this is the treat that seduces edmund into selling out his brother and sisters to the witch', 'delight says it all'), ('if you are looking for the secret ingredient in robitussin believe have found it got this in addition to the root beer extract ordered and made some cherry soda the flavor is very medicinal', 'cough medicine'), ('great taffy at great price there was wide assortment of yummy taffy delivery was very quick if your taffy lover this is deal', 'great taffy'), ('got wild hair for taffy and ordered this five pound bag the taffy was all very enjoyable with many flavors watermelon root beer melon peppermint grape etc my only complaint is there was bit too much red black licorice flavored pieces between me my kids and my husband this lasted only two weeks would recommend this brand of taffy it was delightful treat', 'nice taffy'), ('this saltwater taffy had great flavors and was very soft and chewy each candy was individually wrapped well none of the candies were stuck together which did happen in the expensive version fralinger would highly recommend this candy served it at beach themed party and 
everyone loved it', 'great just as good as the expensive brands'), ('this taffy is so good it is very soft and chewy the flavors are amazing would definitely recommend you buying it very satisfying', 'wonderful tasty taffy'), ('right now am mostly just sprouting this so my cats can eat the grass they love it rotate it around with wheatgrass and rye too', 'yay barley'), ('this is very healthy dog food good for their digestion also good for small puppies my dog eats her required amount at every feeding', 'healthy dog food')]
vocab encoder: 18122, vocab decoder: 4439
But the prediction failed badly.
Then I tried BertSummarizer on the same dataset.
It gives the error: TypeError:
generator yielded an element that did not match the expected structure. The expected structure was (tf.int32, tf.int32, tf.int32), but the yielded element was ([3, 7293, 1725, 14131, 10785, 16089, 17337, 2220, 4703, 6185, 12287, 574, 7293, 6281, 16104, 414, 16352, 1242, 10785, 6793, 12569, 16089, 12274, 9204, 10148, 9029, 15200, 16066, 12257, 9667, 574, 8317, 14571, 1408, 10321, 8751, 8290, 5960, 574, 14182, 736, 16193, 12274, 1408, 16066, 10171, 2], [3, 1655, 3098, 1136, 1500, 2]).
I would be very thankful if you could help me out.
Thanks.
Hi @cschaefer26 and @datitran,
Thank you so much for the headliner library. I have been playing around with it for a while and have really enjoyed it so far.
I was working on a summarization use case with headliner's BERT model. I followed the README code below (with a few tweaks to the SummarizerTransformer parameters), using my own train_data:
from headliner.preprocessing import Preprocessor
train_data = [('Some inputs.', 'Some outputs.')] * 10
# use BERT-specific start and end token
preprocessor = Preprocessor(start_token='[CLS]',
                            end_token='[SEP]',
                            lower_case=True)
train_prep = [preprocessor(t) for t in train_data]
targets_prep = [t[1] for t in train_prep]
from tensorflow_datasets.core.features.text import SubwordTextEncoder
from transformers import BertTokenizer
from headliner.model import SummarizerBert
# Use a pre-trained BERT embedding and BERT tokenizer for the encoder
tokenizer_input = BertTokenizer.from_pretrained('bert-base-uncased')
tokenizer_target = SubwordTextEncoder.build_from_corpus(
    targets_prep, target_vocab_size=2**13,
    reserved_tokens=[preprocessor.start_token, preprocessor.end_token])
from headliner.preprocessing.vectorizer import Vectorizer
vectorizer = Vectorizer(tokenizer_input, tokenizer_target)
summarizer = SummarizerBert(num_heads=4,
                            feed_forward_dim=512,
                            num_layers_encoder=3,
                            num_layers_decoder=3,
                            bert_embedding_encoder='bert-base-uncased',
                            embedding_encoder_trainable=False,
                            embedding_size_encoder=768,
                            embedding_size_decoder=64,
                            dropout_rate=0.1,
                            max_prediction_len=400)
summarizer.init_model(preprocessor, vectorizer)
from headliner.trainer import Trainer
trainer = Trainer(batch_size=2)
trainer.train(summarizer, train_data, num_epochs=200)
I trained the model for 200 epochs, and the logs show the loss steadily decreasing (from around 4 down to 0.69), so training seems to work fine.
After that, when I try to predict with the saved model using the following code, it always gives me the same prediction for any test_sentence:
from headliner.model.summarizer_bert import SummarizerBert
summarizer = SummarizerBert.load('/path/to/headliner_bert_model')
summarizer.predict(test_sentence)
Could anyone advise whether I am missing anything in the prediction part for the BERT model?
I followed the documentation and got an error with the following code:
Training part:
NUM_UNITS = 1024
BATCH_SIZE = 32
STEPS_PER_EPOCH = len(data) // BATCH_SIZE
STEPS_TO_LOG = 100
MAX_OUTPUT_LENGTH = 50
EPOCHS = 20
EMB_SIZE = 128
from headliner.trainer import Trainer
from headliner.model.summarizer_attention import SummarizerAttention
summarizer = SummarizerAttention(lstm_size=NUM_UNITS, embedding_size=EMB_SIZE)
trainer = Trainer(batch_size=BATCH_SIZE,
                  steps_per_epoch=STEPS_PER_EPOCH,
                  steps_to_log=STEPS_TO_LOG,
                  max_output_len=MAX_OUTPUT_LENGTH,
                  model_save_path=save_path)
trainer.train(summarizer, train, num_epochs=EPOCHS, val_data=test)
Loading pre-trained model:
summarizer_loaded = SummarizerAttention.load('summarizer')
trainer = Trainer(batch_size=2)
trainer.train(summarizer_loaded, data)
---------------------------------------------------------------------------
UnknownError Traceback (most recent call last)
<ipython-input-20-1bcb176df5c9> in <module>()
1 summarizer_loaded = SummarizerAttention.load('summarizer')
2 trainer = Trainer(batch_size=2)
----> 3 trainer.train(summarizer_loaded, data)
4 # summarizer_loaded.save('/tmp/summarizer_retrained')
C:\ProgramData\Anaconda3\lib\site-packages\headliner\trainer.py in train(self, summarizer, train_data, val_data, num_epochs, scorers, callbacks)
203 train_step = summarizer.new_train_step(self.loss_function, self.batch_size, apply_gradients=True)
204 while epoch_count < num_epochs:
--> 205 for train_source_seq, train_target_seq in train_dataset.take(-1):
206 batch_count += 1
207 current_loss = train_step(train_source_seq, train_target_seq)
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in __next__(self)
620
621 def __next__(self): # For Python 3 compatibility
--> 622 return self.next()
623
624 def _next_internal(self):
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in next(self)
664 """Returns a nested structure of `Tensor`s containing the next element."""
665 try:
--> 666 return self._next_internal()
667 except errors.OutOfRangeError:
668 raise StopIteration
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in _next_internal(self)
649 self._iterator_resource,
650 output_types=self._flat_output_types,
--> 651 output_shapes=self._flat_output_shapes)
652
653 try:
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\ops\gen_dataset_ops.py in iterator_get_next_sync(iterator, output_types, output_shapes, name)
2670 else:
2671 message = e.message
-> 2672 _six.raise_from(_core._status_to_exception(e.code, message), None)
2673 # Add nodes to the TensorFlow graph.
2674 if not isinstance(output_types, (list, tuple)):
C:\ProgramData\Anaconda3\lib\site-packages\six.py in raise_from(value, from_value)
UnknownError: AttributeError: 'Vectorizer' object has no attribute 'max_input_len'
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\ops\script_ops.py", line 221, in __call__
ret = func(*args)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py", line 585, in generator_py_func
values = next(generator_state.get_iterator(iterator_id))
File "C:\ProgramData\Anaconda3\lib\site-packages\headliner\trainer.py", line 264, in <genexpr>
data_vectorized = (vectorizer(d) for d in data_preprocessed)
File "C:\ProgramData\Anaconda3\lib\site-packages\headliner\preprocessing\vectorizer.py", line 42, in __call__
if self.max_input_len is not None:
AttributeError: 'Vectorizer' object has no attribute 'max_input_len'
[[{{node PyFunc}}]] [Op:IteratorGetNextSync]
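The final AttributeError suggests that the loaded `Vectorizer` object was saved before a `max_input_len` attribute existed on the class. That reading is an assumption, since the traceback alone does not confirm it. A generic sketch of how a serialized object can be made tolerant of attributes added later, using a hypothetical `ToyVectorizer` (not headliner's class):

```python
from typing import List, Optional


class ToyVectorizer:
    """Hypothetical class; `max_input_len` is assumed to be a newer attribute."""

    def __init__(self, vocab_size: int, max_input_len: Optional[int] = None) -> None:
        self.vocab_size = vocab_size
        self.max_input_len = max_input_len

    def __setstate__(self, state: dict) -> None:
        # Older saved states may lack attributes added later; fill in defaults.
        self.__dict__.update(state)
        self.__dict__.setdefault("max_input_len", None)

    def __call__(self, ids: List[int]) -> List[int]:
        if self.max_input_len is not None:
            return ids[: self.max_input_len]
        return ids


# Mimic what pickle does when loading an object whose saved state predates
# the attribute: create the instance without __init__, then apply the state.
old_state = {"vocab_size": 100}
restored = ToyVectorizer.__new__(ToyVectorizer)
restored.__setstate__(old_state)
result = restored([1, 2, 3])
```

Re-saving the model with the installed library version, or loading it with the version that produced it, would be other ways to avoid the mismatch if the assumption above holds.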
When I try to train a SummarizerTransformer on longer training examples, I get the following error in train_step: InvalidArgumentError: Incompatible shapes: [1,11,64] vs. [1,8,64]. It looks like it depends on the length of the targets.
Minimal example:
from headliner.trainer import Trainer
from headliner.model.summarizer_transformer import SummarizerTransformer
data = [
    ('You are the stars, earth and sky for me!', 'I love you I love you I love you.'),
    ('You are the stars, earth and sky for me!', 'I love you.')
]
summarizer = SummarizerTransformer(embedding_size=64, max_prediction_len=20)
trainer = Trainer(batch_size=1, steps_per_epoch=100)
trainer.train(summarizer, data, num_epochs=1)
This leads to an InvalidArgumentError, while
from headliner.trainer import Trainer
from headliner.model.summarizer_transformer import SummarizerTransformer
data = [
    ('You are the stars, earth and sky for me!', 'I love you.'),
    ('You are the stars, earth and sky for me!', 'I love you.')
]
summarizer = SummarizerTransformer(embedding_size=64, max_prediction_len=20)
trainer = Trainer(batch_size=1, steps_per_epoch=100)
trainer.train(summarizer, data, num_epochs=1)
without the ('You are the stars, earth and sky for me!', 'I love you I love you I love you.') pair, works fine.
I use:
python==3.6
tensorflow==2.0.0
headliner==0.0.22
It does not depend on whether I run it on a GPU or on CPU only.
Can you reproduce this bug?
This is not an issue; it is more of a suggestion.
Is it possible to have Google Colab hosted notebooks like this in the documentation, to play around with the code and run quick demos?