as-ideas / headliner

๐Ÿ– Easy training and deployment of seq2seq models.

Home Page: https://as-ideas.github.io/headliner/

License: Other

Python 99.99% Shell 0.01%
neural-network nlp python seq2seq tensorflow

headliner's People

Contributors

cschaefer26, datitran

headliner's Issues

Does headliner support beam search during decoding?

Hello, I want to know whether headliner supports beam search during prediction. If it does, how can I use it? The documentation doesn't mention it, and when I checked the source code it looks like greedy search is used during decoding.
Many thanks.
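
For reference, a minimal generic sketch of what beam search would add over greedy decoding; this is not part of headliner's API, and step_fn is a hypothetical stand-in for the model's next-token distribution:

import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    # step_fn(prefix) -> list of (token, probability) pairs; hypothetical.
    beams = [([start_token], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == end_token:
                candidates.append((seq, score))
                continue
            for token, prob in step_fn(seq):
                candidates.append((seq + [token], score + math.log(prob)))
        # keep the k best prefixes instead of only the single best one (greedy search)
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == end_token for seq, _ in beams):
            break
    return beams[0][0]

Greedy decoding is the special case beam_width=1.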

Question: Pre-trained BERT for MT

Hi,

Has there been any research on your side comparing the BLEU score of this pre-trained BERT setup for MT against SOTA results?
Or is it merely meant to illustrate the flexibility of the library, e.g., attaching a decoder and re-training on a custom dataset?

Import library

Hi,

Thank you so much for sharing this library. It seems to be very easy to use compared to others.

I tried to import the library to reproduce your example, but I receive an error message:

File "/home/ubuntu/.local/lib/python3.5/site-packages/headliner/model/summarizer.py", line 14
self.vectorizer: Union[Vectorizer, None] = None
^
SyntaxError: invalid syntax

I am running Python 3.5.2.

Am I doing something wrong?

Thanks
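
For context, the failing line uses a PEP 526 variable annotation, which was only introduced in Python 3.6, so Python 3.5 rejects it as a SyntaxError. A minimal illustration (not headliner code):

from typing import Optional

class Example:
    def __init__(self):
        # Valid on Python >= 3.6 only (PEP 526 variable annotation):
        self.vectorizer: Optional[object] = None
        # On Python 3.5 the equivalent would be a type comment:
        # self.vectorizer = None  # type: Optional[object]

Upgrading to Python 3.6 or newer avoids this particular error.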

Training with a custom AmazonFoodReviews dataset for text summarization

Hi,
First of all, thanks for open-sourcing such an easy-to-use sequence-to-sequence library.
I wanted to try Headliner, so as a test I trained a summarization model on a custom AmazonFoodReviews dataset, but training ended up with a loss of 4.662851969401042.

I used TransformerSummarizer to train the custom model; the code is below.

summarizer = TransformerSummarizer(num_heads=1, embedding_size=64, max_prediction_len=20)
trainer = Trainer(batch_size=2, steps_per_epoch=100)
trainer.train(summarizer, training_data, num_epochs=100)
summarizer.save('/tmp/summarizer')

The training data was a list of tuples (only 10 samples shown here):
[('have bought several of the vitality canned dog food products and have found them all to be of good quality the product looks more like stew than processed meat and it smells better my labrador is finicky and she appreciates this product better than most', 'good quality dog food'), ('product arrived labeled as jumbo salted peanuts the peanuts were actually small sized unsalted not sure if this was an error or if the vendor intended to represent the product as jumbo', 'not as advertised'), ('this is confection that has been around few centuries it is light pillowy citrus gelatin with nuts in this case filberts and it is cut into tiny squares and then liberally coated with powdered sugar and it is tiny mouthful of heaven not too chewy and very flavorful highly recommend this yummy treat if you are familiar with the story of lewis the lion the witch and the wardrobe this is the treat that seduces edmund into selling out his brother and sisters to the witch', 'delight says it all'), ('if you are looking for the secret ingredient in robitussin believe have found it got this in addition to the root beer extract ordered and made some cherry soda the flavor is very medicinal', 'cough medicine'), ('great taffy at great price there was wide assortment of yummy taffy delivery was very quick if your taffy lover this is deal', 'great taffy'), ('got wild hair for taffy and ordered this five pound bag the taffy was all very enjoyable with many flavors watermelon root beer melon peppermint grape etc my only complaint is there was bit too much red black licorice flavored pieces between me my kids and my husband this lasted only two weeks would recommend this brand of taffy it was delightful treat', 'nice taffy'), ('this saltwater taffy had great flavors and was very soft and chewy each candy was individually wrapped well none of the candies were stuck together which did happen in the expensive version fralinger would highly recommend this candy served it at beach themed party and everyone loved it', 'great just as good as the expensive brands'), ('this taffy is so good it is very soft and chewy the flavors are amazing would definitely recommend you buying it very satisfying', 'wonderful tasty taffy'), ('right now am mostly just sprouting this so my cats can eat the grass they love it rotate it around with wheatgrass and rye too', 'yay barley'), ('this is very healthy dog food good for their digestion also good for small puppies my dog eats her required amount at every feeding', 'healthy dog food')]

vocab encoder: 18122, vocab decoder: 4439

But the predictions were very poor.

I then gave BertSummarizer a try on the same dataset, which raised the error: TypeError: generator yielded an element that did not match the expected structure. The expected structure was (tf.int32, tf.int32, tf.int32), but the yielded element was ([3, 7293, 1725, 14131, 10785, 16089, 17337, 2220, 4703, 6185, 12287, 574, 7293, 6281, 16104, 414, 16352, 1242, 10785, 6793, 12569, 16089, 12274, 9204, 10148, 9029, 15200, 16066, 12257, 9667, 574, 8317, 14571, 1408, 10321, 8751, 8290, 5960, 574, 14182, 736, 16193, 12274, 1408, 16066, 10171, 2], [3, 1655, 3098, 1136, 1500, 2]).

I would be very thankful if you could help me out.

Thanks.

BERT model prediction giving the same output

Hi @cschaefer26 and @datitran ,

Thank you so much for the headliner library. I have been playing around with it for a while and have really enjoyed it so far.

I was working on a summarization use case using headliner's BERT model. I followed the README code below (with a few tweaks to the summarizer parameters) with my own train_data:

from headliner.preprocessing import Preprocessor

train_data = [('Some inputs.', 'Some outputs.')] * 10

# use BERT-specific start and end token
preprocessor = Preprocessor(start_token='[CLS]',
                            end_token='[SEP]',
                            lower_case=True)
train_prep = [preprocessor(t) for t in train_data]
targets_prep = [t[1] for t in train_prep]


from tensorflow_datasets.core.features.text import SubwordTextEncoder
from transformers import BertTokenizer
from headliner.preprocessing.vectorizer import Vectorizer
from headliner.trainer import Trainer
from headliner.model import SummarizerBert

# Use a pre-trained BERT embedding and BERT tokenizer for the encoder 
tokenizer_input = BertTokenizer.from_pretrained('bert-base-uncased')
tokenizer_target = SubwordTextEncoder.build_from_corpus(
    targets_prep, target_vocab_size=2**13,  reserved_tokens=[preprocessor.start_token, preprocessor.end_token])

vectorizer = Vectorizer(tokenizer_input, tokenizer_target)
summarizer = SummarizerBert(num_heads=4,
                            feed_forward_dim=512,
                            num_layers_encoder=3,
                            num_layers_decoder=3,
                            bert_embedding_encoder='bert-base-uncased',
                            embedding_encoder_trainable=False,
                            embedding_size_encoder=768,
                            embedding_size_decoder=64,
                            dropout_rate=0.1,
                            max_prediction_len=400)
summarizer.init_model(preprocessor, vectorizer)

trainer = Trainer(batch_size=2)
trainer.train(summarizer, train_data, num_epochs=200)

I trained the model for 200 epochs and the logs show the loss steadily decreasing (from around 4 down to 0.69), so training seems to work fine.

After that, when I run prediction on the saved model using the following code, it always returns the same prediction for any test_sentence:

from headliner.model.summarizer_bert import SummarizerBert

summarizer = SummarizerBert.load('/path/to/headliner_bert_model')
summarizer.predict(test_sentence)

Can anyone advise whether I am missing anything in the prediction step for the BERT model?

Improve tokenization

  • use SubwordTokenizer
  • make the fit to the data iterative (currently all data is loaded into memory); see the sketch below
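
A minimal sketch of how an iterative fit could look, assuming the target texts are streamed from a file; the file name and helper are hypothetical, and SubwordTextEncoder.build_from_corpus already accepts a generator, so the corpus does not have to be held in memory:

from tensorflow_datasets.core.features.text import SubwordTextEncoder

def stream_targets(path):
    # Hypothetical helper: yield one target text per line so the whole
    # corpus never has to be materialized in memory.
    with open(path, encoding='utf-8') as f:
        for line in f:
            yield line.strip()

# build_from_corpus consumes the corpus generator lazily
tokenizer = SubwordTextEncoder.build_from_corpus(
    stream_targets('targets.txt'), target_vocab_size=2**13)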

Error while loading trained model

I followed the documentation and got an error with the following code:

Training part

NUM_UNITS = 1024
BATCH_SIZE = 32
STEPS_PER_EPOCH = len(data) // BATCH_SIZE
STEPS_TO_LOG = 100
MAX_OUTPUT_LENGTH = 50
EPOCHS = 20
EMB_SIZE = 128

from headliner.trainer import Trainer
from headliner.model.summarizer_attention import SummarizerAttention

summarizer = SummarizerAttention(lstm_size=NUM_UNITS, embedding_size=EMB_SIZE)
trainer = Trainer(batch_size=BATCH_SIZE, 
                  steps_per_epoch=STEPS_PER_EPOCH, 
                  steps_to_log=STEPS_TO_LOG, 
                  max_output_len=MAX_OUTPUT_LENGTH, 
                  model_save_path=save_path)
trainer.train(summarizer, train, num_epochs=EPOCHS, val_data=test)

Loading the trained model:

summarizer_loaded = SummarizerAttention.load('summarizer')
trainer = Trainer(batch_size=2)
trainer.train(summarizer_loaded, data)
---------------------------------------------------------------------------
UnknownError                              Traceback (most recent call last)
<ipython-input-20-1bcb176df5c9> in <module>()
      1 summarizer_loaded = SummarizerAttention.load('summarizer')
      2 trainer = Trainer(batch_size=2)
----> 3 trainer.train(summarizer_loaded, data)
      4 # summarizer_loaded.save('/tmp/summarizer_retrained')

C:\ProgramData\Anaconda3\lib\site-packages\headliner\trainer.py in train(self, summarizer, train_data, val_data, num_epochs, scorers, callbacks)
    203         train_step = summarizer.new_train_step(self.loss_function, self.batch_size, apply_gradients=True)
    204         while epoch_count < num_epochs:
--> 205             for train_source_seq, train_target_seq in train_dataset.take(-1):
    206                 batch_count += 1
    207                 current_loss = train_step(train_source_seq, train_target_seq)

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in __next__(self)
    620 
    621   def __next__(self):  # For Python 3 compatibility
--> 622     return self.next()
    623 
    624   def _next_internal(self):

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in next(self)
    664     """Returns a nested structure of `Tensor`s containing the next element."""
    665     try:
--> 666       return self._next_internal()
    667     except errors.OutOfRangeError:
    668       raise StopIteration

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in _next_internal(self)
    649             self._iterator_resource,
    650             output_types=self._flat_output_types,
--> 651             output_shapes=self._flat_output_shapes)
    652 
    653       try:

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\ops\gen_dataset_ops.py in iterator_get_next_sync(iterator, output_types, output_shapes, name)
   2670       else:
   2671         message = e.message
-> 2672       _six.raise_from(_core._status_to_exception(e.code, message), None)
   2673   # Add nodes to the TensorFlow graph.
   2674   if not isinstance(output_types, (list, tuple)):

C:\ProgramData\Anaconda3\lib\site-packages\six.py in raise_from(value, from_value)

UnknownError: AttributeError: 'Vectorizer' object has no attribute 'max_input_len'
Traceback (most recent call last):

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\ops\script_ops.py", line 221, in __call__
    ret = func(*args)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py", line 585, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id))

  File "C:\ProgramData\Anaconda3\lib\site-packages\headliner\trainer.py", line 264, in <genexpr>
    data_vectorized = (vectorizer(d) for d in data_preprocessed)

  File "C:\ProgramData\Anaconda3\lib\site-packages\headliner\preprocessing\vectorizer.py", line 42, in __call__
    if self.max_input_len is not None:

AttributeError: 'Vectorizer' object has no attribute 'max_input_len'


	 [[{{node PyFunc}}]] [Op:IteratorGetNextSync]

Training with longer examples leads to `InvalidArgumentError`

When I try to train a SummarizerTransformer on longer training examples, I get the following error in train_step: InvalidArgumentError: Incompatible shapes: [1,11,64] vs. [1,8,64]. It looks like it depends on the length of the targets.

Minimal example:

from headliner.trainer import Trainer
from headliner.model.summarizer_transformer import SummarizerTransformer

data = [
        ('You are the stars, earth and sky for me!', 'I love you I love you I love you.'),
        ('You are the stars, earth and sky for me!', 'I love you.')
]

summarizer = SummarizerTransformer(embedding_size=64, max_prediction_len=20)
trainer = Trainer(batch_size=1, steps_per_epoch=100)
trainer.train(summarizer, data, num_epochs=1)

This leads to an InvalidArgumentError, while the following,

from headliner.trainer import Trainer
from headliner.model.summarizer_transformer import SummarizerTransformer

data = [
        ('You are the stars, earth and sky for me!', 'I love you.'),
        ('You are the stars, earth and sky for me!', 'I love you.')
]

summarizer = SummarizerTransformer(embedding_size=64, max_prediction_len=20)
trainer = Trainer(batch_size=1, steps_per_epoch=100)
trainer.train(summarizer, data, num_epochs=1)

which drops the ('You are the stars, earth and sky for me!', 'I love you I love you I love you.') pair, works fine.

I use:

python==3.6
tensorflow==2.0.0
headliner==0.0.22

It does not depend on whether I run it on GPU or CPU.

Can you reproduce this bug?

Colab hosted notebooks for quick demo.

This is not an issue, more of a suggestion.

Would it be possible to have Google Colab hosted notebooks in the documentation, to play around with the code and run quick demos?
