Can you provide the train.tf_record file?

train.tf_record not found (abstractive-summarization-with-transfer-learning, closed)

santhoshkolloju commented on June 6, 2024

train.tf_record not found

from abstractive-summarization-with-transfer-learning.

Comments (5)

santhoshkolloju commented on June 6, 2024

Give me a few days and I will write a detailed post on how you can run this on your own data.

santhoshkolloju commented on June 6, 2024

I have provided the code to generate the TFRecord files.

Vibha111094 commented on June 6, 2024

Do we need to create an empty file gs://bert_summ/train.tf_record first and then call the function 'file_based_convert_examples_to_features'?

callzhang commented on June 6, 2024

> I have provided the code to generate the tf records file

Is this the right way?

def get_dataset(processor,
                tokenizer,
                data_dir,
                max_seq_length_src,
                max_seq_length_tgt,
                batch_size,
                mode,
                output_dir,
                is_distributed=False):
    """
    Args:
        processor: Data preprocessor; must define get_labels and
            get_train_examples/get_dev_examples/get_test_examples.
        tokenizer: The sentence tokenizer. Generally should be a
            SentencePiece model.
        data_dir: The input data directory.
        max_seq_length_src: Max source sequence length.
        max_seq_length_tgt: Max target sequence length.
        batch_size: Mini-batch size.
        mode: `train`, `eval` or `test`.
        output_dir: The directory to save the TFRecords in.
        is_distributed: Passed through to file_based_input_fn_builder.
    """
    # label_list = processor.get_labels()
    if mode == 'train':
        train_examples = processor.get_train_examples(data_dir)
        # train_file = os.path.join(output_dir, "train.tf_record")
        train_file = "gs://bert_summarization/train.tf_record"
        file_based_convert_examples_to_features(
            train_examples, max_seq_length_src, max_seq_length_tgt,
            tokenizer, train_file)
        dataset = file_based_input_fn_builder(
            input_file=train_file,
            max_seq_length_src=max_seq_length_src,
            max_seq_length_tgt=max_seq_length_tgt,
            is_training=True,
            drop_remainder=True,
            is_distributed=is_distributed)({'batch_size': batch_size})
    elif mode == 'eval':
        eval_examples = processor.get_dev_examples(data_dir)
        # eval_file = os.path.join(output_dir, "eval.tf_record")
        eval_file = "gs://bert_summarization/eval.tf_record"
        file_based_convert_examples_to_features(
            eval_examples, max_seq_length_src, max_seq_length_tgt,
            tokenizer, eval_file)
        dataset = file_based_input_fn_builder(
            input_file=eval_file,
            max_seq_length_src=max_seq_length_src,
            max_seq_length_tgt=max_seq_length_tgt,
            is_training=False,
            drop_remainder=True,
            is_distributed=is_distributed)({'batch_size': batch_size})
    elif mode == 'test':
        test_examples = processor.get_test_examples(data_dir)
        # test_file = os.path.join(output_dir, "predict.tf_record")
        test_file = "gs://bert_summarization/predict.tf_record"
        file_based_convert_examples_to_features(
            test_examples, max_seq_length_src, max_seq_length_tgt,
            tokenizer, test_file)
        dataset = file_based_input_fn_builder(
            input_file=test_file,
            max_seq_length_src=max_seq_length_src,
            max_seq_length_tgt=max_seq_length_tgt,
            is_training=False,
            drop_remainder=True,
            is_distributed=is_distributed)({'batch_size': batch_size})
    return dataset
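
The three branches above differ mainly in the record file name, and the bucket path is hardcoded in each. One possible way to derive the path from output_dir instead is sketched below. This is only an illustration: `record_path` and `RECORD_FILE_NAMES` are hypothetical names that are not part of the repository, and the sketch assumes output_dir is either a local directory or a gs:// URI.

```python
import posixpath

# Hypothetical mapping from mode to record file name; the file names
# mirror the ones used in get_dataset above.
RECORD_FILE_NAMES = {
    "train": "train.tf_record",
    "eval": "eval.tf_record",
    "test": "predict.tf_record",
}


def record_path(output_dir, mode):
    """Return the TFRecord path for `mode` under `output_dir`."""
    if mode not in RECORD_FILE_NAMES:
        raise ValueError(
            "mode must be one of %s, got %r" % (sorted(RECORD_FILE_NAMES), mode))
    # posixpath.join always joins with "/", so the result stays valid for
    # gs:// URIs regardless of the operating system running the script.
    return posixpath.join(output_dir, RECORD_FILE_NAMES[mode])
```

With this helper, a line such as train_file = record_path(output_dir, mode) could stand in for each hardcoded assignment, and an unknown mode would raise an error instead of reaching `return dataset` with the variable undefined.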

callzhang commented on June 6, 2024

> give me few days i will write a detailed post how you can run on your data

That will be awesome. I have gone ahead and started training with the code modification mentioned above. I would love to read your post and learn more details about it.
