
Comments (3)

clarkkev commented on June 22, 2024

Huh, that is strange for it to exit with no error message. Can you add some print statements to find exactly at which point it exits?
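One way to narrow this down, sketched under assumptions and not taken from the ELECTRA code: flush every progress marker so it is not lost if the process dies abruptly, and enable Python's standard `faulthandler` module, which may dump a traceback if the interpreter crashes in native code. The `checkpoint` helper is a hypothetical name used only for illustration.

```python
import faulthandler
import sys

# Ask Python to dump a traceback to stderr on a fatal error such as a
# segmentation fault, which can otherwise look like a silent exit.
faulthandler.enable()


def checkpoint(message):
    """Print a progress marker to stderr and flush it immediately.

    Hypothetical helper: buffered output can be lost when a process
    dies abruptly, so each marker is flushed as soon as it is written.
    """
    line = "[checkpoint] " + message
    print(line, file=sys.stderr, flush=True)
    return line


checkpoint("before loading examples")
# ... the call under suspicion would go here ...
checkpoint("after loading examples")
```

The last marker that appears before the process exits brackets the statement that never returned.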

from electra.

curtis0982 commented on June 22, 2024

I added some print statements to preprocessing.py to see what's going on:
`utils.log("Loading dataset", dataset_name)
n_examples = None
if (self._config.use_tfrecords_if_existing and
    tf.io.gfile.exists(metadata_path)):
  n_examples = utils.load_json(metadata_path)["n_examples"]
# Debug print; it sits outside the if, so it runs whether or not metadata was loaded.
print("      n_examples = utils.load_json(metadata_path)[n_examples]\n")
if n_examples is None:
  utils.log("Existing tfrecords not found so creating")
  examples = []
  print("\n examples = [] \n")
  for task in tasks:
    print("\ntask:", task, "\n")
    task_examples = task.get_examples(split)
    print("\n task_examples = task.get_examples(split)\n")
    examples += task_examples
    print("\nexamples += task_examples\n")
  if is_training:
    random.shuffle(examples)
    print("\n random.shuffle(examples) \n")
  utils.mkdir(tfrecords_path.rsplit("/", 1)[0])
  # Single quotes inside the message so the string literal is not cut short.
  print("\n utils.mkdir(tfrecords_path.rsplit('/', 1)[0]) \n")
  n_examples = self.serialize_examples(
      examples, is_training, tfrecords_path, batch_size)
  print("\n n_examples = self.serialize_examples( \n")
  utils.write_json({"n_examples": n_examples}, metadata_path)`

The output looks like this:
Loading dataset squad_train
n_examples = utils.load_json(metadata_path)[n_examples]

Existing tfrecords not found so creating

examples = []

task: Task(squad)

(env_tf115) D:\python_code\NLP\electra>

The call that does not seem to return is "task_examples = task.get_examples(split)": nothing after it is printed.
I'm trying to figure it out.

To see what is going on inside the get_examples method, I checked qa_tasks.py and found that this is the line that fails:
input_data = json.load(f)["data"]
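To make a failure at that line visible, the load can be tried in isolation. The sketch below is an assumption for debugging, not the code in qa_tasks.py: `load_squad_data` is a hypothetical helper that uses the built-in `open()` instead of `tf.io.gfile.GFile`, so a bad path or an unexpected file layout raises an ordinary Python exception.

```python
import json
import os


def load_squad_data(path):
    """Load the top-level "data" entry from a SQuAD-style JSON file.

    Hypothetical debugging helper: it checks the path first so that a
    missing file is reported with its full absolute location.
    """
    absolute_path = os.path.abspath(path)
    if not os.path.isfile(absolute_path):
        raise FileNotFoundError("No such file: " + absolute_path)
    with open(absolute_path, "r", encoding="utf-8") as f:
        contents = json.load(f)
    if "data" not in contents:
        raise KeyError('Top-level key "data" is missing in ' + absolute_path)
    return contents["data"]
```

Calling it with the path the program is about to open shows whether the file can be found from the current working directory at all.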


curtis0982 commented on June 22, 2024

Update:
In my case, some code in finetune/qa/qa_tasks.py was not working properly:
'with tf.io.gfile.GFile(os.path.join(
self.config.raw_data_dir(self.name),
split + ("-debug" if self.config.debug else "") + ".json"), "r") as f:'
I printed
os.path.join(self.config.raw_data_dir(self.name), split + ("-debug" if self.config.debug else "") + ".json")
to check whether the path was right.

My SQuAD data directory is "D:\python_code\NLP\electra\datadir\finetuning_data\squad",
so the path to the SQuAD training set should be
"D:\python_code\NLP\electra\datadir\finetuning_data\squad\train.json"
but it was actually "squad\train.json".
So I replaced
self.config.raw_data_dir(self.name)
with my data directory path.
The program then started creating the tfrecord files, so I think this issue is solved.
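As an alternative to hard-coding the directory, the expected file location can be built explicitly and checked before it is opened. This is only a sketch: `resolve_split_path` is a hypothetical helper, and the `finetuning_data/<task>/<split>.json` layout is assumed from the paths quoted above, not from the ELECTRA configuration code.

```python
import os


def resolve_split_path(data_dir, task_name, split, debug=False):
    """Return the absolute path of a task's split file.

    Hypothetical helper; assumes the layout
    <data_dir>/finetuning_data/<task_name>/<split>[-debug].json.
    """
    file_name = split + ("-debug" if debug else "") + ".json"
    path = os.path.join(data_dir, "finetuning_data", task_name, file_name)
    # abspath makes the result independent of how data_dir was written,
    # so a relative path is easy to spot when it is printed.
    return os.path.abspath(path)


path = resolve_split_path("datadir", "squad", "train")
print(path, "exists:", os.path.isfile(path))
```

Printing the absolute path together with the existence check shows right away whether the program is looking in the intended directory.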
