Comments (3)
Huh, it is strange for it to exit with no error message. Can you add some print statements to find exactly where it exits?
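One way to narrow down a silent exit, sketched with the Python standard library only. The `trace_step` helper is a hypothetical name and is not part of ELECTRA. It prints a flushed marker before and after a step, so the last "[enter]" line without a matching "[exit]" points at the step that stopped. `faulthandler.enable()` additionally asks Python to dump a traceback if the interpreter crashes hard.

```python
import contextlib
import faulthandler
import sys


@contextlib.contextmanager
def trace_step(name, stream=sys.stderr):
    """Hypothetical helper: print a flushed marker around one step."""
    print(f"[enter] {name}", file=stream, flush=True)
    yield
    # Only reached if the step finished without raising or crashing.
    print(f"[exit]  {name}", file=stream, flush=True)


if __name__ == "__main__":
    # Dump a Python traceback on hard crashes such as a segfault.
    faulthandler.enable()
    with trace_step("load examples"):
        examples = list(range(3))  # stand-in for the real step
```

Flushing each marker matters here: if the process dies abruptly, buffered output may never reach the console.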
from electra.
I added some print statements to preprocessing.py to see what's going on:
`utils.log("Loading dataset", dataset_name)
n_examples = None
if (self._config.use_tfrecords_if_existing and
    tf.io.gfile.exists(metadata_path)):
  n_examples = utils.load_json(metadata_path)["n_examples"]
print(" n_examples = utils.load_json(metadata_path)[n_examples]\n")
if n_examples is None:
  utils.log("Existing tfrecords not found so creating")
  examples = []
  print("\n examples = [] \n")
  for task in tasks:
    print("\ntask:",task,"\n")
    task_examples = task.get_examples(split)
    print("\n task_examples = task.get_examples(split)\n")
    examples += task_examples
    print("\nexamples += task_examples\n")
  if is_training:
    random.shuffle(examples)
    print("\n random.shuffle(examples) \n")
  utils.mkdir(tfrecords_path.rsplit("/", 1)[0])
  print('\n utils.mkdir(tfrecords_path.rsplit("/", 1)[0]) \n')
  n_examples = self.serialize_examples(
      examples, is_training, tfrecords_path, batch_size)
  print("\n n_examples = self.serialize_examples( \n")
  utils.write_json({"n_examples": n_examples}, metadata_path)`
The output looks like this:
Loading dataset squad_train
n_examples = utils.load_json(metadata_path)[n_examples]
Existing tfrecords not found so creating
examples = []
task: Task(squad)
(env_tf115) D:\python_code\NLP\electra>
It seems that the call that fails is "task_examples = task.get_examples(split)", because nothing after it is printed.
I'm trying to figure out why.
To see what is going on inside the .get_examples method, I checked qa_tasks.py
and found that execution stops at this line:
input_data = json.load(f)["data"]
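A minimal sketch of how that load could be made to fail loudly instead of exiting quietly. It assumes a local file and uses the built-in `open` rather than `tf.io.gfile.GFile`, so it is a stand-in for the code in qa_tasks.py and not a copy of it. `load_squad_data` is a hypothetical name.

```python
import json
import os


def load_squad_data(path):
    """Hypothetical helper: return the "data" field of a SQuAD-style JSON file."""
    abs_path = os.path.abspath(path)
    # Report the fully resolved path so a wrong relative path is obvious.
    if not os.path.isfile(abs_path):
        raise FileNotFoundError(f"SQuAD file not found: {abs_path}")
    with open(abs_path, "r", encoding="utf-8") as f:
        payload = json.load(f)
    if "data" not in payload:
        raise KeyError(f'no "data" key in {abs_path}')
    return payload["data"]
```

Calling `load_squad_data("squad/train.json")` from the wrong working directory would then raise an error that names the absolute path it actually tried.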
Updated:
In my case, this code in finetune/qa/qa_tasks.py is not working as expected:
`with tf.io.gfile.GFile(os.path.join(
    self.config.raw_data_dir(self.name),
    split + ("-debug" if self.config.debug else "") + ".json"), "r") as f:`
I printed
os.path.join(self.config.raw_data_dir(self.name), split + ("-debug" if self.config.debug else "") + ".json")
to check whether the path is right.
My SQuAD data directory is "D:\python_code\NLP\electra\datadir\finetuning_data\squad",
so the SQuAD training set path should be
"D:\python_code\NLP\electra\datadir\finetuning_data\squad\train.json"
But the path that was printed is "squad\train.json".
So in that line I replaced
self.config.raw_data_dir(self.name)
with my data directory path.
The program then started writing tfrecord files, so I think this issue is solved.
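Rather than hard-coding the directory, the same path could be built from a base data directory and made absolute. This is only a sketch: `raw_data_path` is a hypothetical helper, not an ELECTRA function, and the `finetuning_data/<task>/<split>.json` layout is taken from the paths quoted above.

```python
import os


def raw_data_path(data_dir, task_name, split, debug=False):
    """Hypothetical helper: absolute path to <data_dir>/finetuning_data/<task>/<split>.json."""
    filename = split + ("-debug" if debug else "") + ".json"
    # abspath resolves a relative data_dir against the current working directory.
    return os.path.abspath(
        os.path.join(data_dir, "finetuning_data", task_name, filename))


if __name__ == "__main__":
    path = raw_data_path("datadir", "squad", "train")
    print(path, "exists:", os.path.isfile(path))
```

Printing the resolved path together with an existence check makes a mismatch such as "squad\train.json" visible before the file is opened.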