Comments (11)
No, you could train the model for longer.
from recosa.
Thank you for your response. I also find that the loss on the training data does not decrease efficiently.
0%| | 0/1135 [00:00<?, ?b/s]train loss:5.54259
Exception in thread Thread-102:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/admin/.local/lib/python3.6/site-packages/tqdm/_monitor.py", line 62, in run
    for instance in self.tqdm_cls._instances:
  File "/usr/lib/python3.6/_weakrefset.py", line 60, in __iter__
    for itemref in self.data:
RuntimeError: Set changed size during iteration
WARNING:tensorflow:Issue encountered when serializing global_step.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
'Tensor' object has no attribute 'to_proto'
eval loss:5.62934
Bleu Score:2.17809
0%| | 0/1135 [00:00<?, ?b/s]train loss:5.40310
WARNING:tensorflow:Issue encountered when serializing global_step.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
'Tensor' object has no attribute 'to_proto'
eval loss:5.62677
Bleu Score:2.22458
0%| | 0/1135 [00:00<?, ?b/s]train loss:5.75812
It also prints some warnings. Could these be the reason the model doesn't perform well?
from recosa.
I haven't encountered this problem. I think you could train for longer and see the results; 200 epochs is not enough. You could check the output along the way to see whether the responses become longer and more varied.
from recosa.
Thank you. I now have 40,000 pairs as training data. How many epochs do you think is enough?
from recosa.
Maybe 10000 epochs, to see the results. You could also try a small dataset, such as 100 pairs, to check whether the model is running correctly.
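The small-dataset check suggested above could be set up roughly as follows. This is only a sketch: `take_pairs` is a hypothetical helper, not part of the ReCoSa code, and it assumes the contexts and responses are already loaded as two aligned lists of strings.

```python
from typing import List, Sequence, Tuple


def take_pairs(
    contexts: Sequence[str], responses: Sequence[str], n: int = 100
) -> List[Tuple[str, str]]:
    """Return the first n (context, response) pairs for a quick sanity run.

    Hypothetical helper: assumes the two sequences are aligned, i.e.
    responses[i] is the reply to contexts[i].
    """
    if len(contexts) != len(responses):
        raise ValueError("contexts and responses must have the same length")
    return list(zip(contexts[:n], responses[:n]))


# Placeholder data standing in for a real corpus.
all_contexts = [f"context {i}" for i in range(500)]
all_responses = [f"response {i}" for i in range(500)]

small_set = take_pairs(all_contexts, all_responses, n=100)
print(len(small_set))
```

If the model cannot fit even these 100 pairs, the problem is more likely in the setup than in the number of epochs.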
from recosa.
Hi, I have the same issue as BBLoatheb. I have trained for 500 epochs on the JD dataset, which contains 500,000 pairs, but the model always produces the same responses. Should I train for longer? It took about 5 days to run the 500 epochs in my experiment.
from recosa.
I tried a smaller dataset (100 pairs) for about 200 epochs, and this time the model performs well. Maybe just training it for longer would help. But it took you about 5 days to run 500 epochs; at this speed, 10000 epochs would take 100 days, which sounds terrible.
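The 100-day figure is a straight linear extrapolation from the numbers quoted above. A minimal sketch, assuming the time per epoch stays constant (`projected_days` is just an illustrative name):

```python
def projected_days(days_so_far: float, epochs_so_far: int, target_epochs: int) -> float:
    """Linearly extrapolate wall-clock days, assuming constant time per epoch."""
    return days_so_far * target_epochs / epochs_so_far


# 5 days for 500 epochs, extrapolated to 10000 epochs.
print(projected_days(5, 500, 10_000))  # 100.0
```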
from recosa.
Yep. So I was wondering whether I am running the model correctly. Did you run the code provided by the author, and how long did your previous training run take (40,000 pairs, 200 epochs)?
from recosa.
Yes, it took about one day.
from recosa.
Okay, I will try a smaller dataset to see the result. Thank you so much~
from recosa.
But I think why ReCoSa needs so much time to converge is still a critical question.
@BBLoatheb, don't you think that training for 200 epochs on 100 pairs will lead to overfitting?
Did you check the loss curve on the dev dataset?
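One way to look at the dev loss over time is to pull the loss values out of the training log. A minimal sketch, assuming the log keeps the `train loss:` / `eval loss:` line format posted earlier in this thread; `extract_losses` and `LOSS_PATTERN` are illustrative names, not part of the ReCoSa code.

```python
import re
from typing import Dict, List

# Matches fragments such as "train loss:5.54259" or "eval loss:5.62934".
LOSS_PATTERN = re.compile(r"(train|eval) loss:\s*([0-9]*\.?[0-9]+)")


def extract_losses(log_text: str) -> Dict[str, List[float]]:
    """Collect train and eval losses in the order they appear in the log."""
    losses: Dict[str, List[float]] = {"train": [], "eval": []}
    for kind, value in LOSS_PATTERN.findall(log_text):
        losses[kind].append(float(value))
    return losses


# Loss lines copied from the log posted earlier in this thread.
LOG = """\
0%| | 0/1135 [00:00<?, ?b/s]train loss:5.54259
eval loss:5.62934
Bleu Score:2.17809
0%| | 0/1135 [00:00<?, ?b/s]train loss:5.40310
eval loss:5.62677
Bleu Score:2.22458
0%| | 0/1135 [00:00<?, ?b/s]train loss:5.75812
"""

losses = extract_losses(LOG)
print(losses["train"])
print(losses["eval"])
```

A training loss that keeps falling while the eval loss flattens or rises would point to overfitting; the short excerpt above is too small to tell either way.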
from recosa.