I have trained the ReCoSa model for 200 epochs and the result is as follows:

I have trained enough epochs but still get a terrible result

zhanghainan commented on September 27, 2024

I have trained enough epochs but still get a terrible result

from recosa.

Comments (11)

zhanghainan commented on September 27, 2024

No, you could train the model for longer.


BBLoatheb commented on September 27, 2024

Thank you for your response. I also find that the training loss does not decrease effectively.

0%| | 0/1135 [00:00<?, ?b/s] train loss:5.54259

Exception in thread Thread-102:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/admin/.local/lib/python3.6/site-packages/tqdm/_monitor.py", line 62, in run
    for instance in self.tqdm_cls._instances:
  File "/usr/lib/python3.6/_weakrefset.py", line 60, in __iter__
    for itemref in self.data:
RuntimeError: Set changed size during iteration

WARNING:tensorflow:Issue encountered when serializing global_step.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
'Tensor' object has no attribute 'to_proto'
eval loss:5.62934
Bleu Score:2.17809

0%| | 0/1135 [00:00<?, ?b/s] train loss:5.40310

0%| | 0/1135 [00:00<?, ?b/s] train loss:5.75812

The run also emits some warnings. Could they be the reason the model doesn't perform well?


zhanghainan commented on September 27, 2024

I haven't faced this problem. I think you could train for longer and see the results; 200 epochs is not enough. You could check the output during training to see whether the responses become longer and more varied.
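
One way to make the "longer and more varied" check concrete is to track the average response length and the fraction of unique responses at each checkpoint. The following is a minimal sketch, assuming the generated responses are available as a list of whitespace-tokenized strings; the helper `response_stats` is hypothetical and not part of the ReCoSa code.

```python
from typing import Dict, List


def response_stats(responses: List[str]) -> Dict[str, float]:
    """Summarize generated responses (hypothetical helper, not part of ReCoSa)."""
    if not responses:
        return {"avg_len": 0.0, "unique_ratio": 0.0}
    # Token count per response, splitting on whitespace.
    lengths = [len(r.split()) for r in responses]
    return {
        "avg_len": sum(lengths) / len(lengths),
        # 1.0 means every response is different; values near 0 mean collapse.
        "unique_ratio": len(set(responses)) / len(responses),
    }


# Illustrative data only: a collapsed model repeats one reply for every input.
collapsed = ["i do not know"] * 4
varied = ["yes it ships today", "please send the order number", "i do not know", "thanks"]
print(response_stats(collapsed))
print(response_stats(varied))
```

If both numbers stay flat across checkpoints, more epochs alone may not be changing the model's behavior.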


BBLoatheb commented on September 27, 2024

Thank you. I now have 40,000 pairs as training data. How many epochs do you think is enough?


zhanghainan commented on September 27, 2024

Maybe 10,000 to see the results. You could try a small dataset, such as 100 pairs, to check whether the model is running correctly.
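
The small-dataset check can be set up by drawing 100 aligned pairs from the full training data. This is only a sketch, assuming the data is held as two parallel lists of strings (contexts and responses); the helper `sample_pairs` and the placeholder data are hypothetical and do not reflect the repository's actual file layout.

```python
import random
from typing import List, Tuple


def sample_pairs(
    sources: List[str], targets: List[str], n: int = 100, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Pick n aligned (context, response) pairs; hypothetical helper."""
    if len(sources) != len(targets):
        raise ValueError("sources and targets must be aligned line by line")
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    idx = sorted(rng.sample(range(len(sources)), min(n, len(sources))))
    # Use the same indices on both sides so each pair stays matched.
    return [sources[i] for i in idx], [targets[i] for i in idx]


# Placeholder data standing in for the real training set.
src = [f"context {i}" for i in range(1000)]
tgt = [f"response {i}" for i in range(1000)]
small_src, small_tgt = sample_pairs(src, tgt, n=100)
print(len(small_src), len(small_tgt))
```

A correctly wired model should be able to fit such a subset closely; if it cannot, the problem is more likely in the pipeline than in the number of epochs.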


sonyawong commented on September 27, 2024


Hi, I have the same issue as BBLoatheb. I have trained for 500 epochs on the JD dataset, which contains 500,000 pairs, but the model always generates the same responses. Should I train for longer? It took about 5 days to run the 500 epochs in my experiment.


BBLoatheb commented on September 27, 2024

I tried a smaller dataset (100 pairs) for about 200 epochs, and this time the model performs well. Maybe just training for longer would help. But it took you about 5 days to run 500 epochs; at that speed, 10,000 epochs would take about 100 days, which sounds terrible.
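
The 100-day figure is a linear extrapolation from the reported 5 days for 500 epochs. A small sketch of that arithmetic, assuming the time per epoch stays constant (the function name `projected_days` is just for illustration):

```python
def projected_days(days_so_far: float, epochs_so_far: int, target_epochs: int) -> float:
    """Linear extrapolation, assuming a constant time per epoch."""
    return days_so_far / epochs_so_far * target_epochs


# Reported run: 5 days for 500 epochs, extrapolated to 10,000 epochs.
days = projected_days(5, 500, 10_000)
minutes_per_epoch = 5 * 24 * 60 / 500
print(f"{days:.0f} days total, {minutes_per_epoch:.1f} minutes per epoch")
```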


sonyawong commented on September 27, 2024


Yep. So I was wondering whether I am running the model correctly. Did you run the code provided by the author, and how long did your previous training take (40,000 pairs, 200 epochs)?


BBLoatheb commented on September 27, 2024


Yes, it took about one day.


sonyawong commented on September 27, 2024


Okay, I will try a smaller dataset to see the result. Thank you so much~


gmftbyGMFTBY commented on September 27, 2024

But I think why ReCoSa needs so much time to converge is still a critical open question.
@BBLoatheb, don't you think that training for 200 epochs on 100 pairs will lead to overfitting?
Did you check the loss curve on the dev dataset?
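
Checking the dev loss curve can be done directly from the training log. Below is a rough sketch, assuming the log contains lines of the form `train loss:<value>` and `eval loss:<value>` as in the excerpt posted earlier in this thread; the helpers `parse_losses` and `eval_loss_stalled` and the `patience` heuristic are illustrative assumptions, not part of the ReCoSa code.

```python
import re
from typing import Dict, List

# Matches "train loss:5.54259" and "eval loss:5.62934".
LOSS_RE = re.compile(r"(train|eval) loss:\s*([0-9]+(?:\.[0-9]+)?)")


def parse_losses(log_text: str) -> Dict[str, List[float]]:
    """Collect train and eval losses in the order they appear in the log."""
    losses: Dict[str, List[float]] = {"train": [], "eval": []}
    for kind, value in LOSS_RE.findall(log_text):
        losses[kind].append(float(value))
    return losses


def eval_loss_stalled(eval_losses: List[float], patience: int = 3) -> bool:
    """True if the last `patience` evaluations did not beat the earlier best."""
    if len(eval_losses) <= patience:
        return False
    best_before = min(eval_losses[:-patience])
    return min(eval_losses[-patience:]) >= best_before


# Log lines taken from the excerpt earlier in this thread.
log = "train loss:5.54259\neval loss:5.62934\ntrain loss:5.40310\ntrain loss:5.75812"
print(parse_losses(log))

# Illustrative eval curve (not real results): improves, then rises again.
print(eval_loss_stalled([5.6, 5.2, 5.0, 5.1, 5.3, 5.4]))
```

A training loss that keeps falling while the eval loss stalls or rises would point to overfitting rather than to too few epochs.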
