I haven't run this code for a while. It is quite sensitive to seeds. Please try a few others. If it doesn't work, I will help.
from algorithm-learning.
Sorry for the slow reply; I was away from my computer most of the day. I've trained models with random seeds 1-7 for 10 million characters, and none of the models have been able to progress past the initial difficulty (complexity 6, plateaus at ~70% per-character accuracy). I'm still training them, so I'll update if anything changes before 30 million, but that seems fairly unlikely.
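To make the metric concrete: per-character accuracy here means the fraction of output positions where the predicted character matches the target. A minimal sketch (hypothetical helper, not code from this repo):

```python
def per_char_accuracy(pred: str, target: str) -> float:
    """Fraction of positions where the predicted character matches the
    target character; the two strings are assumed to be equal length."""
    assert len(pred) == len(target)
    return sum(p == t for p, t in zip(pred, target)) / len(target)

# 7 of 10 characters correct gives 0.7, the plateau level reported above
acc = per_char_accuracy("1234567890", "1234567abc")
```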
The current configuration it's printing out looks like this, in case anything looks unusual:
{
seed : 8
batch_size : 20
q_type : "q_watkins"
vocab_size : 24
max_seq_length : 50
dim : 2
max_grad_norm : 5
input_size : 61
q_discount : -1
unit : "gru"
ntasks : 6
rnn_size : 200
test_len : 200
train : 2
seq_length : 11
q_lr : 0.1
layers : 1
task : "addition"
lr : 0.1
}
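One entry worth noting is max_grad_norm : 5, which reads like standard global-norm gradient clipping. A minimal sketch of what such a setting does (illustrative only, not the repo's implementation):

```python
import math

def clip_by_global_norm(grads, max_norm=5.0):
    """Rescale a flat list of gradient values so their global L2 norm
    is at most max_norm, mirroring a max_grad_norm-style setting."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > max_norm:
        grads = [g * (max_norm / norm) for g in grads]
    return grads

# a gradient of norm 10 is scaled down to norm 5;
# gradients already below the cap pass through unchanged
clipped = clip_by_global_norm([6.0, 8.0])
```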
Let me know if there's anything I can do to help look into this.
It looks reasonable. It's supposed to work out of the box. I haven't run this code for a while.
Could you try a simpler task first, please?
Sure thing - I'm starting a couple of processes training on walk and reverse now.
I'm getting an error on the copy task, which I'll look into: /algorithm-learning/utils/Game.lua:211: attempt to index field 'dir' (a nil value)
Reverse works only with feedforward networks, not with recurrent ones.
I will help you reproduce the results. It might be necessary to pull an older version of Torch; maybe the default initialization options changed.
I changed main.lua
from

    if params.task == "reverse" or
       params.task == "ident" then
      params.dim = 1
    else

to

    if params.task == "reverse" or
       params.task == "copy" then
      params.dim = 1
    else

which fixes the nil value error.
I'm training two models each for copy, reverse, and walk now.
Got it, using the feedforward parameter for the reverse task.
Thanks a bunch for your help - I really appreciate it. Different default initializations would make sense to me.
Edit: The models have been unable to make progress on the simpler tasks as well after 10m characters. I'll keep them running overnight, but I'd be surprised if they improve.
@wojzaremba Is there anything I can do to help with this? If you don't have time until after the ICLR deadline, I totally understand. Also, if there are experiments it would be helpful for me to write or run, I can do those as well.
Are you able to solve any task? Does it work for copy, reverse, or multiplication? I was allowing 200 minutes of wall time, but the model usually solved tasks much faster.
Initialization might be the issue :( If things don't work, I will help you reproduce the results.
I wasn't able to get the model to train successfully (I used fixed seeds 1-7 on a fairly recent install of Torch and ran the models overnight). It's likely some issue with initialization: they do seem to learn *something* - certainly better than random - but they never get past 80%, even on the length-6 tasks.
I was experimenting with these types of models last week, but I've started working on other ideas now, so I don't plan to run any more experiments. I don't want to take up your time, so I can close this issue, and if I do come back to it, I'll let you know. Does that sound like a reasonable plan to you?
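For context, the "stuck at the initial difficulty" symptom is what a threshold-based curriculum produces: task complexity only increases once accuracy clears a high bar, so a run plateauing at 70-80% never advances. A toy illustration (hypothetical names and threshold, not the repo's actual logic):

```python
def advance(complexity: int, accuracy: float, threshold: float = 0.99) -> int:
    """Hypothetical curriculum rule: raise the task complexity only
    once per-character accuracy clears the threshold."""
    return complexity + 1 if accuracy >= threshold else complexity

# a run plateauing at 70-80% accuracy never leaves the initial level
level = 6
for acc in [0.70, 0.78, 0.80]:
    level = advance(level, acc)
# a healthy run that clears the bar would move from 6 to 7
```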
Yeah. I feel a little bit bad - you should be able to reproduce the results.
At least copy should work.
Hmm, I didn't try very hard to make things work (the only thing I've done is change seed values). I believe it would work with a couple of tweaks, but I don't really know what I'd try (maybe installing older versions of the nn package?), and I'm not dependent on this right now, so it's fine for me.
Basically, I'm not planning to investigate this, but if there's anything you want me to try, kicking off experiments and leaving them overnight is easy for me, and I'm happy to do it.