Comments (12)
Fantastic -- thank you. I'm running the cifar10_micro_search.sh
now, and will post here to confirm once I get some results.
~ Ben
from enas.
# Exp. 1
./scripts/ptb_search.sh # should give you a bunch of architectures
./scripts/ptb_final.sh # should give you around 55.8 test perplexity on PTB
# Exp. 2
./scripts/cifar10_macro_search.sh # should give you a bunch of architectures
./scripts/cifar10_macro_final.sh # should give you around 96.1% accuracy on the test set
# Exp. 3
./scripts/cifar10_micro_search.sh # should give you a bunch of architectures
./scripts/cifar10_micro_final.sh # should give you around 96.5% accuracy on the test set
from enas.
OK -- the tail of cifar10_micro_search.sh looks like:
Eval at 42018
valid_accuracy: 0.6820
Eval at 42018
test_accuracy: 0.6636
epoch=149 ch_step=42050 loss=0.910298 lr=0.0005 |g|=2.4888 tr_acc=105/160 mins=717.36
epoch=149 ch_step=42100 loss=1.008317 lr=0.0005 |g|=3.0906 tr_acc=110/160 mins=717.86
epoch=149 ch_step=42150 loss=0.833895 lr=0.0005 |g|=2.0674 tr_acc=107/160 mins=718.36
epoch=149 ch_step=42200 loss=0.951047 lr=0.0005 |g|=2.4366 tr_acc=104/160 mins=718.85
epoch=149 ch_step=42250 loss=0.930920 lr=0.0005 |g|=2.1964 tr_acc=107/160 mins=719.35
epoch=150 ch_step=42300 loss=0.993480 lr=0.0005 |g|=2.3855 tr_acc=98/160 mins=719.85
Epoch 150: Training controller
ctrl_step=4470 loss=3.077 ent=53.16 lr=0.0035 |g|=0.0440 acc=0.7375 bl=0.68 mins=719.85
ctrl_step=4475 loss=1.252 ent=53.17 lr=0.0035 |g|=0.0088 acc=0.7000 bl=0.68 mins=720.05
ctrl_step=4480 loss=2.096 ent=53.14 lr=0.0035 |g|=0.0490 acc=0.7188 bl=0.68 mins=720.26
ctrl_step=4485 loss=3.848 ent=53.13 lr=0.0035 |g|=0.0474 acc=0.7625 bl=0.69 mins=720.46
ctrl_step=4490 loss=2.683 ent=53.13 lr=0.0035 |g|=0.1009 acc=0.7375 bl=0.69 mins=720.66
ctrl_step=4495 loss=-5.616 ent=53.09 lr=0.0035 |g|=0.1095 acc=0.5750 bl=0.69 mins=720.86
Here are 10 architectures
[0 1 1 0 1 1 0 3 1 2 0 0 0 1 1 4 1 0 3 1]
[0 1 0 1 1 0 1 1 0 2 2 3 2 1 3 4 4 0 5 1]
val_acc=0.6375
--------------------------------------------------------------------------------
[0 1 1 1 0 2 0 1 1 0 1 3 0 1 1 4 0 1 0 4]
[0 2 0 4 0 0 0 3 0 2 3 4 0 0 2 1 5 1 2 1]
val_acc=0.6313
--------------------------------------------------------------------------------
[0 2 0 0 1 1 1 1 1 2 0 0 0 0 1 0 1 3 0 0]
[0 1 0 4 0 1 1 2 0 2 3 4 1 0 3 4 5 4 4 1]
val_acc=0.7000
--------------------------------------------------------------------------------
[1 2 1 2 0 1 1 1 0 0 1 1 1 0 1 0 1 1 0 4]
[1 1 0 2 2 0 2 4 3 1 3 1 4 4 4 0 2 3 0 0]
val_acc=0.6875
--------------------------------------------------------------------------------
[1 2 0 2 1 4 1 2 1 3 0 4 0 1 2 4 0 4 1 4]
[1 0 1 0 0 2 0 2 1 4 2 4 4 3 1 0 5 0 5 3]
val_acc=0.6500
--------------------------------------------------------------------------------
[1 3 1 0 1 4 0 1 2 1 1 4 1 4 1 1 0 1 1 1]
[1 1 0 0 2 1 1 4 0 0 1 4 2 1 0 0 5 0 0 2]
val_acc=0.7000
--------------------------------------------------------------------------------
[1 3 1 1 1 4 0 4 1 0 0 0 0 2 1 4 0 0 0 1]
[1 0 1 4 0 1 1 4 1 1 1 0 4 1 2 3 4 0 3 1]
val_acc=0.7375
--------------------------------------------------------------------------------
[0 4 0 1 1 1 0 0 0 0 0 1 0 0 0 4 0 2 1 0]
[1 4 1 0 2 1 2 4 3 4 0 0 3 3 2 4 5 2 2 1]
val_acc=0.6188
--------------------------------------------------------------------------------
[1 1 1 0 1 2 0 2 0 2 1 0 1 4 4 0 1 0 0 4]
[0 2 1 1 0 0 1 1 1 0 0 2 1 4 2 0 4 3 0 4]
val_acc=0.6500
--------------------------------------------------------------------------------
[0 2 0 0 1 3 1 2 1 0 0 0 0 0 0 1 0 0 1 0]
[1 3 0 1 0 4 2 0 1 0 2 2 2 3 0 0 5 2 2 0]
val_acc=0.6625
--------------------------------------------------------------------------------
Epoch 150: Eval
Eval at 42300
valid_accuracy: 0.6796
Eval at 42300
test_accuracy: 0.6660
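As an aside, the ctrl_step lines in the log above can be read as REINFORCE updates: acc is the sampled child's validation accuracy, bl is a moving-average baseline, and the loss is the negative advantage times the log-probability of the sampled architecture (which is why the loss goes negative when acc drops below bl, as in ctrl_step=4495). A minimal sketch of that update, with names and the decay value assumed rather than taken from the repo's code:

```python
def controller_update(log_prob, acc, bl, decay=0.99):
    """One REINFORCE step with an exponential-moving-average baseline.

    log_prob: log-probability of the sampled architecture (negative scalar).
    acc:      validation accuracy of the sampled child network.
    bl:       current baseline value.
    Returns the surrogate loss to minimize and the updated baseline.
    (A sketch; the actual repo may apply the baseline update in a
    different order or with a different decay.)
    """
    bl = decay * bl + (1.0 - decay) * acc  # move baseline toward observed acc
    loss = -(acc - bl) * log_prob          # policy-gradient surrogate loss
    return loss, bl
```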
Took 12 hours on a Titan X PASCAL, as advertised.
Now I think I'm supposed to take the architecture w/ the best validation, which is:
[1 3 1 1 1 4 0 4 1 0 0 0 0 2 1 4 0 0 0 1]
[1 0 1 4 0 1 1 4 1 1 1 0 4 1 2 3 4 0 3 1]
val_acc=0.7375
Is that right? Or is there somewhere where a large number of architectures are validated and an optimal one is chosen for me?
~ Ben
EDIT: Here's a plot of the valid and test accuracies over time:
These are the numbers logged like:
Eval at 42300
valid_accuracy: 0.6796
Eval at 42300
test_accuracy: 0.6660
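To automate picking the best candidate, a small parser for the "Here are 10 architectures" log format above could select the pair of rows with the highest val_acc. This is a hypothetical helper written against the exact log layout shown here, not code from the repo:

```python
import re

def parse_search_log(text):
    """Parse candidate architectures from a micro-search log.

    Assumes each candidate is printed as two bracketed rows (normal cell,
    then reduction cell) followed by a val_acc=... line, as in the log
    above. Returns a list of (normal_arc, reduce_arc, val_acc) tuples.
    """
    candidates = []
    rows = []
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"\[([\d ]+)\]$", line)
        if m:
            rows.append([int(x) for x in m.group(1).split()])
        elif line.startswith("val_acc=") and len(rows) >= 2:
            acc = float(line.split("=")[1])
            candidates.append((rows[-2], rows[-1], acc))
            rows = []
    return candidates

def best_architecture(candidates):
    """Return the candidate with the highest validation accuracy."""
    return max(candidates, key=lambda c: c[2])
```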
from enas.
Thank you for dedicating the time and resources to run and verify our code.
We also looked at the architectures sampled at earlier time steps and took the one with the overall best val_acc. However, I think the one you picked should work well too!
from enas.
Here's a plot of the test accuracy in cifar10_micro_final.sh:
This used architectures:
fixed_arc="1 3 1 1 1 4 0 4 1 0 0 0 0 2 1 4 0 0 0 1"
fixed_arc="$fixed_arc 1 0 1 4 0 1 1 4 1 1 1 0 4 1 2 3 4 0 3 1"
w/ all other parameters unchanged.
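As the two lines above show, the fixed_arc string is just the sampled normal-cell row followed by the reduction-cell row, space separated. A tiny hypothetical helper (not part of the repo) to build it from the search output:

```python
def to_fixed_arc(normal_arc, reduce_arc):
    """Flatten the two sampled rows into the fixed_arc string that
    cifar10_micro_final.sh expects: normal-cell row, then reduction-cell
    row, joined with spaces."""
    return " ".join(str(x) for x in list(normal_arc) + list(reduce_arc))
```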
Final test accuracy of 0.9612 at epoch 630 (w/ maximum test accuracy of 0.9620 at epoch 619). Also, got to 0.9611 accuracy at epoch 306 -- so the extra ~300 epochs don't give a whole lot of improvement.
Note this final model takes > 1 day to train -- longer than the initial architecture search.
For me, this prompts the question of how much of the difference between methods reported in the paper is due to the hyperparameters of the final retraining step vs. the discovered architecture. It'd be interesting to train a standard ResNet architecture w/ the same parameters as cifar10_micro_final.sh to see how it compares.
from enas.
@bkj Hello. Is everything going well with this work (the macro search space and the micro search space)?
Also, I want to know how to get the following architecture in cifar10_macro_final.sh:
fixed_arc="0"
fixed_arc="$fixed_arc 3 0"
fixed_arc="$fixed_arc 0 1 0"
fixed_arc="$fixed_arc 2 0 0 1"
fixed_arc="$fixed_arc 2 0 0 0 0"
fixed_arc="$fixed_arc 3 1 1 0 1 0"
fixed_arc="$fixed_arc 2 0 0 0 0 0 1"
fixed_arc="$fixed_arc 2 0 1 1 0 1 1 1"
fixed_arc="$fixed_arc 1 0 1 1 1 0 1 0 1"
fixed_arc="$fixed_arc 0 0 0 0 0 0 0 0 0 0"
fixed_arc="$fixed_arc 2 0 0 0 0 0 1 0 0 0 0"
fixed_arc="$fixed_arc 0 1 0 0 1 1 0 0 0 0 1 1"
fixed_arc="$fixed_arc 2 0 1 0 0 0 0 0 1 0 1 1 0"
fixed_arc="$fixed_arc 1 0 0 1 0 0 0 1 1 1 0 1 0 1"
fixed_arc="$fixed_arc 0 1 1 0 1 0 1 0 0 0 0 0 1 0 0"
fixed_arc="$fixed_arc 2 0 0 1 0 0 0 0 0 0 0 1 0 1 0 1"
fixed_arc="$fixed_arc 2 0 1 0 0 0 1 0 0 1 1 1 1 0 0 1 0"
fixed_arc="$fixed_arc 2 0 0 0 0 1 0 1 0 1 0 0 1 0 1 0 0 1"
fixed_arc="$fixed_arc 3 0 1 1 0 1 0 0 0 0 0 1 0 1 0 1 0 0 0"
fixed_arc="$fixed_arc 3 0 1 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1"
fixed_arc="$fixed_arc 0 1 0 0 1 0 1 1 0 0 0 1 0 0 0 0 0 1 1 0 0"
fixed_arc="$fixed_arc 3 0 1 0 1 1 0 0 1 0 1 1 0 1 1 0 1 0 0 1 0 0"
fixed_arc="$fixed_arc 0 1 0 1 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0"
fixed_arc="$fixed_arc 0 1 1 0 0 0 1 1 1 0 1 0 0 0 1 0 1 0 0 1 1 0 0 0"
which has 24 cells, while I can only get architectures like:
[1]
[1 1]
[5 0 0]
[5 0 0 0]
[0 0 1 1 0]
[1 1 0 0 0 0]
[1 1 0 1 1 1 0]
[3 0 0 1 0 1 1 1]
[5 0 0 1 0 0 1 0 0]
[1 1 1 0 0 0 0 1 0 0]
[0 1 1 0 0 0 0 1 1 1 1]
[0 0 1 1 1 1 0 1 0 0 1 1]
which only has 12 cells.
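If it helps: the fixed_arc string in cifar10_macro_final.sh is just those per-layer rows concatenated in order, one row per layer (an op id followed by skip-connection bits). The 24-row example presumably comes from a run with a deeper child network than the search default (check the child num-layers flag in your scripts; the exact flag name is an assumption here). A sketch of the flattening:

```python
def macro_rows_to_fixed_arc(rows):
    """Concatenate the per-layer rows printed by the macro search into
    the single space-separated fixed_arc string used by
    cifar10_macro_final.sh. (Hypothetical helper, not repo code.)"""
    return " ".join(" ".join(str(x) for x in row) for row in rows)
```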
from enas.
@bkj Hello, did you change any parameters in the scripts to get this result? I could only get about 0.88 accuracy on CIFAR-10 using the micro structure.
from enas.
@bkj Hi, I saw you are also interested in the ENAS-pytorch work at https://github.com/carpedm20/ENAS-pytorch. When I run the ENAS-pytorch code with: python main.py --network_type cnn --dataset cifar --controller_optim momentum --controller_lr_cosine=True --controller_lr_max 0.05 --controller_lr_min 0.0001 --entropy_coeff 0.1, I get many errors. What I want to do is find some CNN architectures and visualize them. Could you please tell me what changes I should make to the code before running it? Thanks for your reply.
from enas.
Hi @hyhieu, I have run the latest default ./scripts/ptb_final.sh for 600+ epochs, but the perplexity remains above 69. May I know how many epochs are expected to reproduce the claimed 55.8 test perplexity? Or do I need to change the ./scripts/ptb_final.sh configuration?
from enas.
So after you get the architecture, did you retrain it from scratch to get the 96.5% accuracy?
from enas.
@axiniu I met the same problem. Has it been solved now? Thanks a lot.
from enas.
@upwindflys @axiniu
I also get the same 12-cell output, but my macro search reaches a fairly high accuracy, e.g.:
[2]
[3 0]
[5 1 0]
[5 0 0 1]
[2 0 0 0 1]
[1 0 0 0 0 0]
[5 1 0 1 0 0 0]
[1 0 0 0 1 0 0 0]
[1 0 0 0 0 1 0 0 0]
[5 0 1 0 0 0 1 1 0 1]
[4 0 0 0 1 0 1 0 1 0 0]
[1 0 0 0 0 0 0 1 1 0 1 1]
val_acc=0.8125
--------------------------------------------------------------------------------
Epoch 310: Eval
Eval at 109120
valid_accuracy: 0.8154
Eval at 109120
test_accuracy: 0.8080
But do you know what causes the difference between the 12-cell and 24-cell architectures?
from enas.