d-x-y / nats-bench

TPAMI 2021: NATS-Bench: Benchmarking NAS Algorithms for Architecture Topology and Size

License: MIT License

Python 67.62% Jupyter Notebook 32.38%


nats-bench's People

Contributors

ain-soph, d-x-y


nats-bench's Issues

How to generate a computational graph from an architecture from the size search space?

Describe the Question

  • Is it about the topology search space in NATS-Bench? No
  • Is it about the size search space in NATS-Bench? Yes
  • Which figure or table are you referring to in the paper? N/A

I would like to generate a computational graph from a sample from the size search space. Do you provide any functionality for that? I presume that you must have used this functionality when creating this dataset, so if it is not available within the NATS-Bench codebase, can you give some pointers as to how I can do this?

Thanks!
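Not an official answer, but a possible starting point: the instantiation pattern used in other issues on this page (api.get_net_config plus get_cell_based_tiny_net from xautodl) turns a size-search-space candidate into a torch.nn.Module, and tracing that module is one way to get an explicit computational graph. The torch.fx step below is my own addition and may fail if the forward pass is not symbolically traceable:

import torch
from nats_bench import create
from xautodl.models import get_cell_based_tiny_net

# Instantiate a size-search-space (sss) candidate as a torch.nn.Module.
api = create(None, 'sss', fast_mode=True, verbose=False)
config = api.get_net_config(12, 'cifar10')
network = get_cell_based_tiny_net(config)

# One concrete form of the computational graph: an FX trace of the module.
graph_module = torch.fx.symbolic_trace(network)
print(graph_module.graph)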

Baidu-Pan links became invalid

Describe the Question
Great thanks for your nice work!

  • Is it about the topology search space in NATS-Bench? Yes
  • Is it about the size search space in NATS-Bench? No
  • Which figure or table are you referring to in the paper?

In particular, the download link of NATS-tss-v1_0-3ffb9-full is invalid. Could you please update this link?

Best regards.

get_cost_info(hp="200") returns weird value for some models in topology search space

Describe the bug
For some architectures, get_cost_info() returns different flops/params for hp="12" and hp="200" in the topology search space.
For example, the number of parameters (on CIFAR-10) of the architecture |skip_connect~0|+|none~0|nor_conv_3x3~1|+|avg_pool_3x3~0|nor_conv_3x3~1|nor_conv_3x3~2| is 0.802426 for hp="12", but 0.6403993333333333 for hp="200".

I've found many other architectures in which the same behavior occurs, but some specific examples are as follows:

  • |nor_conv_1x1~0|+|nor_conv_3x3~0|skip_connect~1|+|none~0|skip_connect~1|none~2|
  • |nor_conv_3x3~0|+|skip_connect~0|skip_connect~1|+|none~0|skip_connect~1|none~2|

To Reproduce

from nats_bench import create

arch = "|skip_connect~0|+|none~0|nor_conv_3x3~1|+|avg_pool_3x3~0|nor_conv_3x3~1|nor_conv_3x3~2|"
api_path = "NATS-tss-v1_0-3ffb9-simple"
search_space = "tss"
dataset = "cifar10"

api = create(api_path, search_space, fast_mode=True, verbose=False)
idx = api.query_index_by_arch(arch)
info_12 = api.get_cost_info(idx, dataset, hp="12")
info_200 = api.get_cost_info(idx, dataset, hp="200")
print(info_12)
print(info_200)
Output:

{'flops': 113.95137, 'params': 0.802426, 'latency': 0.016719988563604522, 'T-train@epoch': 21.51544686158498, 'T-train@total': 258.1853623390198, 'T-ori-test@epoch': 1.496117415882292, 'T-ori-test@total': 17.953408990587505}
{'flops': 90.35841, 'params': 0.6403993333333333, 'latency': 0.016719988563604522, 'T-train@epoch': 21.515446861584987, 'T-train@total': 4303.0893723169975, 'T-ori-test@epoch': 1.4961174158822923, 'T-ori-test@total': 299.22348317645844}
  • OS: Ubuntu 20.04.3 LTS (Focal Fossa)
  • Python version: 3.8.12
  • PyTorch version: 1.8.1+cu102

Expected behavior
I think flops/params for hp="12" and hp="200" should be the same.
(From reading the paper and your implementation in AutoDL-Projects, I could not find any factors that would make a difference.)

There are some problems with the accuracy of the model

Which Algorithm
Size Search Space
Cifar 10

Describe the Question
I obtained the accuracy of all 32768 architecture candidates using the code below, but all of them are under 90%.
The results of all methods in your paper are above 90%. How can I reproduce the results of Table 4 in your paper?

from nats_bench import create

model_cifar10_rank = {}
api = create(None, 'sss')
for index in range(32768):
    info = api.get_more_info(index, 'cifar10')
    config = api.get_net_config(index, 'cifar10')
    model_cifar10_rank[config['channels']] = info['test-accuracy']

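A possible explanation (my assumption, not a confirmed answer): get_more_info defaults to the 12-epoch hyperparameter setting, while the paper's Table 4 numbers for the size search space correspond to full training. If so, passing hp='90' (the sss full-training setting mentioned in other issues on this page) should give comparable numbers:

from nats_bench import create

api = create(None, 'sss')
# hp='90' selects the full-training results for sss (assumption; see above).
# is_random=False should return results averaged over the available seeds.
info = api.get_more_info(0, 'cifar10', hp='90', is_random=False)
print(info['test-accuracy'])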

The relation between configuration and architecture string

Is the relation between the architecture string and the edges as follows?

arch = '|{}~0|+|{}~0|{}~1|+|{}~0|{}~1|{}~2|'.format(
        config.edge_0_1,
        config.edge_0_2,
        config.edge_1_2,
        config.edge_0_3,
        config.edge_1_3,
        config.edge_2_3
)

Note that edge_i_j denotes the edge from the i-th vertex to the j-th vertex.

The details of results in the paper

Hi, I want to use NATS-Bench, but I still have some questions about the results of the benchmark algorithms given in the paper.
(1) hp=200 in tss and hp=90 in sss, right?
(2) Are the results top-1 accuracy, top-5, or something else?
(3) Sorry to trouble you, but could you write a usage example showing how to get the correct metrics when I search for a new arch?
Thanks!
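A sketch of what such a usage example might look like, stitched together from the API calls that appear in other issues on this page (query_index_by_arch, get_more_info); the hp values follow point (1) above and are not independently verified here:

from nats_bench import create

api = create(None, 'tss', fast_mode=True, verbose=False)

# Map the searched architecture string to its benchmark index.
arch = '|nor_conv_3x3~0|+|nor_conv_3x3~0|avg_pool_3x3~1|+|skip_connect~0|nor_conv_3x3~1|skip_connect~2|'
index = api.query_index_by_arch(arch)

# Query full-training metrics (hp='200' for tss, hp='90' for sss),
# averaged over seeds via is_random=False.
info = api.get_more_info(index, 'cifar10', hp='200', is_random=False)
print(info['test-accuracy'])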

What is the seed of subnet training on cifar10, 111 or 777?

Describe the Question

  • Is it about the topology search space in NATS-Bench? Yes

I use the following code to query the accuracy of the specified subnet. I then retrained the subnet with seeds 111 and 777, but couldn't get the same accuracy as the query. Could you tell me the seeds used for training?

from nats_bench import create
from xautodl.models import CellStructure

model_str = '|nor_conv_1x1~0|+|nor_conv_3x3~0|nor_conv_1x1~1|+|skip_connect~0|none~1|nor_conv_1x1~2|'
arch = CellStructure.str2structure(model_str)
api = create(None, 'tss', fast_mode=True, verbose=False)
api.reset_time()
accuracy, _, _, total_cost = api.simulate_train_eval(
    arch, 'cifar10', hp="12"
)
print(f"accuracy {accuracy},  total_cost  {total_cost}")

About REINFORCE

Hello!
I am a bit confused by the policy implementation in AutoDL-Projects-main/exps/NATS-algos/reinforce.py. My understanding is that in RL-based NAS algorithms the agent is usually an RNN, but the network architecture of PolicyTopology here is not an RNN (I don't quite understand what this architecture represents; I only know the parameter shape is [6 edges, 5 candidate operations]), and its forward takes no input. So what is the mechanism behind this implementation of REINFORCE?
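For context, a minimal sketch (my reconstruction, not the authors' exact code) of what a stateless REINFORCE policy of shape [6 edges, 5 candidate operations] could look like: the learnable logits themselves are the policy, one architecture is sampled per edge from independent categorical distributions, so forward() needs no input:

import torch
import torch.nn as nn

class TopologyPolicySketch(nn.Module):
    def __init__(self, num_edges=6, num_ops=5):
        super().__init__()
        # The policy is just a learnable logit matrix, not an RNN.
        self.arch_parameters = nn.Parameter(1e-3 * torch.randn(num_edges, num_ops))

    def forward(self):
        # Per-edge probabilities over the candidate operations.
        return nn.functional.softmax(self.arch_parameters, dim=-1)

policy = TopologyPolicySketch()
probs = policy()                        # shape [6, 5]
dist = torch.distributions.Categorical(probs)
sample = dist.sample()                  # one operation index per edge
log_prob = dist.log_prob(sample).sum()  # enters the REINFORCE loss as
                                        # -(reward - baseline) * log_prob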

FLOPS data in nasbench201 are different from flops count tools like thop and ptflops

FLOPS counts

  • Is it about the topology search space in NATS-Bench? Yes

Hi, when I use FLOPs counting tools to compute the FLOPs of models in nasbench201, the results are always larger than the FLOPs provided by the 201 dataset. For example, for arch index 0, the dataset reports 15.64737 (cifar10-valid), 15.65322 (cifar100), and 3.91948 (ImageNet16), whereas the thop tool gives 16.464576, 16.470336, and 4.123712, and the ptflops tool gives 16.810634, 16.816484, and 4.210296. However, both the 201 dataset and these tools report the same parameter counts.

Are there any operations that the benchmark does not count?

Many thanks for your excellent work and code!

Question regarding the design space for the topology search space in NATS-Bench

Hello Dr. Xuanyi Dong,
Thanks for providing this clear and well-documented NAS tool.
I have a question regarding the design space for the topology search space in NATS-Bench.

Specifically, I am referring to the "6.5k" unique architectures for the topology search space in Table 1.
Could you elaborate on how you concluded there are "6.5k" unique architectures?
According to NAS-Bench-201, shouldn't it be 5^6 = 15,625 instead of 6.5k?


Thanks for your help in advance.
Hung-Yang

Is NATS an extension of the NAS_201 bench?

I was working with NAS_201 bench earlier and now am shifting to NATS bench. I have the following doubts:

  • Will the architecture index obtained with NAS_201 and NATS bench be the same for a given arch? When using both, NAS_201 gives index 12804 while NATS bench returns -1. Also, using the NAS_201 index in NATS yields a different arch config.
  • When obtaining weights, I get an empty return when using the NAS_201 index.

The following is my implementation. I am using the benchmark file with sss.

api = create(d, 'sss', fast_mode=False, verbose=True)
index = api.query_index_by_arch(convert_naslib_to_str(best_arch))
config = api.get_net_config(index, 'cifar10')
best_arch = get_cell_based_tiny_net(config)
logger.info("Queried results ({}): {}".format(metric, best_arch))
params = api.get_net_param(index, 'cifar10', None)
best_arch.load_state_dict(next(iter(params.values())))

Regarding the checkpoints

Hello.
I would like to know whether the checkpoint splits are independent of each other. I mean, if I download one of those 30 GB splits, does it contain all the information for a specific set of architectures, or do I need to download the other checkpoints and jointly unzip them in order to use them?

Also, where can I find the transform classes for inference, if I want to evaluate a retrieved checkpoint and reproduce the test accuracy given in the model information?
Thank you!

Best accuracy found differs from the paper

Describe the bug
We find that the best accuracy in the tss search space differs from the one given in the paper. Also, for the cifar10 dataset, I cannot find the validation accuracy. Is it left out?
To Reproduce
Please provide a small script to reproduce the behavior:

from nats_bench import create

api = create(r'D:\Download\coreg-master\NATS-tss-v1_0-3ffb9-simple', 'tss', fast_mode=True, verbose=True)
valid_acclist = []
test_acclist = []
netlist = []
for i in range(0, 15625):
    info = api.get_more_info(i, 'cifar10', hp='200')
    config = api.get_net_config(i, 'cifar10')  # or 'ImageNet16-120'
    # I cannot find 'valid-accuracy' for cifar10, but it exists for cifar100 and ImageNet.
    valid_acclist.append(info['valid-accuracy'])
    # The best test acc I find on cifar10 is 94.56, not 94.37; the same happens on cifar100 and ImageNet.
    test_acclist.append(info['test-accuracy'])
    netlist.append(config)

max(valid_acclist)

Please let me know your OS, Python version, PyTorch version.
Windows 10, Python 3.7, PyTorch 1.9.0


'tss' node connection?

Describe the Question

  • Is it about the topology search space in NATS-Bench? Yes
  • Is it about the size search space in NATS-Bench? No
  • Which figure or table are you referring to in the paper? Not Found.

Hi, after creating the benchmark API, I query the relevant information as follows:

In [7]: tss.query_info_str_by_arch(1)
[2021-06-23 08:09:56] Call query_info_str_by_arch with arch=1and hp=12
[2021-06-23 08:09:56] Call query_index_by_arch with arch=1
[2021-06-23 08:09:56] Call clear_params with archive_root=../NATS-tss-v1_0-3ffb9-simple and index=1
Out[7]:
|nor_conv_3x3~0|+|nor_conv_3x3~0|avg_pool_3x3~1|+|skip_connect~0|nor_conv_3x3~1|skip_connect~2|
datasets : ['cifar10-valid', 'cifar10', 'cifar100', 'ImageNet16-120'], extra-info : arch-index=1
cifar10-valid  FLOP=113.95 M, Params=0.802 MB, latency=16.85 ms.
cifar10-valid  train : [loss = 0.382 & top1 = 86.97%], valid : [loss = 0.514 & top1 = 82.83%]
cifar10        FLOP=113.95 M, Params=0.802 MB, latency=16.85 ms.
cifar10        train : [loss = 0.243 & top1 = 91.69%], test  : [loss = 0.362 & top1 = 88.22%]
cifar100       FLOP=113.96 M, Params=0.808 MB, latency=15.36 ms.
cifar100       train : [loss = 1.271 & top1 = 63.76%], valid : [loss = 1.495 & top1 = 57.80%], test : [loss = 1.478 & top1 = 58.26%]
ImageNet16-120 FLOP= 28.50 M, Params=0.810 MB, latency=13.77 ms.
ImageNet16-120 train : [loss = 2.548 & top1 = 35.41%], valid : [loss = 2.580 & top1 = 35.43%], test : [loss = 2.611 & top1 = 33.80%]

What does the nor_ prefix mean? And which part of the string represents the connection between node 0 and node 1?
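My reading of the string format, based only on the examples in these issues (not official documentation): nodes are separated by '+', each |op~j| token is an edge from node j into the current node, and nor_ appears to abbreviate "normal" convolution. So the very first group is the node0 -> node1 connection. A small parsing sketch:

arch = '|nor_conv_3x3~0|+|nor_conv_3x3~0|avg_pool_3x3~1|+|skip_connect~0|nor_conv_3x3~1|skip_connect~2|'

# Nodes are separated by '+'; each '|op~j|' token is an edge from node j
# into the current node i.
for i, node in enumerate(arch.split('+'), start=1):
    for tok in filter(None, node.split('|')):
        op, src = tok.split('~')
        print(f'node{src} -> node{i}: {op}')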

Do arch indexes match across nasbench201 and natsbench?

Say I look up a model's information by index 235 in natsbench. Is the architecture corresponding to id 235 the same in the earlier nasbench201 as well?

I am comparing to some other code which is using nasbench201 api and I wanted to make sure that things were comparable. Thanks!!

About getting the size in Bytes of the possible candidates

I'm having some issues using the candidates of the Size Search Space when it comes to measuring the size in bytes of a specific candidate.
I know how to do it using the TensorFlow Lite kit, but I have no idea how to get the model into a torch.nn object and measure it.
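A sketch under the assumption that the instantiation pattern from other issues on this page (api.get_net_config plus get_cell_based_tiny_net) applies; the byte-counting helper below is my own illustration, not part of nats_bench:

from nats_bench import create
from xautodl.models import get_cell_based_tiny_net

api = create(None, 'sss', fast_mode=True, verbose=False)
config = api.get_net_config(12, 'cifar10')
network = get_cell_based_tiny_net(config)

def model_size_bytes(model):
    # Storage of parameters plus buffers (e.g. BatchNorm running stats).
    total = sum(p.numel() * p.element_size() for p in model.parameters())
    total += sum(b.numel() * b.element_size() for b in model.buffers())
    return total

print(f'{model_size_bytes(network) / 1e6:.2f} MB')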

Google Drive Full Archive Files Corrupted

Some of the full archive tar files (from the Google Drive) are corrupted. After downloading and attempting to extract them, software (7zip, WinRAR, WinZip) fails to open them, stating "can't open archive" or that the header file is corrupted or missing. This affects only some of the files: we've confirmed that a, c, e, k, and n work, and that b, l, and m don't.

How to use cifar10 in NASBench201

Describe the Question

  • topology search space in NATS-Bench

Hi, I have some questions about how to use the validation and test accuracy when searching on the cifar10 subset.

First, when running an architecture search algorithm, we use the validation accuracy as the supervision signal during the search procedure, and to evaluate the algorithm, the searched architecture's test accuracy is queried as the result. This works for the cifar100 and ImageNet16 subsets, since they offer both validation and test accuracy (I really don't know what valtest_acc is for).

For cifar10, there are two subsets: cifar10-valid and cifar10.
The former offers validation and test accuracy, so the above search-and-evaluate pipeline still works, but the latter only offers test accuracy, so when using it I cannot build the supervision signal, since there is no validation accuracy.

The above information is returned by api.get_more_info.

Many thanks

No testset in ImageNet16

There is no specified test set in the Google Drive shares.
I am not sure which part of the data should be used when testing the models' accuracy.

Units/explanation of info data structure

info

Hi, First of all thanks for this tremendous service to the NAS research community. I have a few clarification questions:
What do the following fields in the info = api.get_more_info(arch_index, 'cifar10') data structure mean:
train-per-time, train-all-time, test-per-time, test-all-time?

cost_info

Also, in cost_info = api.get_cost_info(arch_index, 'cifar10'), what do the following fields mean:
T-train@epoch, T-train@total, T-ori-test@epoch, T-ori-test@total?

If these could be documented more explicitly, it would be very useful. Thanks again!
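For what it's worth, the hp="12" and hp="200" outputs quoted in the get_cost_info bug report earlier on this page satisfy T-train@total = 12 x T-train@epoch and 200 x T-train@epoch respectively, which supports reading these fields as seconds per epoch and total seconds. A sketch of that interpretation (my reading, not official documentation):

from nats_bench import create

api = create(None, 'tss', fast_mode=True, verbose=False)

info = api.get_more_info(0, 'cifar10')
print(info['train-per-time'])  # my reading: training time per epoch, in seconds
print(info['train-all-time'])  # my reading: total training time over all epochs

cost = api.get_cost_info(0, 'cifar10')
print(cost['T-train@epoch'])   # my reading: per-epoch training time, in seconds
print(cost['T-train@total'])   # my reading: per-epoch time x number of epochs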

get_net_param returns empty dictionary

Describe the bug
When retrieving network parameters with get_net_param, I obtain a dict whose value is None, which I clearly cannot load as the state_dict of a PyTorch model.

To Reproduce
Please provide a small script to reproduce the behavior:

from nats_bench import create
nats_api = create("NATS-tss-v1_0-3ffb9-simple", search_space="topology", fast_mode=True, verbose=False)
# sample random idx to simulate behavior
random_idx = nats_api.random()
# get corresponding architecture on CIFAR10
random_architecture = nats_api.get_net_config(index=random_idx, dataset="cifar10")
# retrieve network parameters according to main README.md
random_params = nats_api.get_net_param(index=random_idx, dataset="cifar10", seed=None)
print(random_params)
# prints {111: None}
print(next(iter(random_params.values())))
# prints None

OS: macOS Ventura 13.0.1
Python version: 3.10.8
Pytorch version: 1.13.0
Expected behavior
random_params should be a meaningful dictionary that I could use as params dict for a torch model
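A guess at the cause (an assumption, not a confirmed answer): the '-simple' benchmark file holds metrics only, and {111: None} suggests the trained checkpoint was not found locally; other issues on this page mention a separate full archive (e.g. NATS-tss-v1_0-3ffb9-full) for the weights. A sketch of the retry under that assumption:

from nats_bench import create

# Hypothetical path: assumes the full archive (not the '-simple' file)
# has been downloaded and extracted.
nats_api = create("NATS-tss-v1_0-3ffb9-full", search_space="topology", fast_mode=True, verbose=False)
params = nats_api.get_net_param(index=0, dataset="cifar10", seed=777)
print(type(next(iter(params.values()))))  # hoped-for: a state_dict, not None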

The results of test accuracy

Describe the Question

  • topology search space in NATS-Bench

Hi, I am confused about how to query an arch's test accuracy correctly. I did a test querying one arch in two different ways; here are the code and outputs:

from nats_bench import create

search_space = 'tss'

api = create(None, search_space, fast_mode=True, verbose=True)

print(50 * '-')
info = api.get_more_info(
            1,
            dataset = "cifar10-valid",
            hp="200" if search_space == "tss" else "90",
            # hp='12',
            is_random=False
        )
print(info)

print(50 * '-')

info = api.query_info_str_by_arch(
    1, "200" if search_space == "tss" else "90"
)
print(info)

[screenshot: the two queries print different test accuracies]

As the picture shows, the same arch gets different test accuracies. So what is the difference between these two ways, and which one did you use in your paper?

Loading the pre-trained weights

Hi, I have got the network instance by running:
api = create('/path/to/NATS-sss-v1_0-50262-simple', 'sss', fast_mode=True, verbose=True)
config = api.get_net_config(12, 'cifar10')
network = get_cell_based_tiny_net(config)

Everything works up to here, but when I try to load the pretrained weights into the network as below:

params = api.get_net_param(12, 'cifar10', None)
network.load_state_dict(next(iter(params.values())))
I get the error below:


AttributeError Traceback (most recent call last)
/tmp/ipykernel_23823/187443033.py in
----> 1 network.load_state_dict(next(iter(params.values())))

/anaconda/envs/py38_pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
1377 # copy state_dict so _load_from_state_dict can modify it
1378 metadata = getattr(state_dict, '_metadata', None)
-> 1379 state_dict = state_dict.copy()
1380 if metadata is not None:
1381 # mypy isn't aware that "_metadata" exists in state_dict

AttributeError: 'NoneType' object has no attribute 'copy'

Could you please help me with how I can get a pretrained network?

The unit for the number of parameters

Hi,

Thanks very much for your work.

Regarding the number of parameters for each network architecture, what is the unit (e.g. million)?

Thank you.


How to generate an architecture model with torch

In the README.md, there is only one example of generating an architecture (at index 12) with get_cell_based_tiny_net.

But in the codebase, I found there are many functions including:

  1. get_cell_based_tiny_net
  2. obtain_model

Which is the real model at index 12 (the one whose performance is recorded in get_more_info)?

If I want to obtain the model/architecture at index 12 whose performance is exactly the one measured and recorded in get_more_info, which method should I use?

Thank you

Question about ImageNet16-120

I am not sure whether there are val/test splits in the ImageNet16-120 dataset. I only found files for the train set and the val set in the documents you provided.
But in the README, the test set mentioned is actually the val set, according to your source code.

If the val set is used for testing, or is just wrongly named, did you use the whole training set for model training without splitting?

The results of validation/test accuracy in NATS-Bench paper

Hi! I want to get the validation and test accuracy as in Table 4 of the "NATS-Bench: Benchmarking NAS Algorithms for Architecture Topology and Size" paper. I just want to check whether the following commands are correct:

After finishing the architecture search (I'm studying Weight-Sharing approach), I get the genotype. Then, I get the arch_index via:
arch_index = api.query_index_by_arch('......genotype here......')
Therefore, for Cifar10 validation accuracy:
info = api.get_more_info(arch_index, 'cifar10-valid', hp=200)
for Cifar10 test accuracy:
info = api.get_more_info(arch_index, 'cifar10', hp=200)
for Cifar100 validation/test accuracy:
info = api.get_more_info(arch_index, 'cifar100', hp=200)
for ImageNet16-120 validation/test accuracy:
info = api.get_more_info(arch_index, 'ImageNet16-120', hp=200)

Following are some points I want to check:

  1. Do the Cifar10 test accuracy results use the train + valid set for training and the test set for testing? Thus, should I use 'cifar10' instead of 'cifar10-valid' to get the test accuracy?
  2. What does valtest-accuracy mean for Cifar100 and ImageNet16-120?
  3. I get an architecture whose validation accuracy is higher than the optimal values reported in the paper for ImageNet16-120. Why?

Great Thanks!

Question about zeroize/none operation

Describe the Question

  • Is it about the topology search space in NATS-Bench? Y
  • Is it about the size search space in NATS-Bench? N
  • Which figure or table are you referring to in the paper? N

Hi, I am trying to use [2] to search in NATS-Bench [1], but there are some differences in the model representations between NATS and [2]. So I have to do some conversion, but I am still confused about the NATS representation.

Let us check this arch:

In [12]: tss.query_info_str_by_arch(2211)
[2021-06-24 04:52:46] Call query_info_str_by_arch with arch=2211and hp=12
[2021-06-24 04:52:46] Call query_index_by_arch with arch=2211
[2021-06-24 04:52:46] Call clear_params with archive_root=../NATS-tss-v1_0-3ffb9-simple and index=2211
Out[12]:
|none~0|+|none~0|nor_conv_3x3~1|+|skip_connect~0|skip_connect~1|nor_conv_1x1~2|
datasets : ['cifar10-valid', 'cifar10', 'cifar100', 'ImageNet16-120'], extra-info : arch-index=2211
cifar10-valid  FLOP= 47.10 M, Params=0.344 MB, latency=15.46 ms.
cifar10-valid  train : [loss = 0.710 & top1 = 75.25%], valid : [loss = 0.758 & top1 = 73.74%]
cifar10        FLOP= 47.10 M, Params=0.344 MB, latency=15.46 ms.
cifar10        train : [loss = 0.577 & top1 = 80.37%], test  : [loss = 0.601 & top1 = 78.78%]
cifar100       FLOP= 47.11 M, Params=0.350 MB, latency=14.45 ms.
cifar100       train : [loss = 2.019 & top1 = 46.12%], valid : [loss = 2.052 & top1 = 46.16%], test : [loss = 2.058 & top1 = 44.88%]
ImageNet16-120 FLOP= 11.78 M, Params=0.351 MB, latency=14.66 ms.
ImageNet16-120 train : [loss = 3.214 & top1 = 22.78%], valid : [loss = 3.158 & top1 = 24.83%], test : [loss = 3.216 & top1 = 22.53%]

I assume none~ means zeroize in Fig. 1 of [1]. In this model, node0->node1 has no connection; does that mean the node1->node2 connection also fails, even though it has an operation (nor_conv_3x3 in this case), and likewise for node1->node3? If so, is the cell an identity cell without any connection?

Refer:
[1] NATS-Bench: https://arxiv.org/pdf/2009.00437.pdf
[2] Neural Predictor for Neural Architecture Search: https://arxiv.org/pdf/1912.00848.pdf
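My reading (an assumption based on NAS-Bench-201-style cells, where each node sums all of its incoming edges), written out for the architecture above:

arch = '|none~0|+|none~0|nor_conv_3x3~1|+|skip_connect~0|skip_connect~1|nor_conv_1x1~2|'

# node1 = none(node0)
# node2 = none(node0) + nor_conv_3x3(node1)
# node3 = skip_connect(node0) + skip_connect(node1) + nor_conv_1x1(node2)
#
# Under this reading, 'none' zeroizes a single edge rather than disabling
# the downstream node: node3 still receives the cell input directly via
# skip_connect~0, so the cell would not be an identity/empty cell.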

Empty evaluated_indexes

Describe the bug
When using find_best(), I got the error ValueError: invalid index : -1 vs. 15625.
After checking the code, the problem seems to be that self.evaluated_indexes is an empty set.
I've downloaded the benchmark file and archive and put them in $TORCH_HOME.
If I add one line of code, api.evaluated_indexes = set(range(amount)), to the script, the code works, but the script is killed after index=14587.

To Reproduce
Please provide a small script to reproduce the behavior:

from nats_bench import create

api = create(None, 'tss', fast_mode=True, verbose=True,)

best_acc = 0
best_arch = -1
amount = len(api)
# api.evaluated_indexes = set(range(amount))

index, acc = api.find_best(dataset='cifar10', metric_on_set='val')
# print(index)
# for i, arch in enumerate(api):
#     info = api.get_more_info(i, 'cifar10')
#     test_acc = info['test-accuracy']
#     if test_acc > best_acc:
#         best_acc = test_acc
#         best_arch = arch
#         print(best_arch)

Please let me know your OS, Python version, PyTorch version.
OS: Ubuntu 20.04.1 LTS
Python: Python 3.6.8 :: Anaconda, Inc.
PyTorch: 1.7.1+cu110

Expected behavior
I want to use find_best() to find the best cell structure


Size of a network instance is so small, after being trained

Hi, I sampled an architecture with network = get_cell_based_tiny_net(config), trained it, and then saved the whole architecture object with torch.save(network, fname).
When I check the size of the network, it is only 1.15 MB. Is it normal that I get such a lightweight network?
Best

ValueError: can not find random seed (999) from [777, 888]

As the title mentions, I got the error by the following code:

import os
import nats_bench

DATA_DIR = f"{os.environ['HOME']}/research/data/NATS-tss-v1_0-3ffb9-simple"
name = 'ImageNet16-120'
data = nats_bench.create(DATA_DIR, 'tss', fast_mode=True, verbose=False)

data.get_more_info(index=346, dataset=name, hp='200', iepoch=199, is_random=999)
# >>> No error

data.get_more_info(index=347, dataset=name, hp='200', iepoch=199, is_random=999)
# >>> ValueError

Is the trained data of seed 999 not available for some architectures?
Or am I doing something wrong?

Details about the data structures more_info and cost_info

Hello, first of all, thank you very much for your work in the field of NAS. Here are my questions about NATS-Bench, and I hope to get your confirmation:


  1. What are the specific units for results such as 'train-all-time' (for example, minutes)?
  2. What does training cost mean specifically? In particular, what is the difference between 'train-all-time' and 'T-train@total'?

Unable to use benchmark file

I downloaded the archive (tss) file, extracted it with the tar command, and then uploaded the generated folder to Google Drive to access it from Colab. For the code below:
from nats_bench import create
api = create('/content/drive/MyDrive/NATS-tss-v1_0-3ffb9-simple/', 'tss', fast_mode=True, verbose=True)

I get:
FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/MyDrive/NATS-tss-v1_0-3ffb9-simple//meta.pickle.pbz2.pbz2'
Why is that?

Training time per epoch reproduction

I am trying to reproduce the training pipeline for architectures in tss. For example, architecture index 284 shows a training time of 9.75 seconds per epoch, but in my code, where I keep the hyperparameters the same as in Table 2 of the paper, I get around 50 seconds per epoch on a 2080Ti GPU. There are probably minor differences in the training pipeline, but I am wondering whether I am missing a big detail somewhere.

Problem recreating api in fastmode

Hey,
I use the tss API with the simple dataset and it works fine the first time, but when I try to recreate the API from the path where the simple dataset is placed, I encounter an error telling me that the meta.pickle.bz was not found. The workaround is to unarchive the tar file every time.
Am I missing something?
Thanks in advance

get_model_from_arch_str

Describe the Question
Is there a function for getting a model instance from the cell string? Such a function would be very useful for NAS with training-based methods.
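I am not aware of a single-call helper, but a sketch composed from the calls used in other issues on this page (query_index_by_arch, get_net_config, get_cell_based_tiny_net) goes from cell string to model instance via the index:

from nats_bench import create
from xautodl.models import get_cell_based_tiny_net

api = create(None, 'tss', fast_mode=True, verbose=False)

arch_str = '|nor_conv_3x3~0|+|nor_conv_3x3~0|avg_pool_3x3~1|+|skip_connect~0|nor_conv_3x3~1|skip_connect~2|'
index = api.query_index_by_arch(arch_str)      # cell string -> benchmark index
config = api.get_net_config(index, 'cifar10')  # config for the chosen dataset
model = get_cell_based_tiny_net(config)        # torch.nn.Module instance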

Documentation for NATS Bench data format

I downloaded the archive with the evaluations of NATS-Bench and I would like to get more information about the format of the data, in particular the data linked on this page https://pypi.org/project/nats-bench/ at the 'archive' links.

I already have software for running my experiments, I would just like to convert the data format instead of integrating a new library, which may be more suitable for a different setting.

Of course, I will cite the paper as suggested.

change the candidate's input resolution

Hi, I would like to use your NATS-Bench with datasets other than cifar and ImageNet, at higher resolutions like 256*256. Is it possible to sample a network as you did for cifar below, and then change the cells' resolution?

import xautodl, nats_bench

from nats_bench import create
from xautodl.models import get_cell_based_tiny_net

api = create(None, 'tss', fast_mode=True, verbose=True)

config = api.get_net_config(12, 'cifar10')
network = get_cell_based_tiny_net(config)

#then a code to change the input resolution to the target size of 256*256

Thanks for your response
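A quick check, under my (unverified) assumptions that the sampled networks end in adaptive pooling and therefore accept other input resolutions, and that the forward pass returns a (features, logits) tuple; this reuses the network from the snippet above:

import torch

x = torch.randn(1, 3, 256, 256)  # target resolution instead of 32x32
out = network(x)
print(out[1].shape if isinstance(out, tuple) else out.shape)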

Different number of training epochs

Each architecture is trained for 12 (H^0), 200 (H^1), and 90 (H^2) epochs. If only these epoch numbers are used, what does it mean to report test accuracies at all epochs of training: 'ori-test@0', 'ori-test@1', 'ori-test@2', ..., 'ori-test@199'? Shouldn't there be just ori-test@12 and ori-test@199?
