There are a number of differences between `_fixed_layer` …

Difference between `_fixed_layer` and `_enas_layer` in `cifar10/micro_child.py`

melodyguan commented on May 20, 2024

from enas.

Comments (4)

hyhieu commented on May 20, 2024

Hi Ben,

Thanks for the questions. I'll try.

1. The point of `layer_base`, which is just a 1x1 convolution, is to standardize the number of output channels to `out_filters` before performing the main operation in a convolutional cell or a normal cell. In `_enas_layer`, we do this in `final_conv`. The effect is almost the same, but we found it easier to implement this way.
2. I don't understand this point of yours. Both `_fixed_layer` and `_enas_layer` use both convolutions and pooling. For `_fixed_layer`, I hope the code is quite straightforward. For `_enas_layer`, since we need to implement a somewhat dynamic graph, we separate the process into the function `_enas_cell`.
3. The purpose of `_factorized_reduction` is to reduce both spatial dimensions (width and height) by a factor of 2, and potentially to change the number of output filters. At the place you mention, this function is used to make sure that the outputs of all operations in a convolutional cell or a reduction cell have the same spatial dimensions, so that they can be concatenated along the depth dimension.
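
To make the shape bookkeeping in these answers concrete, here is a minimal sketch in plain Python that tracks only tensor shapes, not values. The helper names (`conv_1x1`, `factorized_reduction_shape`, `concat_depth`) and the example sizes are hypothetical and are not taken from `micro_child.py`; rounding odd sizes up is also an assumption.

```python
from typing import List, Tuple

# (height, width, channels); the batch dimension is left out for brevity.
Shape = Tuple[int, int, int]


def conv_1x1(shape: Shape, out_filters: int) -> Shape:
    """A 1x1 convolution keeps the spatial size and sets the channel count."""
    height, width, _ = shape
    return (height, width, out_filters)


def factorized_reduction_shape(shape: Shape, out_filters: int) -> Shape:
    """Halve height and width and set the channel count.

    Assumption: odd sizes round up, as with stride-2 "SAME" padding.
    """
    height, width, _ = shape
    return ((height + 1) // 2, (width + 1) // 2, out_filters)


def concat_depth(shapes: List[Shape]) -> Shape:
    """Concatenate along channels; all spatial sizes must agree."""
    height, width, _ = shapes[0]
    for s in shapes:
        if (s[0], s[1]) != (height, width):
            raise ValueError(f"cannot concatenate {s} with {shapes[0]}")
    return (height, width, sum(s[2] for s in shapes))


prev = (32, 32, 20)                              # hypothetical cell input
base = conv_1x1(prev, 36)                        # channels standardized
reduced = factorized_reduction_shape(prev, 36)   # spatial size halved
merged = concat_depth([reduced, reduced])        # same spatial size, so this works
```

Calling `concat_depth([base, reduced])` raises a `ValueError`, which is the situation the reduction step is there to avoid: a full-resolution output cannot be concatenated with a half-resolution one.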

The reason why we cannot just fix normal_arc and reduce_arc and use the same code for both the search process and fixed-architecture process is efficiency. Dynamic graphs in TF, at least the way we implement them, are slow and very memory inefficient.
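
One way to picture the cost difference is the toy sketch below. It is not the repository's implementation: it assumes that a search-time node has to build every candidate operation and pick one at run time, while a fixed-architecture node builds only the chosen one. The operations and function names here are hypothetical stand-ins.

```python
from typing import Callable, Dict, List

# Stand-ins for the candidate operations (the real ones are convolutions
# and pooling layers).
CANDIDATE_OPS: Dict[str, Callable[[float], float]] = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "negate": lambda x: -x,
}


def build_search_node(built: List[str]) -> Callable[[float, str], float]:
    """Search-time node: every candidate is built, one is selected per call."""
    branches: Dict[str, Callable[[float], float]] = {}
    for name, op in CANDIDATE_OPS.items():
        built.append(name)  # each branch adds to the graph and its memory use
        branches[name] = op
    return lambda x, choice: branches[choice](x)


def build_fixed_node(choice: str, built: List[str]) -> Callable[[float], float]:
    """Fixed-architecture node: only the chosen operation is built."""
    built.append(choice)
    return CANDIDATE_OPS[choice]


search_built: List[str] = []
fixed_built: List[str] = []
search_node = build_search_node(search_built)
fixed_node = build_fixed_node("double", fixed_built)

# Same result for the same choice, but a different number of built branches.
same_output = search_node(3.0, "double") == fixed_node(3.0)
print(len(search_built), len(fixed_built), same_output)
```

With many nodes per cell and many cells per network, the extra branches built in the search-time version add up, which is consistent with keeping a separate code path for the fixed architecture.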

Let us know if you still have more questions 😃

bkj commented on May 20, 2024

For number 2, the point was that you're using pooling w/ stride > 1 in the fixed architecture, but a combination of _factorized_reduction and pooling w/ stride = 1 in the ENAS cells.

Makes sense about the dynamic graphs being slow.

Thanks for the quick response. (And thanks for releasing the code! I've been working on a similar project for a little while, so am very excited to compare what I've done to your code.)

~ Ben

hyhieu commented on May 20, 2024

> For number 2, the point was that you're using pooling w/ stride > 1 in the fixed architecture, but a combination of _factorized_reduction and pooling w/ stride = 1 in the ENAS cells.

I think it's just because we couldn't figure out how to syntactically make _factorized_reduction run with the output of a dynamic operation, such as tf.case.
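
As a rough check that the two routes discussed here end at the same spatial size, the sketch below compares them. It assumes "SAME"-style padding, where the output size is `ceil(size / stride)`, and that the reduction step halves the size with the same rounding; the function names are hypothetical.

```python
import math


def same_padding_output_size(size: int, stride: int) -> int:
    """Output size along one spatial dimension with "SAME"-style padding."""
    return math.ceil(size / stride)


def fixed_route(size: int) -> int:
    """Fixed architecture: a single pooling op with stride 2."""
    return same_padding_output_size(size, 2)


def enas_route(size: int) -> int:
    """ENAS cell: halve the input first, then pool with stride 1."""
    reduced = math.ceil(size / 2)  # assumed effect of the reduction step
    return same_padding_output_size(reduced, 1)


sizes = [32, 16, 8, 7]
routes_agree = all(fixed_route(s) == enas_route(s) for s in sizes)
```

Under these assumptions only the order of the operations differs, not the resulting width and height.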

stanstarks commented on May 20, 2024

@hyhieu I am wondering whether the reduction cells in `_fixed_layer` and `_enas_layer` have the same previous layers: the result of `_factorized_reduction` is appended to `layers`.

If I understand it correctly, to make the previous layers consistent, this line should be

layers = [layers[0], x]
