feinanshan / fanet
License: MIT License
Hello once again,
I tried training FANet-18 on the Cityscapes dataset. I replaced the InPlaceABN layers with a normal BN followed by an activation, as I needed to export the trained model to ONNX for deployment in my application.
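For reference, the drop-in replacement I mean looks roughly like this (my own sketch; the default activation of `InPlaceABN` in the inplace_abn package is a leaky ReLU with slope 0.01):

```python
import torch.nn as nn

def bn_act(num_features, slope=0.01):
    """ONNX-exportable stand-in for InPlaceABN(num_features):
    a plain BatchNorm2d followed by the default leaky-ReLU activation."""
    return nn.Sequential(
        nn.BatchNorm2d(num_features),
        nn.LeakyReLU(negative_slope=slope, inplace=True),
    )
```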
These are my training configurations, largely adapted from the paper:

- Batch size: 4 (as I only have 8 GB of GPU memory)
- Weight decay: 5e-4, momentum: 0.9
- Initial LR: 1e-2, multiplied each update by the factor (1 - (iter/max_iter)^2) (sketched in code below)
- Scale range: 0.75 to 2
- Max iterations: 80000
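For concreteness, a minimal sketch of that schedule with plain PyTorch SGD (the one-layer model here is just a stand-in for FANet-18):

```python
import torch

def poly_lr(base_lr, cur_iter, max_iter, power=2.0):
    # decay factor (1 - (iter/max_iter)^power), matching the config above
    return base_lr * (1.0 - (cur_iter / max_iter) ** power)

model = torch.nn.Conv2d(3, 19, 1)  # stand-in for FANet-18
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2,
                            momentum=0.9, weight_decay=5e-4)
max_iter = 80000
for cur_iter in range(max_iter):
    for group in optimizer.param_groups:
        group['lr'] = poly_lr(1e-2, cur_iter, max_iter)
    # ... forward pass, OHEM cross-entropy loss, backward, optimizer.step() ...
```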
I ended up with an OHEM cross-entropy loss of 0.3941 in the final iteration; I have yet to check the mIoU.
As a preliminary comparison, BiSeNet trained in a similar fashion (but with auxiliary losses) ended with an OHEM cross-entropy loss of 0.2947, which corresponded to an mIoU of 0.63.
Could you please give me more details on the training? In particular, is there anything else I am missing that would help me achieve better results?
Hi, can you please provide the trained model for the last experiment reported in Table VI of the paper, which achieved an mIoU of 75.5:
"TABLE VI: Video semantic segmentation on Cityscapes. “+Temp” indicates FANet with spatial-temporal attention (t=2)"
Thanks
Did I misunderstand the code? Cosine similarity does not seem to be used in FA; a plain dot product is used instead, and the 1/n factor from the paper's formula is not reflected in the code.
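For reference, my reading of the paper's fast-attention formula, with cosine similarity via L2-normalized q/k plus the 1/n scaling, would look like this (a sketch of the equation, not the repo's code):

```python
import torch
import torch.nn.functional as F

def fast_attention(q, k, v):
    """Sketch of the paper's fast attention: L2-normalize q and k along
    channels (so dot products become cosine similarities), exploit
    associativity, and scale by 1/n (n = number of spatial positions)."""
    # q, k: (B, n, c'); v: (B, n, c)
    n = q.shape[1]
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    # q @ (k^T @ v) costs O(n * c^2) instead of O(n^2 * c)
    return q @ (k.transpose(1, 2) @ v) / n
```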
Hi. Your paper is very interesting and inspirational to read.
I was wondering why you integrate the features of only ONE neighboring frame to facilitate inference on the current frame.
Have you experimented with more frames? What is the effect?
Thank you.
Hi Author, thanks for your work :) Could you please state the torch and CUDA versions for this repo? It raises "RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:50" when running CUDA_VISIBLE_DEVICES=1 python3 speeding.py with torch==1.1.0 and CUDA 10.2. Thanks!
Hi, can I ask for the supplementary material of this paper?
Hi, for a fair comparison: previous work including ICNet, BiSeNet, and SFNet reports speed with BN in place, whereas here you merge BN into the conv, which results in much higher speed.
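For readers unfamiliar with the trick being discussed, a minimal sketch of the standard BN-into-conv folding (not code from this repo):

```python
import torch

@torch.no_grad()
def fuse_conv_bn(conv: torch.nn.Conv2d, bn: torch.nn.BatchNorm2d) -> torch.nn.Conv2d:
    """Fold BN statistics into the preceding conv so inference skips BN."""
    fused = torch.nn.Conv2d(conv.in_channels, conv.out_channels,
                            conv.kernel_size, conv.stride,
                            conv.padding, conv.dilation,
                            conv.groups, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
    bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.copy_((bias - bn.running_mean) * scale + bn.bias)
    return fused
```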
Hi
Could you please provide license information for the repo? Is the code made available under an MIT license?
Thank you for the great work!!
The lines below are from the return of the ResNet model adapted for FANet:
```python
feat4 = self.layer1(x)
feat8 = self.layer2(feat4)    # 1/8
feat16 = self.layer3(feat8)   # 1/16
feat32 = self.layer4(feat16)  # 1/32
return feat4, feat8, feat16, feat32
```
The comments state that feat8, feat16, and feat32 are feature maps at 1/8, 1/16, and 1/32 of the image size, but the actual sizes are 1/16, 1/32, and 1/64. I also understand why this is happening. Below is how we create the Resnet18 model for FANet:
```python
def Resnet18(pretrained=True, norm_layer=None, **kwargs):
    model = ResNet(BasicBlock, [2, 2, 2, 2], [2, 2, 2, 2], norm_layer=norm_layer)
    if pretrained:
        model.init_weight(model_zoo.load_url(model_urls['resnet18']))
    return model
```
which differs from the way it is created in BiSeNet:
```python
def Resnet18(pretrained=False, **kwargs):
    model = ResNet(BasicBlock, [2, 2, 2, 2], [1, 2, 2, 2])
    if pretrained:
        model.init_weight(model_zoo.load_url(model_urls['resnet18']))
    return model
```
The stride value for the first BasicBlock is different.
Could you please clarify whether the additional down-sampling is intended for FANet or not?
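A quick way to verify the actual down-sampling factors is to print the shapes from a dummy forward pass (a hypothetical snippet; the import path for the repo's `Resnet18` constructor quoted above is an assumption):

```python
import torch
from fanet.resnet import Resnet18  # assumed module path for the constructor above

net = Resnet18(pretrained=False, norm_layer=torch.nn.BatchNorm2d).eval()
x = torch.randn(1, 3, 512, 1024)
with torch.no_grad():
    feats = net(x)
for name, f in zip(['feat4', 'feat8', 'feat16', 'feat32'], feats):
    print(name, 'actual stride = 1/%d' % (512 // f.shape[2]))
```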
Would it be possible for you to share the trained model as well? I don't have access to good GPUs to train it on the Cityscapes dataset, so it would be very helpful.
Could you let us know when the code will be published? Thank you!
Hi,
Thanks for your great work.
I have a question regarding the feature map size of resnet blocks.
In your paper you say that the first res-block produces a feature map of h/4 x w/4 resolution.
But in the code, the resolution of that feature map is h/8 x w/8.
Is this an error, or does the implementation not fully reproduce the paper's description?
Thank you
I am confused about this: you defined a batch norm, but in the forward function you only apply the activation. Can you explain?
```python
class BatchNorm2d(nn.BatchNorm2d):
    # (conv => BN => ReLU) * 2
    def __init__(self, num_features, activation='none'):
        super(BatchNorm2d, self).__init__(num_features=num_features)
        if activation == 'leaky_relu':
            self.activation = nn.LeakyReLU()
        elif activation == 'none':
            self.activation = lambda x: x
        else:
            raise Exception("Accepted activation: ['leaky_relu']")

    def forward(self, x):
        return self.activation(x)
```
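For comparison, a variant whose forward actually applies BN before the activation would look like this (my assumption of the intended behaviour, not a fix from the authors):

```python
import torch.nn as nn

class BatchNorm2dAct(nn.BatchNorm2d):
    """Hypothetical variant that normalizes first, then activates."""
    def __init__(self, num_features, activation='none'):
        super().__init__(num_features)
        self.activation = nn.LeakyReLU() if activation == 'leaky_relu' else nn.Identity()

    def forward(self, x):
        # run the inherited BatchNorm2d first, then the activation
        return self.activation(super().forward(x))
```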