fanet's Issues

Training Details

Hello once again,

I tried training FANet-18 on the Cityscapes dataset. I replaced the InPlaceABN layers with standard BatchNorm followed by an activation, as I needed to export the trained model to ONNX for deployment in my application.
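
For context, the export path I am targeting looks roughly like the following; the one-layer model and file name are illustrative stand-ins, not the repo's code:

import torch

# Stand-in module for the trained FANet-18 with plain BN; the input
# size matches a full-resolution Cityscapes frame. Names are illustrative.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 19, 1)).eval()
dummy = torch.randn(1, 3, 1024, 2048)
torch.onnx.export(model, dummy, "fanet18.onnx", opset_version=11)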

These are my training configurations, closely adapted from the paper:

  1. Mini-batch SGD with batch size 4, as I only have 8 GB of GPU memory; weight decay = 5e-4, momentum = 0.9
  2. Initial learning rate (LR) = 1e-2, decayed each iteration by the poly policy, i.e. the initial LR multiplied by (1 - iter/max_iter)^2 (see the sketch after this list)
  3. Data augmentation: random horizontal flipping, random scaling (0.75 to 2.0)
  4. 80,000 training iterations
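
For concreteness, a minimal sketch of that schedule in PyTorch; the one-layer model and the loop body are illustrative assumptions, not the repo's training code:

import torch

# Minimal poly-LR sketch; the one-layer `model` is a stand-in, not FANet.
model = torch.nn.Conv2d(3, 19, 1)
max_iter = 80000
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda it: (1 - it / max_iter) ** 2)

for it in range(max_iter):
    # forward pass, OHEM loss, loss.backward(), optimizer.step() go here
    scheduler.step()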

I ended up with an OHEM cross-entropy loss of 0.3941 at the final iteration.

I have yet to check the mIoU.

As a preliminary comparison, BiseNet trained in a similar fashion (but with auxiliary losses) reached an OHEM cross-entropy loss of 0.2947, which corresponded to an mIoU of 0.63.
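
By OHEM cross-entropy I mean the usual hard-pixel-mining variant, sketched below; the thresh and n_min values are illustrative defaults, not necessarily what either repo uses:

import torch
import torch.nn.functional as F

def ohem_ce_loss(logits, labels, thresh=0.7, n_min=100000, ignore_index=255):
    # Per-pixel CE; keep only pixels whose loss exceeds -log(thresh),
    # but never fewer than n_min pixels.
    loss = F.cross_entropy(logits, labels, ignore_index=ignore_index,
                           reduction='none').view(-1)
    loss, _ = torch.sort(loss, descending=True)
    n_min = min(n_min, loss.numel() - 1)
    hard = -torch.log(torch.tensor(thresh))
    if loss[n_min] > hard:
        loss = loss[loss > hard]
    else:
        loss = loss[:n_min]
    return torch.mean(loss)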

Could you please give me more details on the training, especially:

  1. How many iterations did you train the model for?
  2. What was the final cross-entropy loss you ended up with?
  3. Did you also use auxiliary losses for better convergence (resulting in a lower loss)?
  4. Did you use a specifically modified version of the cross-entropy loss to achieve better convergence?

Is there anything else I am missing to achieve better results?

Trained model with mIoU of 75.5 for fa2

Hi, could you please provide the trained model for the last experiment reported in Table VI of the paper, which achieved an mIoU of 75.5:
"TABLE VI: Video semantic segmentation on Cityscapes. “+Temp” indicates FANet with spatial-temporal attention (t=2)"

Thanks

The cosine similarity is not used in FA???

Did I misunderstand the code? The cosine similarity is not used in FA; a plain dot product is used instead, and the 1/n factor from the paper's formula is not reflected in the code either.
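
For context, the two formulations can coincide: a dot product over L2-normalized features is exactly cosine similarity. A sketch of what an equivalent computation might look like (shapes and names are illustrative, not taken from the repo):

import torch
import torch.nn.functional as F

# With L2-normalized channel vectors, a dot product equals cosine
# similarity; dividing by the number of positions n gives the 1/n factor.
q = torch.randn(1, 64, 32 * 32)   # (batch, channels, n positions)
k = torch.randn(1, 64, 32 * 32)
qn = F.normalize(q, dim=1)
kn = F.normalize(k, dim=1)
n = q.shape[-1]
affinity = torch.bmm(qn.transpose(1, 2), kn) / n   # (1, n, n)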

why only 2 frames for spatial-temporal context aggregation

Hi. Your paper is very interesting and inspiring to read.
I was wondering why you integrate the features of only ONE neighboring frame to facilitate the inference of the current frame.
Have you experimented with more frames? What was the effect?

Thank you.

Torch/CUDA version

Hi Author, thanks for your work :) Could you please list the torch and CUDA versions for this repo? It raises "RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:50" when running CUDA_VISIBLE_DEVICES=1 python3 speeding.py with torch==1.1.0 and CUDA 10.2. Thanks!
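
For debugging, a quick check of what the installed build actually sees; note that CUDA_VISIBLE_DEVICES=1 selects the second GPU, so on a single-GPU machine no device would be visible at all:

import torch

# Report the installed torch build, the CUDA toolkit it was compiled
# against, and how many GPUs this process can actually see.
print(torch.__version__, torch.version.cuda)
print(torch.cuda.is_available(), torch.cuda.device_count())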

License for the code

Hi

Could you please provide license information for the repo? Is the code made available under an MIT license?

Features extracted by ResNet are different for BiseNet and FANet

Thank you for the great work!!

The lines below are the ResNet model's return values as adapted for FANet:

feat4 = self.layer1(x)
feat8 = self.layer2(feat4) # 1/8
feat16 = self.layer3(feat8) # 1/16
feat32 = self.layer4(feat16) # 1/32
return feat4, feat8, feat16, feat32

The comments say that feat8, feat16, and feat32 are feature maps at 1/8, 1/16, and 1/32 of the image size, but the actual sizes are 1/16, 1/32, and 1/64.

I also understand why this is happening. Below is how the Resnet18 model is created for FANet:

def Resnet18(pretrained=True, norm_layer=None, **kwargs):
    model = ResNet(BasicBlock, [2,2,2,2],[2,2,2,2], norm_layer=norm_layer)
    if pretrained:
        model.init_weight(model_zoo.load_url(model_urls['resnet18']))
    return model

which is different from how it is created in BiseNet:

def Resnet18(pretrained=False, **kwargs):
    model = ResNet(BasicBlock, [2, 2, 2, 2],[1, 2, 2, 2])
    if pretrained:
        model.init_weight(model_zoo.load_url(model_urls['resnet18']))
    return model

The stride value for the first BasicBlock is different.
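
To illustrate the difference, a tiny stand-alone check using plain convolutions as stand-ins for the BasicBlocks (not the repo's code):

import torch
import torch.nn as nn

# A stride-2 first block halves the resolution once more than a
# stride-1 block, which is exactly the extra down-sampling in question.
x = torch.randn(1, 64, 64, 64)
layer1_stride1 = nn.Conv2d(64, 64, 3, stride=1, padding=1)  # BiseNet: [1,2,2,2]
layer1_stride2 = nn.Conv2d(64, 64, 3, stride=2, padding=1)  # FANet:   [2,2,2,2]
print(layer1_stride1(x).shape)  # torch.Size([1, 64, 64, 64])
print(layer1_stride2(x).shape)  # torch.Size([1, 64, 32, 32])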

Could you please clarify whether the additional down-sampling is intended for FANet or not.

Would it be possible for you to share the trained model as well? I don't have access to good GPUs to train it on the Cityscapes dataset, so it would be nice if you could share it.

Resnet block feature map size

Hi,
Thanks for your great work.
I have a question regarding the feature map size of resnet blocks.
In your paper you say that the first res-block produces a feature map at h/4 x w/4 resolution,
but in the code the resolution of that feature map is h/8 x w/8.

Is this an error?
Or does the implementation not fully reproduce the paper's description?

Thank you

class BatchNorm2D in fanet.py

I am confused about this! You define batch norm, but in the forward function you only apply the activation. Can you explain this?

class BatchNorm2d(nn.BatchNorm2d):
    # (conv => BN => ReLU) * 2
    def __init__(self, num_features, activation='none'):
        super(BatchNorm2d, self).__init__(num_features=num_features)
        if activation == 'leaky_relu':
            self.activation = nn.LeakyReLU()
        elif activation == 'none':
            self.activation = lambda x: x
        else:
            raise Exception("Accepted activation: ['leaky_relu']")

    def forward(self, x):
        return self.activation(x)
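
For what it's worth, if the normalization is meant to be applied, forward would presumably invoke the parent class first; this is a hedged guess at the intended behavior, not a confirmed fix from the repo:

def forward(self, x):
    x = super(BatchNorm2d, self).forward(x)  # apply the batch norm first
    return self.activation(x)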
