
inverseform's People

Contributors

mhofmann-qc

inverseform's Issues

Confusion about the predicted scales and shifts and the loss

Hi,
In another issue you said that InverseForm outputs 4 values which stand for scales and shifts, so if I want to minimize the InverseForm loss, the scales should be close to 1 and the shifts should be close to 0. But in your code the loss is (((distance_coeffs*distance_coeffs).sum(dim=1))**0.5).mean(), which pushes all 4 values toward 0. Why? (A small sketch of what I mean follows below.)
Looking forward to your reply. Thank you!
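
To make the question concrete, here is a minimal sketch contrasting the two losses. This is my own illustration, not the repository's code, and the (scale_x, shift_x, scale_y, shift_y) ordering of the four outputs is an assumption:

    import torch

    # `distance_coeffs` stands in for the four InverseForm outputs
    distance_coeffs = torch.randn(8, 4)

    # A loss that drives the scales toward 1 and the shifts toward 0 would
    # first subtract the identity transform:
    identity = torch.tensor([1.0, 0.0, 1.0, 0.0])
    loss_vs_identity = ((distance_coeffs - identity).pow(2).sum(dim=1) ** 0.5).mean()

    # The loss in the code instead drives all four values toward 0:
    loss_in_code = (((distance_coeffs * distance_coeffs).sum(dim=1)) ** 0.5).mean()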

Design question

In your paper you describe InverseForm as being used to compare b_pred with b_gt. The latter is obtained by running a Sobel filter on the ground-truth segmentation. Is there any particular reason we don't do the same to obtain b_pred? That is, we could have just taken the predicted segmentation map, run a Sobel operation on that, and fed the result in as b_pred.

Put formally in your notation:

b_gt = sobel(y_gt)
So why not also
b_pred = sobel(y_pred)
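
For reference, a Sobel-based boundary extractor of the kind this assumes could look like the sketch below. This is my own illustration with y_gt/y_pred as placeholders, not the code used in this repository:

    import torch
    import torch.nn.functional as F

    def sobel_boundaries(seg, threshold=0.1):
        # seg: (N, 1, H, W) float segmentation map; returns a binary boundary map
        kx = torch.tensor([[-1., 0., 1.],
                           [-2., 0., 2.],
                           [-1., 0., 1.]]).view(1, 1, 3, 3)
        ky = kx.transpose(2, 3)
        gx = F.conv2d(seg, kx, padding=1)
        gy = F.conv2d(seg, ky, padding=1)
        magnitude = torch.sqrt(gx ** 2 + gy ** 2)
        return (magnitude > threshold).float()

    # b_gt = sobel_boundaries(y_gt.float())
    # and, in principle, the same operator could give
    # b_pred = sobel_boundaries(y_pred.argmax(dim=1, keepdim=True).float())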

how to visualize the result?

Thank you for your great work, but could you please tell me how to visualize the results and save them? Thank you.

Usage in TensorFlow

How can I use this loss function in TensorFlow? Which portions need to be recoded?
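
My guess is that the InverseNet itself and the boundary extraction need to be re-coded, while the distance loss translates directly. Here is my rough attempt, assuming a TensorFlow InverseNet already produces a [batch, 4] tensor distance_coeffs:

    import tensorflow as tf

    def inverseform_distance_loss(distance_coeffs):
        # TensorFlow equivalent of the PyTorch expression
        # (((distance_coeffs * distance_coeffs).sum(dim=1)) ** 0.5).mean()
        return tf.reduce_mean(
            tf.sqrt(tf.reduce_sum(tf.square(distance_coeffs), axis=1)))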

I have trained the STN with ResNet on ImageNet; how do I then train the InverseForm net?

Hi! I have read your awesome paper and the appendices, and I have a few questions:

  1. Which dataset is used to train the STN, ImageNet or MNIST? If I use ImageNet to train the STN, how many times should I downsample to get the feature before flattening it, and what learning rate should I choose for the STN? Is it the same as for the classification net?

  2. If I have trained the STN with ResNet on ImageNet, how do I then train the InverseForm net? In my opinion it looks like this (a rough sketch follows below):
     a) First get the RGB input image from ImageNet, then get the gray edges using a Sobel filter.
     b) Freeze the STN, use the RGB input image to get theta and the affine matrix, and use the gray edges to get the affine edges via the affine matrix.
     c) Get theta hat from the InverseForm net using the gray edges and the affine gray edges, then compute the loss from theta and theta hat.
     Is that correct?

  3. Does it matter if I use a different net (ResNet or MobileNet) to train the STN?

  4. What is the loss function between the theta from the STN and the theta hat from InverseForm, L2 or L1?

Looking forward to your reply. Thank you!
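
To make step 2 concrete, here is the rough loop I have in mind. This is my own sketch, not your script; stn, inverse_net and the Sobel edge maps are hypothetical stand-ins:

    import torch
    import torch.nn.functional as F

    def inverseform_training_step(stn, inverse_net, optimizer, image, edges):
        # (b) keep the STN frozen; use it only to produce the affine transform
        with torch.no_grad():
            theta = stn(image)                                # (N, 2, 3) affine params
            grid = F.affine_grid(theta.view(-1, 2, 3), edges.size(),
                                 align_corners=False)
            warped_edges = F.grid_sample(edges, grid, align_corners=False)

        # (c) the InverseForm net regresses theta_hat from the two edge maps
        theta_hat = inverse_net(edges, warped_edges)
        loss = F.mse_loss(theta_hat, theta.view_as(theta_hat))  # L2 here; question 4 asks L2 or L1

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()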

question about visualization

Hi, from your code I guess the visualization code is:

        import os
        import cv2
        import numpy as np

        prediction = assets['predictions']
        prediction_cpu = np.array(prediction)
        # map train IDs back to label IDs before saving
        label_out = np.zeros_like(prediction_cpu)
        submit_fn = '{}.png'.format(img_name)
        for label_id, train_id in cfg.DATASET.id_to_trainid.items():
            label_out[np.where(prediction_cpu == train_id)] = label_id
        cv2.imwrite(os.path.join(self.save_dir, submit_fn), label_out)

But when I execute it, no picture is saved, even though the save path is correct. Could you help me? Thanks.
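
As a general OpenCV note (not specific to this repository), cv2.imwrite often fails quietly by returning False, for example when the target directory does not exist, and may raise cv2.error for unsupported dtypes, so casting to uint8 and checking the return value can help narrow this down:

    label_out = label_out.astype(np.uint8)  # PNG label maps should be 8-bit
    ok = cv2.imwrite(os.path.join(self.save_dir, submit_fn), label_out)
    assert ok, 'cv2.imwrite failed for {}'.format(submit_fn)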

Edge Head Weights

Hello,

I wanted to ask if you would be willing to release the weights of the edge-head that is added to the default semantic segmentation architecture.

A question about InverseTransform2D

Hi,
In your code, the class InverseTransform2D calculates the boundary-distance loss. You use "inputs = torch.ge(inputs, 0.5).float()" to get the boundary map, but it seems that backpropagation is interrupted there. Please make sure this loss term can actually be backpropagated, because I think the requires_grad attribute of mean_square_inverse_loss is False.
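
A standalone illustration of the concern (not your code): comparison ops such as torch.ge are not differentiable, so the thresholded boundary map is cut off from the autograd graph.

    import torch

    inputs = torch.rand(1, 1, 8, 8, requires_grad=True)
    boundary = torch.ge(inputs, 0.5).float()  # comparison ops do not carry gradients
    print(boundary.requires_grad)             # False: the graph is cut here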

Two confusions about the code

Hi, I have two confusions about the code:

  1. According to Section 3.3 of your paper, the InverseNet should predict 8 or 6 values, but the InverseNet in your code only outputs 4 values. Why? What do these 4 elements represent?
  2. Is the InverseForm loss in your code computed with the Euclidean distance or the geodesic distance?

Training the Inverseform Net on custom dataset

Hi! Based on your paper, I understand that you have retrained the IFLoss separately for each dataset (as specified in your supplementary materials). I am presently using a custom aerial imagery dataset for my experiments and would like to follow the same protocol of retraining IFLoss on this dataset. I hope you can offer some insights on how I could go about doing this. Even better, I would appreciate it if you could share the training script, which would save a great deal of time for me.

I'm also curious as to why you retrained the IFLoss for each dataset separately. Since the InverseForm network is trained using only the GT segmentation masks, are the structure and shape of the GT masks so drastically different between the various datasets that IFLoss requires retraining on each dataset separately?

Looking forward to your reply. Thank you!

How can I use this work, trained for Cityscapes segmentation, on the KITTI dataset?

Thanks for your great work! Since you report excellent performance on the Cityscapes benchmark, I wonder whether I can use your pretrained checkpoints for semantic segmentation on the KITTI dataset. If this is supported, I would greatly appreciate your advice on how to adjust the dataset config and so on.

Question about the code

Hi, I have a question about the loss code. In ImageBasedCrossEntropyLoss2d, nll_loss.weight comes out the same regardless of whether batch_weights is set to True or False. Was the original intent that the i-th target (target[i]) should be used as the input to the weight calculation when batch_weights is set to False?
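
To illustrate the two behaviours I would expect, here is a rough sketch of histogram-based class weighting. This is my own illustration, not your ImageBasedCrossEntropyLoss2d:

    import torch

    def histogram_weights(target, num_classes):
        # simple histogram-based class weighting, for illustration only
        hist = torch.bincount(target.flatten(), minlength=num_classes).float()
        hist = hist / hist.sum()
        return 1.0 / torch.log(1.02 + hist)

    # batch_weights=True:  one weight vector computed from the whole batch target
    # batch_weights=False: I would expect the weights to be recomputed per image,
    #                      i.e. from target[i], inside the per-image loop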

Not able to Run given validation/test code

While using the command given in the README,

python -m torch.distributed.launch --nproc_per_node=8 experiment/validation.py --output_dir "/path/to/output/dir" --model_path checkpoints/hrnet48_OCR_IF_checkpoint.pth --arch "ocrnet.HRNet" --hrnet_base "48" --has_edge True

I get the error below:

usage: validation.py [-h] [--tag TAG] [--no_run] [--interactive] [--no_cooldir] [--farm FARM] exp_yml

and all of the command-line arguments are reported as "unrecognized arguments".

Where can I find the appendix materials?

Dear authors,

Thanks for your great work! I am interested in how you conducted your tiling operations and selected your hyperparameters for training the net. You mentioned that these are discussed in the appendix. Could you kindly tell me where I can find the appendix materials for this paper? Thanks!
