qualcomm-ai-research / inverseform
License: BSD 3-Clause Clear License
Hi,
In another issue, you said that InverseForm outputs 4 values which stand for scales and shifts, so to minimize the InverseForm loss the scales should be close to 1 and the shifts close to 0. But in your code the loss is (((distance_coeffs*distance_coeffs).sum(dim=1))**0.5).mean(), which pushes all 4 values towards 0. Why?
Looking forward to your reply. Thank you!
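For concreteness, here is a minimal repro of the quoted expression with a hypothetical batch of coefficients (the values are made up; only the formula comes from the code):

```python
import torch

# Hypothetical batch of 2 predicted coefficient vectors, 4 values each:
# scales near 1 and shifts near 0, as the other issue suggested.
distance_coeffs = torch.tensor([[1.0, 1.0, 0.0, 0.0],
                                [0.9, 1.1, 0.1, -0.1]])

# The expression from the code: mean L2 norm over the batch. It is
# minimized only when every coefficient is 0, so scales near 1 still
# incur a large loss here.
loss = (((distance_coeffs * distance_coeffs).sum(dim=1)) ** 0.5).mean()
print(loss.item())
```

As the sketch shows, the expression is just the mean Euclidean norm of the 4-vector, whose minimum is the zero vector.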
In your paper you describe InverseForm as comparing b_pred with b_gt. The latter is obtained by running a Sobel filter on the ground-truth segmentation. Is there any particular reason we don't do the same to obtain b_pred? That is, we could have just taken the predicted segmentation map, run a Sobel operation on it, and fed the result in as b_pred.
Put formally, in your notation:
b_gt = sobel(y_gt)
So why not also
b_pred = sobel(y_pred)?
Thank you for your great work, but could you please tell me how to visualize the results and save them? Thank you.
How can I use this loss function in TensorFlow? Which portions need to be recoded?
We can use F.affine_grid and F.grid_sample as a complete grid generator, but how do we implement the homeomorphic transformation? Thanks.
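For the affine part of the question, the two functions compose like this (a minimal sketch with an identity theta; a general homeomorphism would replace affine_grid with a custom dense grid, which this sketch does not cover):

```python
import torch
import torch.nn.functional as F

# Hypothetical 2x3 affine matrix theta (identity here) for a batch of 1.
theta = torch.tensor([[[1., 0., 0.],
                       [0., 1., 0.]]])
img = torch.arange(16.).view(1, 1, 4, 4)

# F.affine_grid builds a normalized sampling grid from theta;
# F.grid_sample then warps the image by sampling at those locations.
grid = F.affine_grid(theta, size=img.shape, align_corners=False)
warped = F.grid_sample(img, grid, align_corners=False)
print(torch.allclose(warped, img))  # identity theta leaves the image unchanged
```

Any invertible 2x3 theta (rotation, scale, shear, translation) works the same way; only the grid construction changes for non-affine warps.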
Hi! I have read your awesome paper and the appendices, and I have a few questions:
2. If I have trained the STN with a ResNet on ImageNet, how do I then train the InverseForm net? In my opinion it looks like this:
a) First get the RGB input image from ImageNet, then get the gray edges using a Sobel filter.
b) Freeze the STN, use the RGB input image to get theta and the affine matrix, and use the affine matrix to warp the gray edges into the affine edges.
c) Get theta_hat from the InverseForm net by feeding it the gray edges and the affine gray edges, then compute the loss between theta and theta_hat.
Is that correct?
3. Does it matter if I use a different net (ResNet or MobileNet) to train the STN?
4. What is the loss function between the theta from the STN and the theta_hat from InverseForm, L2 or L1?
Looking forward to your reply. Thank you!
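The procedure described in question 2 could be sketched like this. Every module here is a hypothetical stand-in (tiny linear layers, random input), not the authors' code, and the L2 loss in the last line is just one of the two options the question asks about:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins: `stn` maps an edge map to a flat 2x3 theta, and
# `inverse_form` maps the (edges, warped edges) pair to a predicted theta_hat.
stn = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 6))
inverse_form = nn.Sequential(nn.Flatten(), nn.Linear(2 * 32 * 32, 6))

edges = torch.rand(1, 1, 32, 32)        # (a) gray edges from a Sobel filter

with torch.no_grad():                   # (b) STN frozen
    theta = stn(edges).view(1, 2, 3)
grid = F.affine_grid(theta, edges.shape, align_corners=False)
warped = F.grid_sample(edges, grid, align_corners=False)

# (c) InverseForm predicts theta_hat from the pair; supervise against theta.
theta_hat = inverse_form(torch.cat([edges, warped], dim=1)).view(1, 2, 3)
loss = F.mse_loss(theta_hat, theta)     # L2 here; the paper's choice may differ
loss.backward()
```

Only the inverse_form parameters receive gradients in this sketch, matching the "freeze the STN" step.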
Hi, from your code I guess the visualization code is:

import os
import cv2
import numpy as np

prediction = assets['predictions']
prediction_cpu = np.array(prediction)
label_out = np.zeros_like(prediction_cpu, dtype=np.uint8)  # cv2.imwrite expects uint8
submit_fn = '{}.png'.format(img_name)
for label_id, train_id in cfg.DATASET.id_to_trainid.items():
    label_out[np.where(prediction_cpu == train_id)] = label_id
cv2.imwrite(os.path.join(self.save_dir, submit_fn), label_out)

But when I execute it, no picture is saved, even though the save path is right. Could you help me? Thanks.
Hello,
I wanted to ask if you would be willing to release the weights of the edge-head that is added to the default semantic segmentation architecture.
Hi,
In your code, the class InverseTransform2D calculates the boundary-distance loss. You use "inputs = torch.ge(inputs, 0.5).float()" to get the boundary map, but this seems to break backpropagation: torch.ge is non-differentiable, so I think the requires_grad attribute of mean_square_inverse_loss is False. Please check that this loss term can actually be backpropagated.
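The effect described above is easy to reproduce in isolation (a minimal sketch with a random tensor standing in for the network output):

```python
import torch

x = torch.rand(2, 4, requires_grad=True)  # stand-in for a network output
inputs = torch.ge(x, 0.5).float()         # thresholding is non-differentiable
print(inputs.requires_grad)               # False: the autograd graph is cut here
```

Anything computed downstream of `inputs` therefore carries no gradient back to `x`.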
Why are two models saved at the end of inference? Are these two models (last and best) the same as "hrnet48_OCR_IF_checkpoint.pth"?
Hello! I am a novice and I have 2 questions. I would very much appreciate it if you could answer.
1. There are many files at https://www.cityscapes-dataset.com/downloads/; which should I download?
Hi, I have two confusions about the code:
Hi! Based on your paper, I understand that you have retrained the IFLoss separately for each dataset (as specified in your supplementary materials). I am presently using a custom aerial imagery dataset for my experiments and would like to follow the same protocol of retraining IFLoss on this dataset. I hope you can offer some insights on how I could go about doing this. Even better, I would appreciate it if you could share the training script, which would save a great deal of time for me.
I'm also curious as to why you retrained the IFLoss for each dataset separately. Since the InverseForm network is trained using only the GT seg masks, are the structure and shape of the GT seg masks so drastically different between datasets that IFLoss requires retraining on each one?
Looking forward to your reply. Thank you!
Thanks for your great work! Since you report excellent performance on the Cityscapes benchmark, I wonder whether I can use your pretrained checkpoints for semantic segmentation on the KITTI dataset. If that is supported, I would greatly appreciate advice on how to adjust your dataset config and so on.
Hi, I have a question about the loss code. In ImageBasedCrossEntropyLoss2d, the result of nll_loss.weight is the same whether batch_weights is set to True or False. Was the original intent that the i-th target (target[i]) should be used as the input to the weight calculation when batch_weights is set to False?
While using the given command from the README:
python -m torch.distributed.launch --nproc_per_node=8 experiment/validation.py --output_dir "/path/to/output/dir" --model_path checkpoints/hrnet48_OCR_IF_checkpoint.pth --arch "ocrnet.HRNet" --hrnet_base "48" --has_edge True
I get the following error:
usage: validation.py [-h] [--tag TAG] [--no_run] [--interactive] [--no_cooldir] [--farm FARM] exp_yml
All command-line arguments are reported as "unrecognized arguments".
I am confused about how to integrate the IF module into my network.
Dear authors,
Thanks for your great work! I am interested in how you conducted your tiling operations and selected your hyperparameters for training the net. You mentioned that discussions of those are in the appendix. Could you kindly remind me where I can find the appendix materials for this paper? Thanks!
Why do I get 84.8 mIoU on the test data with the pretrained model, while the paper reports 85.6?
Hello
How are you?
Thanks for contributing to this project.
Could you share the full training code or a code snippet?
Thanks
Hello!
I want to know where the function named get_aspp, used in models/basic.py, is defined. I cannot find it in models/utils.py.