Comments (6)
For the simple example you gave at the beginning, it is actually expected to have no grad_fn, because `data_dict['data_id']` itself doesn't require gradients. On the other hand, can you try something like `custom_loss = F.mse_loss(pred_trans, torch.zeros_like(pred_trans))`? In this case, `pred_trans` is a model output prediction, so it should require gradients.
from multi_part_assembly.
Hi, thanks for your interest in our work! The loss should be added in this function, which I believe you have already done. Since you can print it and you do have the `loss_w` in the config file, I suspect there is a bug in your loss implementation that blocks the gradient. Do you mind posting your loss implementation as well?
Hi - thanks for the quick answer.
Yes, I added the loss there. So, I also thought my implementation could be the problem, but even when using a simple function (just creating a tensor and using this as a "fake" loss) the same thing happens.
Short example:
Let me explain: if I add the following in `base_model.py`

```python
def compute_custom_loss(self, data_dict):
    custom_loss = torch.Tensor(len(data_dict['data_id'])).cuda()
    return custom_loss

def _calc_loss(self, out_dict, data_dict):
    ...  # here your code
    custom_loss = self.compute_custom_loss(data_dict)
    loss_dict = {
        'trans_loss': trans_loss,
        'rot_pt_cd_loss': rot_pt_cd_loss,
        'transform_pt_cd_loss': transform_pt_cd_loss,
        'custom_loss': custom_loss,
    }
```
I still get

```
loss: trans_loss, value: tensor([[0.2590, 0.1225]], device='cuda:0', grad_fn=<StackBackward0>)
loss: rot_pt_cd_loss, value: tensor([[0.0046, 0.0063]], device='cuda:0', grad_fn=<StackBackward0>)
loss: transform_pt_cd_loss, value: tensor([[0.0482, 0.0021]], device='cuda:0', grad_fn=<StackBackward0>)
loss: custom_loss, value: tensor([[7.6154e+31, 3.0854e-41]], device='cuda:0')
loss: rot_loss, value: tensor([[0.4776, 0.6484]], device='cuda:0', grad_fn=<StackBackward0>)
loss: rot_pt_l2_loss, value: tensor([[0.0117, 0.1338]], device='cuda:0', grad_fn=<StackBackward0>)
```
So the loss is not attached to any gradient. And since the loss is just one line creating a tensor, there should be no error in it (or at least PyTorch would tell me something?).
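The behaviour can be reproduced in isolation; a minimal, CPU-only sketch of the difference (assuming a recent PyTorch):

```python
import torch

# A tensor built from scratch is a leaf with no autograd history:
# it has no grad_fn, so no gradient can flow back through it.
fake_loss = torch.Tensor(4)          # uninitialized values, detached
print(fake_loss.grad_fn)             # None

# A tensor produced by an operation on something that requires
# grad carries a grad_fn and stays in the computation graph.
pred = torch.randn(4, requires_grad=True)
real_loss = (pred ** 2).mean()
print(real_loss.grad_fn)             # <MeanBackward0 object at 0x...>
```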
Long example:
The whole loss is a bit of a mess at the moment and not really finished; since it does not work, I did not finish the implementation. I was actually going to implement two losses, and this is the first one. The idea of the first one is the following:
- I have a method that calculates the distance between the centers of mass of the two fragments (assuming two pieces for now, to keep things simple) and also selects the points "facing" each other (it does this by creating a virtual cylinder between the two centers of mass and selecting the points within this cylinder)
- once I have the distances, I also get a rough idea of the shape of the two fragments by calculating the min and max along the three axes
- then I calculate the loss using the distance between the centers of mass as the "prediction" and the average size of the fragment (`(min + max) / 2`) as the "target", which should push the transformation to move the fragments apart from each other (but not too far)
Let's assume the two methods to find the facing points, the distance, and the average size are working. If you are interested, I can post the whole code, but I think it would become too long and turn into a separate discussion on how to do this (it is still very inefficient), which may go beyond the scope of this issue. Do let me know if that would be better; maybe I can later open another issue to discuss different kinds of losses.
```python
def find_facing_points_and_distance(self, data_dict, radius=0.5, tolerance_dist=10e5):
    ...
    return facing_points, dist_euclidean, found_something

def get_bbox_avg_size(self, points3D):
    ...
    return avg_size
```
Then I calculate the loss as:
```python
def compute_min_dist_loss(self, facing_info, data_dict, max_loss=1, dist_sigma=10e-5):
    dists_cm = torch.Tensor([dist_cm, dist_cm]).cuda()
    target_dists = torch.Tensor([bbox_avg_size_A, bbox_avg_size_B]).cuda()
    loss_fn = torch.nn.L1Loss(reduction='none')  # 'reduce=False' is deprecated
    loss_value = loss_fn(dists_cm, target_dists)
    return loss_value
```
Even using `torch.nn.L1Loss` from PyTorch here, I still do not get any gradient. Initially, my idea was to normalize the distance of the centers of mass (`dists_cm`) and use an exponential function to compute the loss, something like `torch.clip(torch.exp(-(dists_cm_normalized / dist_sigma)), 0, 1)`, where `dist_sigma` is a parameter depending on the size of the objects.
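That exponential variant could look like the following sketch (illustrative function name; `dists_cm_normalized` is assumed to be a tensor already attached to the graph):

```python
import torch

def exp_dist_loss(dists_cm_normalized, dist_sigma):
    # Clipped exponential of the normalized center-of-mass distance,
    # as described above. All torch ops, so the result keeps a
    # grad_fn whenever the input requires grad.
    return torch.clip(torch.exp(-(dists_cm_normalized / dist_sigma)), 0, 1)
```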
So, my idea was to simplify things and get at least one example to work (say, the custom_loss with only a fake empty tensor, or a simple loss that is always 0; it does not matter), and once I am sure it works, build on top of it: clean up the code that finds the facing points and distances, and calculate the loss on the minimum distance in a more meaningful way.
I think your loss implementation is problematic. The two operands of the L1 loss come from

`dists_cm = torch.Tensor([dist_cm, dist_cm]).cuda()`

and

`target_dists = torch.Tensor([bbox_avg_size_A, bbox_avg_size_B]).cuda()`

Apparently you are converting some numpy values to torch tensors. These numpy values are detached from the computation graph, so they have no grad_fn. Instead, your helper functions that calculate the distance info and the bbox average size should keep everything as torch tensors (and probably on CUDA as well) throughout. Then the gradient can flow from the loss back to the model.
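For example, a center-of-mass distance written entirely in torch ops stays connected to the graph (a sketch; the function name and shapes are illustrative):

```python
import torch

def center_of_mass_distance(points_a, points_b):
    # points_a, points_b: (N, 3) / (M, 3) tensors that should come
    # from the model's (differentiable) output, not from numpy.
    com_a = points_a.mean(dim=0)
    com_b = points_b.mean(dim=0)
    # torch ops all the way down, so the result carries a grad_fn
    # whenever the inputs require grad.
    return torch.linalg.norm(com_a - com_b)
```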
It makes a lot of sense now. Using your suggestion `custom_loss = F.mse_loss(pred_trans, torch.zeros_like(pred_trans))` I already get

`custom_loss tensor(0.0205, device='cuda:0', grad_fn=<MseLossBackward0>)`

which looks like what I wanted (now I have to fix the pipeline).
The problem is that if I want to compute a loss based on the centers of mass of the two fragments (or on other properties that derive from the transformation the network proposes at that iteration, but are not directly stored in `pred_trans`), I need to find a way to connect the tensors. But I guess if I manage to get the points of the point cloud after transformation using `pred_trans` and then compute the loss on those points, the connection should work anyway, right?
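That transform-then-measure idea can be sketched as follows (all names illustrative; assumes `pred_trans` is a `(3,)` translation that requires grad):

```python
import torch
import torch.nn.functional as F

def transformed_points_loss(points, pred_trans, target_dist):
    # Move the raw points by the predicted translation; because
    # pred_trans is part of the graph, everything downstream is too.
    moved = points + pred_trans                  # (N, 3) + (3,) broadcast
    dist = torch.linalg.norm(moved.mean(dim=0))  # CoM distance from origin
    return F.l1_loss(dist, target_dist)
```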
I will try this (applying predicted transformation and then estimating minimum distance) step by step to keep the flow connected! Thanks, now the problem is clear!
Yes, I'm happy to help and good luck!