Comments (6)
For the simple example you gave at the beginning, it is actually expected to have no grad_fn, because `data_dict['data_id']` itself doesn't require gradients. On the other hand, can you try something like `custom_loss = F.mse_loss(pred_trans, torch.zeros_like(pred_trans))`? In this case, `pred_trans` is a model output prediction, so it should require gradients.
from multi_part_assembly.
Hi, thanks for your interest in our work! The loss should be added in this function, which I believe you have already done. Since you can print it and you do have the `loss_w` in the config file, I suspect there is a bug in your loss implementation that blocks the gradient. Do you mind posting your loss implementation as well?
Hi - thanks for the quick answer.
Yes, I added the loss there. So, I also thought my implementation could be the problem, but even when using a simple function (just creating a tensor and using this as a "fake" loss) the same thing happens.
Short example:
Let me explain: if I add the following in `base_model.py`

```python
def compute_custom_loss(self, data_dict):
    custom_loss = torch.Tensor(len(data_dict['data_id'])).cuda()
    return custom_loss

def _calc_loss(self, out_dict, data_dict):
    ...  # here your code
    custom_loss = self.compute_custom_loss(data_dict)
    loss_dict = {
        'trans_loss': trans_loss,
        'rot_pt_cd_loss': rot_pt_cd_loss,
        'transform_pt_cd_loss': transform_pt_cd_loss,
        'custom_loss': custom_loss,
    }
```
I still get

```
loss: trans_loss, value: tensor([[0.2590, 0.1225]], device='cuda:0', grad_fn=<StackBackward0>)
loss: rot_pt_cd_loss, value: tensor([[0.0046, 0.0063]], device='cuda:0', grad_fn=<StackBackward0>)
loss: transform_pt_cd_loss, value: tensor([[0.0482, 0.0021]], device='cuda:0', grad_fn=<StackBackward0>)
loss: custom_loss, value: tensor([[7.6154e+31, 3.0854e-41]], device='cuda:0')
loss: rot_loss, value: tensor([[0.4776, 0.6484]], device='cuda:0', grad_fn=<StackBackward0>)
loss: rot_pt_l2_loss, value: tensor([[0.0117, 0.1338]], device='cuda:0', grad_fn=<StackBackward0>)
```
So the loss is not attached to any gradient. And since the loss is just one line creating a tensor, there should be no error in it (or at least PyTorch would tell me something?).
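The behaviour can be reproduced in isolation; a minimal, CPU-only sketch of the difference (assuming a recent PyTorch):

```python
import torch

# A tensor built from scratch is a leaf with no autograd history:
# it has no grad_fn, so no gradient can flow back through it.
fake_loss = torch.Tensor(4)          # uninitialized values, detached
print(fake_loss.grad_fn)             # None

# A tensor produced by an operation on something that requires
# grad carries a grad_fn and stays in the computation graph.
pred = torch.randn(4, requires_grad=True)
real_loss = (pred ** 2).mean()
print(real_loss.grad_fn)             # <MeanBackward0 object at 0x...>
```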
Long example:
The whole loss is a bit of a mess at the moment and not really finished; since it does not work, I did not finish the implementation. I was actually going to implement two losses, and this is the first one. The idea of the first one is the following:
- I have a method that calculates the distance between the centers of mass of the two fragments (assuming two pieces for now, to keep things simple) and also selects the points "facing" each other (it does this by creating a virtual cylinder between the two centers of mass and selecting the points within this cylinder)
- once I have the distances, I also get a rough idea of the shape of the two fragments by calculating the min and max along the three axes
- then I calculate the loss using the distance between the centers of mass as the "prediction" and the average size of the fragment (`(min + max) / 2`) as the "target", which should push the transformation to move the fragments apart from each other (but not too far)
Let's assume the two methods to find the facing points, the distance, and the average size are working. If you are interested, I can post the whole code, but I think it would become too long and turn into a separate discussion on how to do this (it is still very inefficient), which may go beyond the scope of this issue. Do let me know if that would be better; maybe I can later open another issue to discuss different kinds of losses.
```python
def find_facing_points_and_distance(self, data_dict, radius=0.5, tolerance_dist=10e5):
    ...
    return facing_points, dist_euclidean, found_something

def get_bbox_avg_size(self, points3D):
    ...
    return avg_size
```
Then I calculate the loss as:
```python
def compute_min_dist_loss(self, facing_info, data_dict, max_loss=1, dist_sigma=10e-5):
    dists_cm = torch.Tensor([dist_cm, dist_cm]).cuda()
    target_dists = torch.Tensor([bbox_avg_size_A, bbox_avg_size_B]).cuda()
    loss_fn = torch.nn.L1Loss(reduction='none')  # 'reduce=False' is deprecated
    loss_value = loss_fn(dists_cm, target_dists)
    return loss_value
```
Even using `torch.nn.L1Loss` from PyTorch here, I still do not get any gradient. Initially, my idea was to normalize the distance of the centers of mass (`dists_cm`) and use an exponential function to compute the loss, something like `torch.clip(torch.exp(-(dists_cm_normalized / dist_sigma)), 0, 1)`, where `dist_sigma` is a parameter depending on the size of the objects.
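That exponential variant could look like the following sketch (illustrative function name; `dists_cm_normalized` is assumed to be a tensor already attached to the graph):

```python
import torch

def exp_dist_loss(dists_cm_normalized, dist_sigma):
    # Clipped exponential of the normalized center-of-mass distance,
    # as described above. All torch ops, so the result keeps a
    # grad_fn whenever the input requires grad.
    return torch.clip(torch.exp(-(dists_cm_normalized / dist_sigma)), 0, 1)
```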
So, my idea was to simplify things and get at least one example to work (say, the custom_loss with only a fake empty tensor, or a simple loss that is always 0; it does not matter), and once I am sure it works, build on top of it: clean up the code that finds the facing points and distances, and calculate the loss on the minimum distance in a more meaningful way.
I think your loss implementation is problematic. The two operands of the L1 loss come from

`dists_cm = torch.Tensor([dist_cm, dist_cm]).cuda()`

and

`target_dists = torch.Tensor([bbox_avg_size_A, bbox_avg_size_B]).cuda()`

Apparently you are converting some numpy values to torch tensors. These numpy values are detached from the computation graph, so they have no grad_fn. Instead, your helper functions that calculate the distance info and the bbox average size should keep everything as torch tensors (and probably on CUDA as well) throughout. Then the gradient can flow from the loss back to the model.
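For example, a center-of-mass distance written entirely in torch ops stays connected to the graph (a sketch; the function name and shapes are illustrative):

```python
import torch

def center_of_mass_distance(points_a, points_b):
    # points_a, points_b: (N, 3) / (M, 3) tensors that should come
    # from the model's (differentiable) output, not from numpy.
    com_a = points_a.mean(dim=0)
    com_b = points_b.mean(dim=0)
    # torch ops all the way down, so the result carries a grad_fn
    # whenever the inputs require grad.
    return torch.linalg.norm(com_a - com_b)
```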
It makes a lot of sense now. Using your suggestion `custom_loss = F.mse_loss(pred_trans, torch.zeros_like(pred_trans))` I already get

`custom_loss tensor(0.0205, device='cuda:0', grad_fn=<MseLossBackward0>)`

which looks like what I wanted (now I have to fix the pipeline).
The problem is that if I want to compute a loss based on the centers of mass of the two fragments (or on other properties that derive from the transformation the network proposes at that iteration, but are not directly stored in `pred_trans`), I need to find a way to connect the tensors. But I guess if I manage to get the points of the point cloud after transformation using `pred_trans` and then compute the loss on those points, the connection should work anyway, right?
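That transform-then-measure idea can be sketched as follows (all names illustrative; assumes `pred_trans` is a `(3,)` translation that requires grad):

```python
import torch
import torch.nn.functional as F

def transformed_points_loss(points, pred_trans, target_dist):
    # Move the raw points by the predicted translation; because
    # pred_trans is part of the graph, everything downstream is too.
    moved = points + pred_trans                  # (N, 3) + (3,) broadcast
    dist = torch.linalg.norm(moved.mean(dim=0))  # CoM distance from origin
    return F.l1_loss(dist, target_dist)
```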
I will try this (applying predicted transformation and then estimating minimum distance) step by step to keep the flow connected! Thanks, now the problem is clear!
Yes, I'm happy to help and good luck!