
Comments (5)

m-evdokimov commented on June 16, 2024

In the approximate joint training method you train both the RPN and the detection head simultaneously. The point is that you don't pass gradients from the detection head back to the RPN through the proposal coordinates.
So you need to detach the RPN output from the computational graph (simply rpn_output.detach() in PyTorch) and pass it to the detection head. If you don't detach the output, it becomes the non-approximate joint training method.
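
A runnable toy sketch of that single forward-backward step, using plain PyTorch plus torchvision's roi_pool (the toy modules and shapes are mine for illustration, not the repo's actual API):

    import torch
    import torch.nn as nn
    from torchvision.ops import roi_pool

    # Toy stand-ins for the real networks.
    backbone = nn.Conv2d(3, 8, 3, padding=1)
    detection_head = nn.Linear(8 * 7 * 7, 4)

    images = torch.randn(1, 3, 64, 64)
    features = backbone(images)  # shared feature map

    # Pretend these boxes came out of the RPN, so they carry gradient history.
    # Box format: (batch_index, x1, y1, x2, y2).
    rpn_output = torch.tensor([[0., 0., 0., 31., 31.]], requires_grad=True)

    # Approximate joint training: detach the proposals, so no gradient is
    # taken w.r.t. the box coordinates, only w.r.t. the feature map.
    pooled = roi_pool(features, rpn_output.detach(), output_size=(7, 7))
    loss = detection_head(pooled.flatten(1)).sum()
    loss.backward()

    print(rpn_output.grad)                   # None: nothing flows into the coordinates
    print(backbone.weight.grad is not None)  # True: gradients still reach the backbone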


sanhai77 commented on June 16, 2024

OK, we use rpn_output.detach(). But why?
Is it even possible to differentiate roi() w.r.t. the coordinates?

Does d(roi(feature_map, RoIs))/d{x1, y1, x2, y2} exist?

I mean the cropping part of RoI pooling:

does d(feature_map[x1:x2, y1:y2])/d{x1, y1, x2, y2} exist?


m-evdokimov commented on June 16, 2024

OK, we use rpn_output.detach(). But why?

If the RPN output is detached, you don't propagate gradients from the detection head to the RPN. In that case the detection head is just a function of the crops (not of the whole input image and the anchor box parameters), which is exactly what the approximate joint method does. You can think of it as taking your image dataset, extracting and caching the crops produced by the RPN once, and then training the detection head on them.

Is it even possible to differentiate roi() w.r.t. the coordinates?

Yes, it's possible. The main reason the detection head and the RPN were trained "separately" in the paper is, I assume, a lack of computational resources.
Nowadays we can train all parts of such models at the same time, which is intuitively better.
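
For intuition, here is a self-contained sketch (my own toy code, not anything from the paper or the repo) of an RoIAlign-style bilinear crop whose derivative w.r.t. the box coordinates does exist:

    import torch
    import torch.nn.functional as F

    def differentiable_crop(feature_map, box, out_size=7):
        # Bilinearly sample an out_size x out_size crop. box = (x1, y1, x2, y2)
        # in grid_sample's normalized [-1, 1] coordinates. Differentiable w.r.t.
        # both the feature map and the box coordinates.
        x1, y1, x2, y2 = box
        t = torch.linspace(0, 1, out_size)
        grid_y, grid_x = torch.meshgrid(y1 + (y2 - y1) * t,
                                        x1 + (x2 - x1) * t, indexing="ij")
        grid = torch.stack([grid_x, grid_y], dim=-1).unsqueeze(0)  # (1, out, out, 2)
        return F.grid_sample(feature_map, grid, align_corners=True)

    feature_map = torch.randn(1, 8, 32, 32)
    box = torch.tensor([-0.5, -0.5, 0.5, 0.5], requires_grad=True)
    crop = differentiable_crop(feature_map, box)
    crop.sum().backward()
    print(box.grad)  # finite, generally non-zero: d(crop)/d{x1, y1, x2, y2} exists

This is in principle what bilinear sampling (as in RoI Align) allows; whether a given implementation actually propagates gradients into the boxes depends on the implementation.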


sanhai77 commented on June 16, 2024

I apologize for my many questions, but I am confused and I haven't been able to find the answer anywhere. RoI pooling involves non-differentiable operations such as indexing (quantizing a coordinate like 3.5 to the integer 3). So why do we detach the proposals during backpropagation? How would gradients flow from the detector back into the RPN and the feature extraction network in the first place? I don't understand: detaching the proposals seems unnecessary when gradients can't flow from the RoI pooling layer to the RPN head and are stopped automatically.
On the other hand, unlike RoI Align, the outputs of RoI pooling are not directly related to the coordinates (the proposals). (Actually, I could not find a mathematical relation between the RoI output and the coordinate inputs, i.e. between the RoI pooling outputs and {x1, y1, x2, y2}.) So again, isn't detaching the proposals unnecessary when there is no relationship between the RoI pooling output and the coordinate inputs? If d(roi_pool_outputs)/d{x1, y1, x2, y2} doesn't even exist, why should we detach {x1, y1, x2, y2} to make them constants?

I'm really confused.


m-evdokimov commented on June 16, 2024

The trick is that in the joint training method you don't take derivatives w.r.t. the coordinates coming from the RPN.

Actually there are two ways to train Faster R-CNN:
a) Train the RPN and the detection head separately. Going back to the days when people mostly didn't have enough computational resources to train both parts in parallel, the recipe was simple: train the RPN on its own, then extract crops from the training data using the proposals predicted by the pretrained RPN, and finally train the detection head on those crops. Detaching the RPN output is just a way to simulate this separate training of the two parts in a single forward-backward step.
b) Train all parts of the model at the same time. In this method the detection head output becomes a function of the input image (compared to method a, where you have two separate functions, one of the input image and one of the crops of the feature map). In the part of the model where you crop the RPN output, you don't take gradients w.r.t. the coordinates of the crops. You can think of this operation as a simple element-wise multiplication of the feature map with a binary mask where 1 marks the pixels of the crop (see the sketch below). This trick keeps gradients flowing from the detection head back toward the RPN.
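
A tiny sketch of the binary-mask picture from (b), in plain PyTorch (the names are mine): the mask depends on the coordinates but is treated as a constant, so gradients reach the shared feature map and keep flowing toward the network that produced it, while the coordinates themselves get none.

    import torch

    feature_map = torch.randn(1, 1, 8, 8, requires_grad=True)
    x1, y1, x2, y2 = 2, 2, 6, 6        # integer box: baked into the mask, no gradient

    mask = torch.zeros_like(feature_map)
    mask[..., y1:y2, x1:x2] = 1.0      # 1 inside the crop, 0 outside

    crop = feature_map * mask          # "cropping" as element-wise multiplication
    crop.sum().backward()

    # Gradient lands on the feature map only inside the crop region, so it can
    # continue into whatever network produced the feature map.
    print(feature_map.grad[0, 0])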

