Comments (5)
In the approximate joint training method you train both the RPN and the detection head simultaneously. The point is that you don't pass gradients from the detection head to the RPN.
In that case you need to detach the RPN output from the computational graph (simply rpn_output.detach() in PyTorch) and pass it to the detection head. If you don't detach the output, it becomes the non-approximate joint training method.
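A minimal sketch of what the detach does to the gradients, assuming two hypothetical stand-in modules (`toy_rpn` and `toy_head` are illustrative names, not the repository's classes, and the toy head consumes the box coordinates directly instead of going through RoI pooling):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins, not the repository's modules.
toy_rpn = nn.Linear(8, 4)    # features -> (x1, y1, x2, y2)
toy_head = nn.Linear(4, 2)   # box -> class scores

features = torch.randn(3, 8)

# Approximate joint training: the proposals are detached, so the head
# treats them as constants and nothing flows back into the RPN.
proposals = toy_rpn(features).detach()
toy_head(proposals).sum().backward()
rpn_grad_detached = toy_rpn.weight.grad            # stays None
head_grad_detached = toy_head.weight.grad.clone()  # the head still gets gradients

# Without detach, the head loss also produces gradients for the RPN weights.
toy_head(toy_rpn(features)).sum().backward()
rpn_grad_attached = toy_rpn.weight.grad.clone()

print(rpn_grad_detached is None)
print(rpn_grad_attached.abs().sum() > 0)
```

In a real Faster R-CNN the RPN would still receive gradients from its own loss; the sketch only isolates the path that goes through the proposals.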
from simple-faster-rcnn-pytorch.
OK, we use rpn_output.detach(), but why?
Is it possible to differentiate roi() w.r.t. the coordinates?
Does d(roi(feature_map, rois))/d{x1,y1,x2,y2} exist?
I mean the cropping part of RoI pooling.
Does d(feature_map[x1:x2, y1:y2])/d{x1,y1,x2,y2} exist?
from simple-faster-rcnn-pytorch.
OK, we use rpn_output.detach(), but why?
If the RPN output is detached, you don't propagate gradients from the detection head to the RPN. That way the detection head is just a function of the crops (not of the whole input image and the anchor box parameters), which is what the approximate joint method does. You can think of it as taking your image dataset, extracting and caching the crops made by the RPN once, and then training the detection head on them.
Is it possible to differentiate roi() w.r.t. the coordinates?
Yes, it's possible. I assume the main reason the detection head and the RPN were trained "separately" in the paper is a lack of computational resources.
Nowadays we can train all parts of such models at the same time, which is intuitively better.
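Whether the derivative w.r.t. a coordinate exists depends on how the crop is read from the feature map. The sketch below is an illustration under stated assumptions, not the repository's RoI pooling code: it compares a quantized integer lookup with a bilinear-style interpolation (the idea used by RoI align) on a single x coordinate. The names `feature_map`, `hard` and `soft` are hypothetical.

```python
import torch

# Row 0 of this toy feature map is [0, 1, 2, 3].
feature_map = torch.arange(16.0).reshape(4, 4)
x = torch.tensor(1.5, requires_grad=True)  # a fractional box coordinate

# Quantized lookup: converting to an integer index leaves the autograd graph,
# so the result has no gradient path back to x.
x0 = x.floor().long()
hard = feature_map[0, x0]
hard_has_grad_path = hard.requires_grad  # False

# Linear interpolation between the two neighbouring cells keeps x in the graph.
w = x - x0
soft = (1 - w) * feature_map[0, x0] + w * feature_map[0, x0 + 1]
soft.backward()
soft_grad = x.grad  # feature_map[0, 2] - feature_map[0, 1]

print(hard_has_grad_path, soft_grad.item())
```

With the quantized lookup the coordinate never enters the graph, while the interpolated read gives a gradient equal to the difference of the two neighbouring feature values.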
from simple-faster-rcnn-pytorch.
I apologize for my many questions, but I am confused and could not find an answer in my research.
RoI pooling involves non-differentiable operations such as indexing (quantizing a coordinate like 3.5 to the integer 3). So why do we detach the proposals during backpropagation? How would the gradients flow from the detector back into the RPN and the feature extraction network? I don't understand why detaching the proposals is necessary when gradients cannot flow from the RoI pooling layer to the RPN head and are stopped automatically.
On the other hand, unlike RoI align, the outputs of RoI pooling are not directly related to the coordinates (proposals). (Actually, I did not find a mathematical relation between roi_output and the input coordinates, i.e. between the RoI pooling outputs and {x1,y1,x2,y2}.)
So again, detaching the proposals seems unnecessary when there is no relationship between the RoI pooling output and the coordinate inputs.
If d(roi_pool_outputs)/d{x1,y1,x2,y2} does not even exist, why should we detach {x1,y1,x2,y2} to make them constants?
I am really confused.
from simple-faster-rcnn-pytorch.
The trick is that in the joint training method you don't get derivatives w.r.t. the coordinates from the RPN.
Actually, there are two ways to train Faster R-CNN:
a) Train the RPN and the detection head separately. Back in the days when most people didn't have enough computational resources to train both parts in parallel, the recipe was simple: train the RPN alone, then extract from the training data the crops predicted by the pretrained RPN, and finally train the detection head on the extracted crops. Detaching the RPN output is just a way to simulate this separate training of both parts in a single forward-backward step.
b) Train all parts of the model at the same time. In this method the detection head output becomes a function of the input image (compared to method a, where you have two separate functions, one of the input image and one of the crops of the feature map). In the part of the model where you make crops from the RPN output, you don't take gradients w.r.t. the coordinates of the crops. You can think of this operation as a simple element-wise multiplication of the feature map and a binary mask, where 1 marks the pixels of the crop. This trick lets gradients flow from the detection head to the RPN.
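The binary-mask view of cropping in b) can be sketched as follows. This is a toy illustration, assuming a single-channel 4x4 feature map and one hypothetical box; it is not the repository's RoI pooling layer.

```python
import torch

# Toy single-channel feature map; in a real model it comes from the backbone.
feature_map = torch.ones(4, 4, requires_grad=True)
box = torch.tensor([1.0, 1.0, 3.0, 3.0], requires_grad=True)  # x1, y1, x2, y2

# Building the mask needs integer indices, which live outside the autograd graph.
x1, y1, x2, y2 = [int(v) for v in box.detach().round()]
mask = torch.zeros(4, 4)
mask[y1:y2, x1:x2] = 1.0

# Cropping as element-wise multiplication with the binary mask.
crop = feature_map * mask
crop.sum().backward()

print(feature_map.grad)  # equals the mask: gradient reaches only the cropped pixels
print(box.grad)          # None: no gradient w.r.t. the coordinates
```

The gradient reaches the feature map pixels inside the crop, and through them whatever layers produced those features, while the box coordinates receive no gradient.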
from simple-faster-rcnn-pytorch.