alexppppp / keypoint_rcnn_training_pytorch
How to Train a Custom Keypoint Detection Model with PyTorch (Article on Medium)
License: MIT License
First off, thanks for the tutorial! You're a real lifesaver. I'm trying to get evaluate to work at the moment. I have num_keypoints=4. The model trains smoothly when I just comment out evaluate during training, but I need to evaluate the performance now, so I'm trying to get that to work. I did what you instructed in your tutorial:
Update. It's possible not to edit the pycocotools/cocoeval.py file in the pycocotools library to change kpt_oks_sigmas, but to edit the coco_eval.py file instead, as Diogo Santiago suggested:
# self.coco_eval[iou_type] = COCOeval(coco_gt, iouType=iou_type)
coco_eval = COCOeval(coco_gt, iouType=iou_type)
coco_eval.params.kpt_oks_sigmas = np.array([.5, .5]) / 10.0
self.coco_eval[iou_type] = coco_eval
Since I have num_keypoints=4, I instead wrote np.array([.5, .5, .5, .5]) / 10.0 for kpt_oks_sigmas.
But I'm still getting the same error: ValueError: operands could not be broadcast together with shapes (4,) (17,)
Hoping you can help me with this!
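For readers hitting the same message: pycocotools ships a default kpt_oks_sigmas with 17 entries (the COCO person keypoints), so a (17,) in the error suggests the default is still in effect when the evaluation runs. The sketch below is a simplified, pure-Python version of the per-object OKS term, written only to show why the number of sigmas has to equal num_keypoints. The helper names make_sigmas and oks are hypothetical, and the real library code also handles several cases that are omitted here.

```python
import math


def make_sigmas(num_keypoints, sigma=0.05):
    """One sigma per keypoint; 0.05 mirrors the .5 / 10.0 used above."""
    return [sigma] * num_keypoints


def oks(gt_xy, dt_xy, visibility, area, sigmas):
    """Simplified object keypoint similarity for a single object.

    gt_xy, dt_xy: lists of (x, y) pairs, one per keypoint.
    visibility:   list of flags, > 0 means the keypoint is labelled.
    """
    if not (len(gt_xy) == len(dt_xy) == len(visibility) == len(sigmas)):
        raise ValueError(
            f"length mismatch: {len(gt_xy)} keypoints vs {len(sigmas)} sigmas"
        )
    terms = []
    for (xg, yg), (xd, yd), v, s in zip(gt_xy, dt_xy, visibility, sigmas):
        if v <= 0:
            continue  # unlabelled keypoints do not contribute
        var = (2 * s) ** 2
        # Small epsilon guards against a zero area (an assumption of this sketch).
        e = ((xd - xg) ** 2 + (yd - yg) ** 2) / var / (area + 1e-12) / 2
        terms.append(math.exp(-e))
    # Simplification: return 0.0 when no keypoint is labelled.
    return sum(terms) / len(terms) if terms else 0.0


gt = [(10, 10), (20, 10), (20, 20), (10, 20)]
perfect = oks(gt, gt, [2, 2, 2, 2], area=100.0, sigmas=make_sigmas(4))
```

With four keypoints and four sigmas the call succeeds; passing make_sigmas(17) instead raises the length-mismatch error, which is the pure-Python analogue of the broadcast failure above.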
Could you add an open license, such as the MIT or Apache license, to enable use in open-source projects and elsewhere?
What software did you use to produce the annotations JSON file for glue_tubes_keypoints_dataset_134imgs?
Hello,
I appreciate the excellent work. I have been using the model for research, but I am struggling with high latency when running inference. I am trying to reach production-level speed during testing (a good target would be around 1 s per image). It would be helpful to understand how to speed up prediction.
Thank you
Which labeling tool should I use to prepare a custom dataset? Please help. Thank you in advance.
Hi, I trained a model on custom data, but now I am unable to load that custom model (.pth) for prediction.
Could you please provide an inference example, basically the method to load the model back?
def get_model(num_keypoints, weights_path=None):
    anchor_generator = AnchorGenerator(sizes=(32, 64, 128, 256, 512), aspect_ratios=(0.25, 0.5, 0.75, 1.0, 2.0, 3.0, 4.0))
    model = torchvision.models.detection.keypointrcnn_resnet50_fpn(pretrained=False,
                                                                   pretrained_backbone=True,
                                                                   num_keypoints=num_keypoints,  # 4 in my case
                                                                   num_classes=3,  # Background is the first class, object is the second class
                                                                   rpn_anchor_generator=anchor_generator)
    if weights_path:
        state_dict = torch.load(weights_path)
        model.load_state_dict(state_dict)
    return model
torch.save(model.state_dict(), 'custom_keypointsrcnn_weights.pth')
import torch
import torchvision
from torchvision.models.detection.rpn import AnchorGenerator
anchor_generator = AnchorGenerator(sizes=(32, 64, 128, 256, 512), aspect_ratios=(0.25, 0.5, 0.75, 1.0, 2.0, 3.0, 4.0))
m = torchvision.models.detection.keypointrcnn_resnet50_fpn(pretrained=False,pretrained_backbone=False,num_keypoints=4,num_classes = 3, rpn_anchor_generator=anchor_generator)
m.load_state_dict(torch.load("custom_keypointsrcnn_weights.pth"))
Traceback (most recent call last):
File "", line 1, in
File "C:\Python39\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for KeypointRCNN:
Missing key(s) in state_dict: "backbone.fpn.inner_blocks.0.weight", "backbone.fpn.inner_blocks.0.bias", "backbone.fpn.inner_blocks.1.weight", "backbone.fpn.inner_blocks.1.bias", "backbone.fpn.inner_blocks.2.weight", "backbone.fpn.inner_blocks.2.bias", "backbone.fpn.inner_blocks.3.weight", "backbone.fpn.inner_blocks.3.bias", "backbone.fpn.layer_blocks.0.weight", "backbone.fpn.layer_blocks.0.bias", "backbone.fpn.layer_blocks.1.weight", "backbone.fpn.layer_blocks.1.bias", "backbone.fpn.layer_blocks.2.weight", "backbone.fpn.layer_blocks.2.bias", "backbone.fpn.layer_blocks.3.weight", "backbone.fpn.layer_blocks.3.bias", "rpn.head.conv.weight", "rpn.head.conv.bias".
Unexpected key(s) in state_dict: "backbone.fpn.inner_blocks.0.0.weight", "backbone.fpn.inner_blocks.0.0.bias", "backbone.fpn.inner_blocks.1.0.weight", "backbone.fpn.inner_blocks.1.0.bias", "backbone.fpn.inner_blocks.2.0.weight", "backbone.fpn.inner_blocks.2.0.bias", "backbone.fpn.inner_blocks.3.0.weight", "backbone.fpn.inner_blocks.3.0.bias", "backbone.fpn.layer_blocks.0.0.weight", "backbone.fpn.layer_blocks.0.0.bias", "backbone.fpn.layer_blocks.1.0.weight", "backbone.fpn.layer_blocks.1.0.bias", "backbone.fpn.layer_blocks.2.0.weight", "backbone.fpn.layer_blocks.2.0.bias", "backbone.fpn.layer_blocks.3.0.weight", "backbone.fpn.layer_blocks.3.0.bias", "rpn.head.conv.0.0.weight", "rpn.head.conv.0.0.bias".
Traceback (most recent call last):
File "D:\Projects\custom_KP\inference_kp.py", line 47, in
model = get_model(4,'custom_keypointsrcnn_weights.pth')
File "D:\Projects\custom_KP\inference_kp.py", line 35, in get_model
model.load_state_dict(state_dict)
File "C:\Python39\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for KeypointRCNN:
Missing key(s) in state_dict: "backbone.fpn.inner_blocks.0.weight", "backbone.fpn.inner_blocks.0.bias", "backbone.fpn.inner_blocks.1.weight", "backbone.fpn.inner_blocks.1.bias", "backbone.fpn.inner_blocks.2.weight", "backbone.fpn.inner_blocks.2.bias", "backbone.fpn.inner_blocks.3.weight", "backbone.fpn.inner_blocks.3.bias", "backbone.fpn.layer_blocks.0.weight", "backbone.fpn.layer_blocks.0.bias", "backbone.fpn.layer_blocks.1.weight", "backbone.fpn.layer_blocks.1.bias", "backbone.fpn.layer_blocks.2.weight", "backbone.fpn.layer_blocks.2.bias", "backbone.fpn.layer_blocks.3.weight", "backbone.fpn.layer_blocks.3.bias", "rpn.head.conv.weight", "rpn.head.conv.bias".
Unexpected key(s) in state_dict: "backbone.fpn.inner_blocks.0.0.weight", "backbone.fpn.inner_blocks.0.0.bias", "backbone.fpn.inner_blocks.1.0.weight", "backbone.fpn.inner_blocks.1.0.bias", "backbone.fpn.inner_blocks.2.0.weight", "backbone.fpn.inner_blocks.2.0.bias", "backbone.fpn.inner_blocks.3.0.weight", "backbone.fpn.inner_blocks.3.0.bias", "backbone.fpn.layer_blocks.0.0.weight", "backbone.fpn.layer_blocks.0.0.bias", "backbone.fpn.layer_blocks.1.0.weight", "backbone.fpn.layer_blocks.1.0.bias", "backbone.fpn.layer_blocks.2.0.weight", "backbone.fpn.layer_blocks.2.0.bias", "backbone.fpn.layer_blocks.3.0.weight", "backbone.fpn.layer_blocks.3.0.bias", "rpn.head.conv.0.0.weight", "rpn.head.conv.0.0.bias".
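A note on reading this error: the unexpected keys differ from the missing ones only by an extra ".0" level (for example inner_blocks.0.0.weight versus inner_blocks.0.weight). That pattern is what one would expect if the weights were saved under a different torchvision version than the one loading them, so installing the same version in both environments is probably the more robust fix. If that is not possible, the keys can be renamed before loading. The sketch below is pure Python and assumes that the layers named in the traceback are the only ones affected. The helper names unwrap_key, remap_state_dict and diff_keys are hypothetical.

```python
import re

# Patterns derived from the key names in the traceback above; it is an
# assumption of this sketch that no other layers are affected.
_WRAPPED = re.compile(
    r"^(backbone\.fpn\.(?:inner|layer)_blocks\.\d+)\.0\.(weight|bias)$"
    r"|^(rpn\.head\.conv)\.0\.0\.(weight|bias)$"
)


def unwrap_key(key):
    """Drop the extra '.0' nesting level from a checkpoint key, if present."""
    m = _WRAPPED.match(key)
    if m is None:
        return key
    if m.group(1):
        return f"{m.group(1)}.{m.group(2)}"
    return f"{m.group(3)}.{m.group(4)}"


def remap_state_dict(state_dict):
    """Return a copy of the state dict with the affected keys renamed."""
    return {unwrap_key(k): v for k, v in state_dict.items()}


def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, like load_state_dict reports."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return sorted(model_keys - checkpoint_keys), sorted(checkpoint_keys - model_keys)


# Possible usage (not executed here, since it needs torch and the weights file):
# state_dict = remap_state_dict(torch.load(weights_path, map_location="cpu"))
# model.load_state_dict(state_dict)
```

Calling diff_keys(model.state_dict().keys(), state_dict.keys()) after the remap should return two empty lists if the renaming covered everything.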
Thank you for your work.
Which labeling and annotation tool do you use?
Hi there,
What's the name of the annotation format of the dataset? And/or do you have a converter to COCO format (a single JSON file with all annotations per set)?
Many thanks
I ran your notebook in Google Colab exactly as shown and am getting the following error. Can you help?
ValueError: operands could not be broadcast together with shapes (2,) (17,)
Hi, first of all, thank you for sharing your excellent work. I tried to modify the code to train multi-class detection.
Can this line be changed to use my own classes? Thank you.
bboxes_labels_original = ['Glue tube' for _ in bboxes_original]
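One way this could look for several classes is sketched below. It assumes each bounding box comes with a class name, which the annotation files would have to provide. The class "Cap" and the helper names_to_label_ids are hypothetical, and index 0 is kept for the background.

```python
# Hypothetical class list; index 0 is reserved for the background class.
CLASS_NAMES = ["__background__", "Glue tube", "Cap"]
CLASS_TO_ID = {name: i for i, name in enumerate(CLASS_NAMES)}


def names_to_label_ids(names):
    """Convert per-box class names to the integer labels used in the target."""
    return [CLASS_TO_ID[name] for name in names]


# One class name per bounding box instead of a constant 'Glue tube'.
bboxes_labels_original = ["Glue tube", "Cap", "Glue tube"]
label_ids = names_to_label_ids(bboxes_labels_original)
```

The model would then be built with num_classes=len(CLASS_NAMES), and label_ids would be converted to an int64 tensor for target["labels"].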
Congratulations on this work; it is a nice project. I followed your instructions for training, and it works when I have two keypoints. However, with 10 keypoints it did not work when not all of the keypoints were marked in every image. How can I handle this?
Hi, @alexppppp.
I got this error when training with this code.
I used my own dataset, which has 6 keypoints and 1 box per image.
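A common approach to partially labelled objects is to keep the keypoint array at a fixed length and give unmarked keypoints a visibility of 0. As far as I know, torchvision's keypoint loss skips keypoints whose visibility is 0, but that is worth verifying for the installed version. The sketch below assumes the labelled points are available as a name-to-coordinates mapping. The keypoint names and the helper pad_keypoints are hypothetical.

```python
# Hypothetical keypoint names; what matters is a fixed order and count.
KEYPOINT_NAMES = [f"kp{i}" for i in range(10)]


def pad_keypoints(labelled, keypoint_names=KEYPOINT_NAMES):
    """Return [x, y, visibility] for every keypoint, in a fixed order.

    labelled: dict mapping keypoint name -> (x, y) for the marked points.
    Unmarked keypoints become [0, 0, 0], i.e. visibility 0.
    """
    padded = []
    for name in keypoint_names:
        if name in labelled:
            x, y = labelled[name]
            padded.append([x, y, 1])
        else:
            padded.append([0, 0, 0])
    return padded


# An object where only two of the ten keypoints were marked.
obj_keypoints = pad_keypoints({"kp0": (12, 30), "kp3": (40, 52)})
```

Every object then contributes a list of exactly 10 triples, so the keypoints tensor keeps the shape [num_objects, 10, 3] that the model was configured for.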
Epoch: [0] [ 0/56] eta: 0:41:06 lr: 0.000019 loss: 9.5321 (9.5321) loss_classifier: 0.7595 (0.7595) loss_box_reg: 0.0000 (0.0000) loss_keypoint: 8.0790 (8.0790) loss_objectness: 0.6910 (0.6910) loss_rpn_box_reg: 0.0026 (0.0026) time: 44.0517 data: 0.0325
Epoch: [0] [55/56] eta: 0:00:44 lr: 0.001000 loss: 7.7045 (8.4640) loss_classifier: 0.0419 (0.2365) loss_box_reg: 0.0010 (0.0107) loss_keypoint: 7.2650 (7.6186) loss_objectness: 0.4400 (0.5934) loss_rpn_box_reg: 0.0027 (0.0048) time: 42.6397 data: 0.0408
Epoch: [0] Total time: 0:41:48 (44.7877 s / it)
creating index...
index created!
[W ParallelNative.cpp:214] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
ValueError Traceback (most recent call last)
Input In [17], in
21 train_one_epoch(model, optimizer, data_loader_train, device, epoch, print_freq=1000)
22 lr_scheduler.step()
---> 23 evaluate(model, data_loader_test, device)
25 # Save model weights after training
26 torch.save(model.state_dict(), './weights/nz_krcnn_res50.pth')
File /usr/local/lib/python3.8/site-packages/torch/autograd/grad_mode.py:28, in _DecoratorContextManager.__call__.<locals>.decorate_context(*args, **kwargs)
25 @functools.wraps(func)
26 def decorate_context(*args, **kwargs):
27 with self.__class__():
---> 28 return func(*args, **kwargs)
File ~/Music/dataZ/meca_nz_krcnn/engine.py:102, in evaluate(model, data_loader, device)
100 res = {target["image_id"].item(): output for target, output in zip(targets, outputs)}
101 evaluator_time = time.time()
--> 102 coco_evaluator.update(res)
103 evaluator_time = time.time() - evaluator_time
104 metric_logger.update(model_time=model_time, evaluator_time=evaluator_time)
File ~/Music/dataZ/meca_nz_krcnn/coco_eval.py:43, in CocoEvaluator.update(self, predictions)
41 coco_eval.cocoDt = coco_dt
42 coco_eval.params.imgIds = list(img_ids)
---> 43 img_ids, eval_imgs = evaluate(coco_eval)
45 self.eval_imgs[iou_type].append(eval_imgs)
File ~/Music/dataZ/meca_nz_krcnn/coco_eval.py:194, in evaluate(imgs)
192 def evaluate(imgs):
193 with redirect_stdout(io.StringIO()):
--> 194 imgs.evaluate()
195 return imgs.params.imgIds, np.asarray(imgs.evalImgs).reshape(-1, len(imgs.params.areaRng), len(imgs.params.imgIds))
File /usr/local/lib/python3.8/site-packages/pycocotools/cocoeval.py:148, in COCOeval.evaluate(self)
146 elif p.iouType == 'keypoints':
147 computeIoU = self.computeOks
--> 148 self.ious = {(imgId, catId): computeIoU(imgId, catId)
149 for imgId in p.imgIds
150 for catId in catIds}
152 evaluateImg = self.evaluateImg
153 maxDet = p.maxDets[-1]
File /usr/local/lib/python3.8/site-packages/pycocotools/cocoeval.py:148, in (.0)
146 elif p.iouType == 'keypoints':
147 computeIoU = self.computeOks
--> 148 self.ious = {(imgId, catId): computeIoU(imgId, catId)
149 for imgId in p.imgIds
150 for catId in catIds}
152 evaluateImg = self.evaluateImg
153 maxDet = p.maxDets[-1]
File /usr/local/lib/python3.8/site-packages/pycocotools/cocoeval.py:229, in COCOeval.computeOks(self, imgId, catId)
227 dx = np.max((z, x0-xd),axis=0)+np.max((z, xd-x1),axis=0)
228 dy = np.max((z, y0-yd),axis=0)+np.max((z, yd-y1),axis=0)
--> 229 e = (dx**2 + dy**2) / vars / (gt['area']+np.spacing(1)) / 2
230 if k1 > 0:
231 e=e[vg > 0]
ValueError: operands could not be broadcast together with shapes (6,) (2,)
What am I doing wrong?
Thanks in advance.
Best,
@bemoregt.
I used the glue tubes dataset and a custom dataset, but ran into the same problem:
File "D:/program/keypoint_rcnn_training_pytorch-main/trainer.py", line 164, in
train_one_epoch(model, optimizer, data_loader_train, device, epoch, print_freq=1000)
File "D:\program\keypoint_rcnn_training_pytorch-main\engine.py", line 31, in train_one_epoch
loss_dict = model(images, targets)
File "E:\python38\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "E:\python38\lib\site-packages\torchvision\models\detection\generalized_rcnn.py", line 99, in forward
detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
File "E:\python38\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "E:\python38\lib\site-packages\torchvision\models\detection\roi_heads.py", line 740, in forward
assert t["labels"].dtype == torch.int64, "target labels must of int64 type"
KeyError: 'labels'
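The KeyError means that at least one target dictionary reaching the model has no "labels" entry. A small check run before model(images, targets) can point at the offending sample. The sketch below is pure Python. The helper names are hypothetical, and the list of required keys is an assumption based on the fields this training code passes to the model.

```python
# Keys assumed to be required in every training target (an assumption of
# this sketch, based on the boxes, labels and keypoints used in training).
REQUIRED_TARGET_KEYS = ("boxes", "labels", "keypoints")


def missing_target_keys(target, required=REQUIRED_TARGET_KEYS):
    """Return the required keys that are absent from one target dict."""
    return [key for key in required if key not in target]


def check_targets(targets):
    """Raise a descriptive KeyError for the first incomplete target."""
    for index, target in enumerate(targets):
        missing = missing_target_keys(target)
        if missing:
            raise KeyError(f"target {index} is missing keys: {missing}")
```

Calling check_targets(targets) inside the training loop, just before the forward pass, would show which batch element lacks the key and so which part of the dataset class to inspect.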
Hi, first of all, thank you for sharing your great work. I am trying to use a modified version of your code to train on a custom dataset with a different number of keypoints (6), for 5 epochs as in your example. I modified ClassDataset and the evaluation code (kpt_oks_sigmas for 6 keypoints).
I obtain a model that detects the object (bounding box) well, but the keypoints are not well located. For example:
These images show that the detection is being trained but the keypoints are not. Moreover, in the evaluation phase I always get values equal to zero:
IoU metric: keypoints
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = -1.000
I have reviewed the dataset and it seems well annotated, so I don't know where the errors are. Any ideas? Thank you in advance.
Hi,
First, thanks for your really nice tutorial!
I tried to reproduce it and got an error at the call to the evaluate function, at the end of the training loop:
assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds()))
'Results do not correspond to current coco set'
After exploring, I managed to fix it by replacing line 100 in the engine.py script.
I replaced
res = {target["image_id"]: output for target, output in zip(targets, outputs)}
with the line
res = {(target["image_id"]).tolist()[0]: output for target, output in zip(targets, outputs)}
The new line casts the image_id (a tensor, due to the ClassDataset definition) to an int, to match the IDs in the coco_evaluator.
I'm not sure this is the best way to fix the issue, but it might be enough, and it could help someone else facing it!
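A slightly more defensive variant of the same idea is sketched below, for cases where image_id may arrive as a one-element tensor, a list, or a plain int. The helper name image_id_to_int is hypothetical, and the function relies only on duck typing, so it does not import torch itself.

```python
def image_id_to_int(image_id):
    """Return a plain int from a tensor-like, list/tuple, or int image id."""
    if hasattr(image_id, "item"):
        # For example a one-element torch tensor, which exposes .item().
        return int(image_id.item())
    if isinstance(image_id, (list, tuple)):
        return int(image_id[0])
    return int(image_id)


# Possible usage in engine.py (not executed here):
# res = {image_id_to_int(target["image_id"]): output
#        for target, output in zip(targets, outputs)}
```

The dictionary keys are then ordinary Python ints regardless of how the dataset class stores the ID.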
I'm trying to train the Keypoint R-CNN model on a dataset in which some images contain annotations and some do not. For an image without annotations, the annotation file looks like: {'bboxes':[], 'keypoints':[]}. This gives the error "too many indices for tensor of dimension 1" when calculating target["area"]. How can I make this code also work for images without annotations?
import torch
import torchvision
from torchvision.models.detection.anchor_utils import AnchorGenerator
images, targets = next(iterator)
images = list(image.to(device) for image in images)
with torch.no_grad():
    model.to(device)
    model.eval()
    output = model(images)
print("Predictions: \n", output)
traced_model = torch.jit.trace(model, output)
#traced_model = torch.jit.trace(model, example_input)
traced_model.save('traced_model.pt')
AttributeError: 'str' object has no attribute 'shape'