gengdavid / pytorch-cpn

A PyTorch re-implementation of CPN (Cascaded Pyramid Network for Multi-Person Pose Estimation)

License: GNU General Public License v3.0

Python 100.00%

pytorch deep-learning deep-neural-networks keypoint-estimation keypoint-localization pose-estimation computer-vision

pytorch-cpn's Issues

about the number of the output keypoints?

Hi, thanks for your PyTorch code!
I have read your code for testing a model, but I don't understand why the number of output keypoint values is always 51 (17 keypoints). This is how I print it:

if len(single_result) != 0:
    single_result_dict['image_id'] = int(ids)
    single_result_dict['category_id'] = 1
    single_result_dict['keypoints'] = single_result
    print(len(single_result_dict['keypoints']))
    single_result_dict['score'] = float(det_scores[b]) * v_score.mean()
    full_result.append(single_result_dict)

I also notice that the keypoint coordinates in the resulting file (result.json) never have zero values.
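
For context, the COCO keypoint result format stores each person as one flat list of (x, y, v) triplets, so 17 keypoints always produce 17 × 3 = 51 numbers, regardless of how many joints are actually visible. A minimal sketch of that layout follows; the helper names are illustrative and not part of this repository.

```python
NUM_COCO_KEYPOINTS = 17  # number of COCO person keypoints


def flatten_keypoints(points):
    """Flatten [(x, y, v), ...] into [x1, y1, v1, x2, y2, v2, ...]."""
    flat = []
    for x, y, v in points:
        flat.extend([x, y, v])
    return flat


def unflatten_keypoints(flat):
    """Group a flat COCO keypoint list back into (x, y, v) triplets."""
    if len(flat) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# Hypothetical predictions: every joint gets coordinates and a flag of 1.
points = [(10.0 + i, 20.0 + i, 1) for i in range(NUM_COCO_KEYPOINTS)]
flat = flatten_keypoints(points)
print(len(flat))                       # 51
print(len(unflatten_keypoints(flat)))  # 17
```

Because a top-down model predicts a location for every joint of every detected box, the list has no zero placeholders unless the code writes them explicitly.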

RuntimeError: The size of tensor a (512) must match the size of tensor b (256) at non-singleton dimension 1

While training I am getting this error

<ipython-input-67-b0b9c64f728a> in forward(self, x)
     94         print("")
     95 
---> 96         out += residual
     97 
     98         out = self.relu(out)

RuntimeError: The size of tensor a (512) must match the size of tensor b (256) at non-singleton dimension 1

The following code block is the origin of the error:

class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * 4)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample 
        self.stride = stride
 
    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        
        out = self.relu(out)

        return out

When I printed the sizes of out and residual, I got this:

torch.Size([12, 256, 96, 72])
torch.Size([12, 256, 96, 72])

torch.Size([12, 256, 96, 72])
torch.Size([12, 256, 96, 72])

torch.Size([12, 256, 96, 72])
torch.Size([12, 256, 96, 72])

torch.Size([12, 512, 48, 36])
torch.Size([12, 512, 48, 36])

torch.Size([12, 512, 48, 36])
torch.Size([12, 512, 48, 36])

torch.Size([12, 512, 48, 36])
torch.Size([12, 512, 48, 36])

torch.Size([12, 512, 48, 36])
torch.Size([12, 512, 48, 36])

torch.Size([12, 1024, 24, 18])
torch.Size([12, 1024, 24, 18])

torch.Size([12, 1024, 24, 18])
torch.Size([12, 1024, 24, 18])

torch.Size([12, 1024, 24, 18])
torch.Size([12, 1024, 24, 18])

torch.Size([12, 1024, 24, 18])
torch.Size([12, 1024, 24, 18])

torch.Size([12, 1024, 24, 18])
torch.Size([12, 1024, 24, 18])

torch.Size([12, 1024, 24, 18])
torch.Size([12, 1024, 24, 18])

torch.Size([12, 2048, 12, 9])
torch.Size([12, 2048, 12, 9])

torch.Size([12, 2048, 12, 9])
torch.Size([12, 2048, 12, 9])

torch.Size([12, 2048, 12, 9])
torch.Size([12, 2048, 12, 9])

torch.Size([12, 256, 12, 9])
torch.Size([12, 512, 12, 9])

How can I solve this issue?
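
The last pair of shapes above differs in channel count, so `out += residual` cannot broadcast. One possible cause, offered as a guess, is a Bottleneck whose shortcut branch does not produce `planes * expansion` channels. The sketch below shows the rule that torchvision-style ResNets use to decide when a projection shortcut is needed; `needs_downsample` and `check_residual` are hypothetical helpers, not functions from this repository.

```python
def bottleneck_out_channels(planes, expansion=4):
    """Channels produced by conv3 in a Bottleneck block."""
    return planes * expansion


def needs_downsample(inplanes, planes, stride=1, expansion=4):
    """True when the identity shortcut cannot be added to the block output."""
    return stride != 1 or inplanes != bottleneck_out_channels(planes, expansion)


def check_residual(out_shape, residual_shape):
    """Raise early, with a readable message, if `out += residual` would fail."""
    if tuple(out_shape) != tuple(residual_shape):
        raise ValueError(
            "shape mismatch: out %s vs residual %s"
            % (tuple(out_shape), tuple(residual_shape))
        )


print(needs_downsample(inplanes=256, planes=64))   # False: 64 * 4 == 256
print(needs_downsample(inplanes=256, planes=128))  # True: 128 * 4 == 512
```

If two different Bottleneck classes (for example one with `expansion = 4` and one with a smaller factor) were pasted into the same notebook under the same name, the later definition silently replaces the earlier one, which would produce this kind of mismatch.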

Config.py

Hello, I want to ask about the use of the parameter "symmetry" in the config file.

Other joints

Awesome repo!

I've trained using ground-truth boxes and it works okay, but the AP is lower.

Do you think the detection box should affect training, seeing as it is increased so significantly in the image preprocessing?

Do you think this network architecture needs to be improved if I'm using more joints than just the COCO ones?

code can only detect one person in one image?

When I print(inputs.size(0)) at line 83 of 'test.py', the output is 128. The test batch size is also 128, so does that mean each of those 128 images from val2017 contains only one person? I think it may be because the code detects only one person's keypoints per picture.
When I test my own picture, I also find that only one person is detected, even though the picture contains three people. (I don't have a gt_bbox for my own picture, so I had to reshape the picture myself to fit the code.)
So I want to know: can the code detect only one person's keypoints per picture, without support for multiple persons?

Train.py

I ran train.py without making any changes and got this error: RuntimeError: CUDA error: out of memory. Why does this happen, and how can I solve it?

gydx@gydx-HP-Z6-G4-Workstation:~/A-YFT/pytorch-cpn/256.192.model$ python3 train.py
Initialize with pre-trained ResNet
successfully load 318 keys
/home/gydx/.local/lib/python3.5/site-packages/torch/nn/functional.py:52: UserWarning: size_average and reduce args will be deprecated, please use reduction='none' instead.
warnings.warn(warning.format(ret))
Total params: 104.55MB

Epoch: 1 | LR: 0.00050000
/usr/local/lib/python3.5/dist-packages/skimage/transform/_warps.py:105: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.
warn("The default mode, 'constant', will be changed to 'reflect' in "
/usr/local/lib/python3.5/dist-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images.
warn("Anti-aliasing will be enabled by default in skimage 0.15 to "
/usr/local/lib/python3.5/dist-packages/skimage/transform/_warps.py:105: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.
warn("The default mode, 'constant', will be changed to 'reflect' in "
/usr/local/lib/python3.5/dist-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images.
.....................
File "/home/gydx/.local/lib/python3.5/site-packages/torch/nn/modules/upsampling.py", line 123, in forward
return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
File "/home/gydx/.local/lib/python3.5/site-packages/torch/nn/functional.py", line 1985, in interpolate
return torch._C._nn.upsample_bilinear2d(input, _output_size(2), align_corners)
RuntimeError: CUDA error: out of memory
gydx@gydx-HP-Z6-G4-Workstation:~/A-YFT/pytorch-cpn/256.192.model$

Different data augmentation with tf-cpn

Thanks for your code.
I find that you use a different data augmentation from tf-cpn.
Did you compare your data augmentation with tf-cpn's?
What was the result?
Thank you again.

How can I run the code and see the result on my own picture?

questions about test.py

Hi, thanks for your PyTorch code!
I have read your code for testing a model, but I don't understand why 2 is added to 4x and 4y (lines 115 and 116).
I have also seen that the TF version adds 2 as well.
Thanks!

pytorch-cpn/256.192.model/test.py

Lines 110 to 117 in 48696a9

    if ln > 1e-3:
        x += delta * px / ln
        y += delta * py / ln
    x = max(0, min(x, cfg.output_shape[1] - 1))
    y = max(0, min(y, cfg.output_shape[0] - 1))
    resy = float((4 * y + 2) / cfg.data_shape[0] * (details[b][3] - details[b][1]) + details[b][1])
    resx = float((4 * x + 2) / cfg.data_shape[1] * (details[b][2] - details[b][0]) + details[b][0])
    v_score[p] = float(r0[p, int(round(y)+1e-10), int(round(x)+1e-10)])
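
One common reading of the `+ 2`, offered here as an interpretation rather than the author's confirmed intent: the heatmap is 4 times smaller than the network input, so heatmap cell `x` covers input pixels `4x` to `4x + 3`, and adding 2 moves the estimate from the cell's left edge toward its center. The sketch below reproduces the mapping in the snippet above; the function name `heatmap_to_image` and the default `data_shape` of (256, 192) are assumptions for illustration.

```python
def heatmap_to_image(x, y, bbox, data_shape=(256, 192), stride=4):
    """Map heatmap coordinates (x, y) to original-image coordinates.

    bbox is (x1, y1, x2, y2) of the person crop in the original image;
    data_shape is (height, width) of the network input.
    """
    x1, y1, x2, y2 = bbox
    offset = stride / 2.0  # 2 for stride 4: shift from cell edge toward cell center
    img_x = (stride * x + offset) / data_shape[1] * (x2 - x1) + x1
    img_y = (stride * y + offset) / data_shape[0] * (y2 - y1) + y1
    return img_x, img_y


# With a crop identical to the network input, the first and last heatmap
# cells land 2 pixels inside the crop instead of on its border.
print(heatmap_to_image(0, 0, bbox=(0, 0, 192, 256)))
print(heatmap_to_image(47, 63, bbox=(0, 0, 192, 256)))
```

Without the offset, every prediction would be biased toward the top-left corner of its cell by up to a few input pixels.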

about output result

Hi David, sorry to bother you.
I am a little confused about the keypoint output.
In pytorch-cpn/256.192.model/test.py, line 117, you write the output as follows:
v_score[p] = float(r0[p, int(round(y)+1e-10), int(round(x)+1e-10)])
single_result.append(resx)
single_result.append(resy)
single_result.append(1)
I guess the last 1 stands for the confidence or probability, and it does not seem reasonable for the probability to always be 1.
I guess v_score[p] has a similar meaning, so why not use v_score[p] instead?

About the structure of refineNet

pytorch-cpn/networks/refineNet.py

Line 65 in 48696a9

def _predict(self, input_channel, num_class):

Why does the final layer append a BN layer? Why is the output normalized?
Could you give me a hint? Thank you!

Training with other configurations.

Hi @GengDavid,

Thanks for the great implementation. I'm eager to collaborate with you to test other configurations. I have 2 x 1080 and 2 x 1080ti. I can borrow more if needed. Looking forward to your response!

Refine Net

I suspect there may be a problem in the implementation of RefineNet.
In your refineNet.py, you define the forward pass as follows:
def forward(self, x):
    refine_fms = []
    for i in range(4):
        refine_fms.append(self.cascade[i](x[i]))
    out = torch.cat(refine_fms, dim=1)
    out = self.final_predict(out)
    return out
I think you should reverse x, e.g. x = x[::-1], because x[0] is the smallest feature map and x[3] is the biggest. According to the paper, there are 3 bottlenecks after the smallest feature map and 0 bottlenecks after the biggest.

Has anyone trained on the MPII dataset?

Unable to extract pretrained model archive

Hi,

I'm trying to use the pretrained model but none of the tar files seem to be in working order. I get the following error:
An error occurred while extracting files.

I'm using Ubuntu 16.04.

mobilenet is not fast

Thanks for your code.
When I replace ResNet with MobileNet, I find that the model is slower.
I'm confused. I run the model on a GPU (Titan X).
Do you know the reason?
The following is my code:

def conv_dw(in_channels, out_channels, kernel_size=3, padding=1, stride=1, dilation=1):
    return nn.Sequential(
        # depthwise convolution: one filter per input channel
        nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding, dilation=dilation, groups=in_channels, bias=False),
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),

        # pointwise 1x1 convolution: mixes channels
        nn.Conv2d(in_channels, out_channels, 1, 1, 0, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
    )

Why is the bias of the FPN upsample conv 'True'?

globalNet.py

    def _upsample(self):
        layers = []
        layers.append(torch.nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True))
        layers.append(torch.nn.Conv2d(256, 256,
            kernel_size=1, stride=1, bias=True))
        layers.append(nn.BatchNorm2d(256))

        return nn.Sequential(*layers)

About using my own val_det

If I want to use my own detection results, should I transform the annotations into some particular format? I have read your code, and it does not seem to be mentioned.

pre-trained model

Can a pre-trained model be loaded during training? I don't see the pre-trained model loading code in train.py.

I think this code has an error in the Makefile

When I run the make command, the process reports an error:
gcc: error: pycocotools/_mask.c: No such file or directory
How can I solve this problem?

nn.Upsampling( ) and pytorch version

I really appreciate your excellent work. I get a warning during training
(my PyTorch version is 0.4.1 in this experiment):

/root/anaconda3/envs/CPNs/lib/python3.5/site-packages/torch/nn/modules/upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead. warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")

I did some research and found that nn.Upsampling() is no longer used in PyTorch >= 0.4.1.
This contradicts the install requirements.
Also, it seems the model performance will be reduced if I ignore this warning.

How can I deal with it?

Looking forward to your reply!

About the Evaluation results on COCO minival dataset

May I ask what the difference is between Detection Result and Detection Result?

Where are the files "COCO_2017_train.json", "COCO_2017_val.json" and "val_dets.json"? I can't find them in the COCO dataset

FileNotFoundError: [Errno 2] No such file or directory: '/media/agent/data/xcj/pytorch-cpn/256.192.model/../data/COCO2017/annotations/COCO_2017_train.json'

@YoungZiyu

@YoungZiyu
Sure, you can find training log here

Originally posted by @GengDavid in #3 (comment)

a question about test.py

Thanks for your work. I have a question about line 115 in test.py.
Why do you use 4y+2 rather than 4y? Can you explain the meaning of the +2? Thanks.

about the cpu utilized percent

When I run train.py, the CPU utilization is more than 300%. How can I solve this problem?

About the utils/imutils.py line:41

I think the following code is confusing or incorrect:
heatmap /= am / 255
Because batchnorm is the last layer of the prediction net, each element of the heatmap should be within the range 0-1. I think the code should be corrected as follows:
heatmap /= am
But I'm not totally sure I am right. Can you explain it?

The name of pretrained model is misspelled

Hi.

I found that one of your pretrained models has the wrong name.
The parameter file of COCO.res101.384x288.CPN on Google Drive is CPN101_385x288.pth.tar.

Thanks for your nice work.

How can get the high score?

I finished my CPN network like this, but it only reaches 0.553 mAP. Could someone give me advice about it?

an unexpected keyword argument

I ran train.py but got an error: 'TypeError: __init__() got an unexpected keyword argument 'align_corners''. It occurred at globalNet.py, line 56.
I don't know how to fix it. Can you help me? Thank you.

About mscocoMulti.py

There are 4 variables: target15, target11, target9 and target7.
What do they mean?

about the train.py file line:119

for global_output, label in zip(global_outputs, targets):
    num_points = global_output.size()[1]
    global_label = label * (valid > 1.1).type(torch.FloatTensor).view(-1, num_points, 1, 1)
    global_loss = criterion1(global_output,
                             torch.autograd.Variable(global_label.cuda(async=True))) / 2.0
    loss += global_loss
    global_loss_record += global_loss.data.item()
Is there a mistake in the code above?
Should global_outputs be reversed?

where is mptest.py in “python3 mptest.py -t 'CPN256x192'”

Yeet

https://github.com/philips77/antidote/tree/master

Image test demo

How can I use the pre-trained network model to visualize test results on an image? Is there a demo?

half of the output predictions are wrong

I am running custom images through this model. All the images have been pre-processed and cropped to include just the human, and I've removed all annotation files and hardcoded the information that the annotation files used to provide. Also, I'm only processing images where there is a single person in frame.

For some reason, half of the output predictions are just wrong; they are a mess. The other half look perfect. The wrong outputs are almost all identical, with only slight, barely noticeable differences in joint positions. Also, if I feed in, say, a folder of 1000 images, the predictions on images 1-64 will be perfect, 65-128 will be wrong, 129-192 will be perfect, and 193-256 will be wrong, and this pattern continues. The pattern remains the same regardless of the input data.

Any idea why this is happening? I'm happy to provide more info about the issue. Thanks.

Ye

https://github.com/elixir-lsp/vscode-elixir-ls/blob/b645862891c3d8c92b0a286848be8a999f29072b/src/test/suite/index.ts

some question about the human detector.

In the paper, I see that CPN takes as input the image cropped from the raw image with the bounding box from FPN. Where is the FPN in your reimplementation?

Using just GlobalNet

Hello! I want to speed up the testing process, so I'm thinking of using just GlobalNet, without RefineNet. Do you think this could work without losing too much AP? Also, how should I do it? Thank you very much!

target.cuda(async=True): SyntaxError: invalid syntax at async=True

The related material is not very clear, so I'd like to ask: can I simply change

        refine_target_var = torch.autograd.Variable(target7.cuda(async=True))
        valid_var = torch.autograd.Variable(valid.cuda(async=True))

to the following?

        refine_target_var = torch.autograd.Variable(target7.cuda())
        valid_var = torch.autograd.Variable(valid.cuda())
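
The `SyntaxError` arises because `async` became a reserved keyword in Python 3.7, so it can no longer be used as an argument name. PyTorch 0.4 and later accept `non_blocking=True` in its place; dropping the argument, as proposed above, also works but gives up the asynchronous copy. The check below uses only the standard library to show which spelling still parses (the helper `parses` is illustrative).

```python
import ast


def parses(source):
    """Return True if `source` is syntactically valid for the running Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


old_call = "valid_var = valid.cuda(async=True)"
new_call = "valid_var = valid.cuda(non_blocking=True)"

print(parses(old_call))  # False on Python 3.7 and later
print(parses(new_call))  # True
```

Only the keyword name changes; the tensor and the rest of the training loop can stay as they are.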

test.py gets stuck when computing output

Hi,

I've followed the instructions in the README thoroughly and have double checked all the steps. However, when running test.py, this line of code:

global_outputs, refine_output = model(input_var)

seems to never finish. In case it simply takes a very long time to run, I've also made a small testing subset of the val2017 folder and the annotations file with just 5 images, yet this line of code still seems to run forever. Upon a keyboard interrupt, the error message seems to have to do with threads acquiring lock (not sure if this is useful to know). Any idea why? Thanks.

Color Normalization Issue

Hi @GengDavid,

I've found that self.pixel_means is changed on every call to __getitem__, because the color_normalize function modifies its mean argument in place. As a result, the expected color normalization stops taking effect after a few samples, because the mean value decays toward [0, 0, 0].

This issue affects both the training and testing phases.

See detailed printed log below:

checking pixel means init: [122.7717 115.9465 102.9801]
=> loaded checkpoint 'checkpoint/epoch32checkpoint.pth.tar' (epoch 32)
testing...
checking pixel means getitem: [122.7717 115.9465 102.9801]
checking pixel means getitem: [0.48145765 0.45469216 0.40384353]
checking pixel means getitem: [0.00188807 0.00178311 0.0015837 ]
checking pixel means getitem: [7.40419296e-06 6.99257450e-06 6.21058869e-06]
checking pixel means getitem: [2.90360508e-08 2.74218608e-08 2.43552498e-08]

Please help to double check it.
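
The log is consistent with the mean being divided by 255 in place on every call: 122.7717 / 255 ≈ 0.4815, and 0.4815 / 255 ≈ 0.00189. A minimal sketch of that failure mode and one way to avoid it is shown below, using plain lists instead of the NumPy arrays in the repository. Both function bodies are assumptions written for illustration, not the actual `color_normalize`.

```python
def color_normalize_buggy(pixel, mean):
    """Scales `mean` in place, so the caller's list shrinks on every call."""
    for i in range(len(mean)):
        mean[i] /= 255.0
    return [p - m for p, m in zip(pixel, mean)]


def color_normalize_fixed(pixel, mean):
    """Works on a scaled copy and leaves the caller's `mean` untouched."""
    scaled = [m / 255.0 for m in mean]
    return [p - m for p, m in zip(pixel, scaled)]


pixel = [0.5, 0.5, 0.5]  # hypothetical RGB value already scaled to 0-1

buggy_means = [122.7717, 115.9465, 102.9801]
for _ in range(2):
    color_normalize_buggy(pixel, buggy_means)
print(buggy_means)  # shrinks by a factor of 255 per call

fixed_means = [122.7717, 115.9465, 102.9801]
for _ in range(2):
    color_normalize_fixed(pixel, fixed_means)
print(fixed_means)  # unchanged
```

With NumPy arrays the same effect comes from augmented assignment such as `mean /= 255`, which writes into the shared array; computing `mean / 255` into a new variable avoids it.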

about the test data

Hello, I used your method to train and then ran test.py, but the test accuracy is very low. I also downloaded the pre-trained model and tested it, and the accuracy is still very low:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.188.
I suspect that my test data is wrong. I am using COCO2017/val2017, COCO_2017_val.json, and person_keypoints_val2017.json.
The data is stored according to data/README.md. Could you tell me what test data you used to get such good results? Thank you.

Achieve the accuracy of the paper

Good job.
Did retraining achieve the accuracy of the paper?

Results is

I used the weight file of 10 epochs of training to test the image. Why is the accuracy so low? Did I train too little?

    if ln > 1e-3:
        x += delta * px / ln
        y += delta * py / ln
    x = max(0, min(x, cfg.output_shape[1] - 1))
    y = max(0, min(y, cfg.output_shape[0] - 1))
    resy = float((4 * y + 2) / cfg.data_shape[0] * (details[b][3] - details[b][1]) + details[b][1])
    resx = float((4 * x + 2) / cfg.data_shape[1] * (details[b][2] - details[b][0]) + details[b][0])
    v_score[p] = float(r0[p, int(round(y)+1e-10), int(round(x)+1e-10)])
