stevenyangyj / deep-head-pose-lite
A lite-version hopenet for head pose estimation with PyTorch
License: Apache License 2.0
Thank you for making your model and code public.
Did you test your model on AFLW2000? What were the results? I followed the test code in https://github.com/natanielruiz/deep-head-pose, and shuff_epoch_120.pkl gives poor results: Yaw: 19.8251, Pitch: 9.0400, Roll: 8.1506. I don't know why.
During testing, I replaced x = x.mean([2, 3]) with x = x.mean(3).mean(2) in stable_hopenetlite.py because my torch version is 0.4.1. I don't know whether that is the reason for the bad result.
I hope to see your test results.
Thank you very much!
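For what it's worth, the two reductions are mathematically equivalent, so that substitution alone should not change the results. A quick NumPy analogue of the check (illustrative only; the actual code operates on torch tensors):

```python
import numpy as np

# A dummy feature map shaped like a (batch, channels, H, W) activation
x = np.random.rand(2, 4, 7, 7)

# torch's x.mean([2, 3]) averages over both spatial axes at once ...
pooled_joint = x.mean(axis=(2, 3))
# ... while x.mean(3).mean(2) averages them one at a time
pooled_chained = x.mean(axis=3).mean(axis=2)

assert pooled_joint.shape == (2, 4)
assert np.allclose(pooled_joint, pooled_chained)
```

So the bad accuracy most likely comes from something else, e.g. preprocessing or weight loading.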
Hi, can you send the demo code for the yaw, pitch, and roll part? My email is [email protected]
Dear author, thanks for sharing such wonderful work. I want to do head pose estimation in the wild. Hopenet runs too slowly, so I found your work. I tested it and it runs fast, but the accuracy is poor, so I want to train my own model based on your work. I can't find training details on your GitHub page. Can you provide some help on training? How do I train the model? Thanks!
Hello. I am currently planning to use your model in my research project, and I have successfully run it. However, I am unsure how to process the output of the neural network. I referenced the deep-head-pose code to use your model, but the outputs differ significantly from the original deep-head-pose ones, which produce correct angle results. Of course, this could also be due to incorrect image pre-processing on my part.
I would greatly appreciate some guidance in these areas, and it would be even better if you could provide sample code for testing on a dataset or running a single image through this project's models. Additionally, my email address is [email protected] if you would like to reach out.
By the way, the following is the code I use for image pre-processing; could you please help me identify any errors?
import torch
from PIL import Image
from torchvision import transforms

def image_file(transform, path, image_mode='RGB'):
    img = Image.open(path)
    img = img.convert(image_mode)
    if transform is not None:
        img = transform(img)
    return img

def get_image(path):
    transformations = transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    img = image_file(transformations, path, image_mode='RGB')
    img = torch.unsqueeze(img, 0)
    return img

# img.shape = (1, 3, 224, 224)
img = get_image(path_to_image)
# some code ......
yaw, pitch, roll = model(img)
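On processing the raw outputs: in deep-head-pose each head emits 66 bin scores, which are converted to a continuous angle via a softmax expectation over 3-degree-wide bins spanning [-99, 99]. A NumPy sketch of that decoding for a single head (illustrative; the original operates on torch tensors):

```python
import numpy as np

def bins_to_degrees(logits, num_bins=66):
    """Convert one head's 66 bin scores to a continuous angle in degrees."""
    exp = np.exp(logits - logits.max())     # numerically stable softmax
    probs = exp / exp.sum()
    # Expected bin index, mapped to degrees: 3-degree bins starting at -99
    return float(np.sum(probs * np.arange(num_bins)) * 3 - 99)

# Uniform scores decode to the midpoint of the [-99, 99] range
print(bins_to_degrees(np.zeros(66)))  # -1.5
```

Applying this to each of the yaw, pitch, and roll vectors should reproduce the angle extraction in the original deep-head-pose test code.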
Hi @OverEuro, can you point out the learning rates for the different layers in training?
optimizer = torch.optim.Adam([
    {'params': get_ignored_params(model), 'lr': 0},
    {'params': get_non_ignored_params(model), 'lr': args.lr},
    {'params': get_fc_params(model), 'lr': args.lr * 5},
], lr=args.lr)
Could you share more training details: epochs, learning rate, alpha, and batch size (for ShuffleNet)?
Hello @OverEuro,
Thank you for this implementation. I have a few doubts that I wanted to clarify.
Does this model support head pose estimation under dim-lit conditions?
I also fail to understand the need to transform the input image to 224 x 224, as done in the test code:
transformations = transforms.Compose([
    transforms.Scale(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
Is this done because the model was trained on images whose dimensions were in that range or is there some other reason?
Also, are these transformations required when using shuff_epoch_120.pkl? If yes, are the normalization values (mean and standard deviation) the same for shuff_epoch_120.pkl?
Thank you.
Why is the regression loss coefficient 0.001? deep-head-pose sets alpha to 1 or 2.
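For context, the per-angle loss in hopenet combines cross-entropy on the bin classification with alpha times an MSE term on the decoded continuous angle. A NumPy sketch under that assumption, which shows why the choice of alpha matters:

```python
import numpy as np

def head_pose_loss(logits, bin_label, angle_label, alpha=0.001, num_bins=66):
    """Per-angle loss: cross-entropy on bins + alpha * MSE on the decoded angle."""
    exp = np.exp(logits - logits.max())     # numerically stable softmax
    probs = exp / exp.sum()
    cross_entropy = -np.log(probs[bin_label])
    # Decoded angle: softmax expectation over 3-degree bins starting at -99
    pred_angle = np.sum(probs * np.arange(num_bins)) * 3 - 99
    mse = (pred_angle - angle_label) ** 2
    return cross_entropy + alpha * mse
```

With alpha = 0.001, even a 30-degree angular error adds only 0.9 to the loss, so training is dominated by the classification term; with alpha = 1 or 2, as in the original paper, the regression term carries much more weight.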
Is there a working example? Can you please help me with the usage? Can we use the same commands as in regular hopenet, or are they different? Please help.
@OverEuro
Thank you for your great work.
I am a beginner, and when I use your latest hopenet-lite model with shuff_epoch_120.pkl, following the sample stable_hopenetlite.shufflenet_v2_x1_0(), I get the error below:
result = self.forward(*input, **kwargs)
File "/Users/liutianyuan/git/deep-head-pose-lite/stable_hopenetlite.py", line 131, in forward
x = x.mean([2, 3]) # globalpool
TypeError: mean() received an invalid combination of arguments - got (list), but expected one of:
Hoping for your help. Thank you!
I found that pre_yaw, pre_pitch, and pre_roll each have 66 numbers, and num_bins = 66. Why not three output values?
Hi,
I ran the original hopenet model and the hopenet-lite model one after another. Though the results were different (as expected), I couldn't find any significant speed-up in the lite model. Did you do any speed comparison on your end?
@OverEuro Thanks for your great work. When I run the code with your pretrained model shuff_epoch_120.pkl, it takes 616 ms on average per image, including dlib face detection. That is far slower than the speed you reported. Why?
I measured the dlib face detection time alone: about 480 ms per image. Since you must detect the face before estimating pose, does your 35 FPS figure exclude face detection?
Hi, thank you for your work. I have a question about the model input; can you give me a simple example?
When I get an input named img, its shape is (224, 224, 3) as a Tensor.
What pre-processing should I apply to the input?
Looking forward to your reply! Have a nice day.
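To answer the shape question in general terms: the model expects a (1, 3, 224, 224) batch normalized with ImageNet statistics, so an HWC image must be normalized, transposed, and given a batch axis. A NumPy sketch (assuming pixel values are already scaled to [0, 1]; the real pipeline would do this on a torch tensor):

```python
import numpy as np

# ImageNet normalization constants used by the repo's test transform
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def hwc_to_model_input(img):
    """(224, 224, 3) float array in [0, 1] -> (1, 3, 224, 224) model input."""
    img = (img - MEAN) / STD          # per-channel normalization
    img = img.transpose(2, 0, 1)      # HWC -> CHW
    return img[np.newaxis].astype(np.float32)  # add batch dimension
```

Equivalently, on a torch tensor of shape (224, 224, 3) one would normalize, then permute(2, 0, 1), then unsqueeze(0).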
Really a good job! I would like to know which trained model belongs to your project. And could you provide the model from the original paper via Baidu Yun instead of Google Drive? Thanks!
I randomly selected several pictures from the 300W_LP dataset for testing and found the angle calculations were all wrong; also, the CPU time for a single picture was 150 ms. Have you tested it yourself?