stevenyangyj / deep-head-pose-lite
A lite-version hopenet for head pose estimation with PyTorch
License: Apache License 2.0
Thank you for making your model and code public.
Did you test your model on AFLW2000? What were the results? I followed the test code in https://github.com/natanielruiz/deep-head-pose, and shuff_epoch_120.pkl gives poor results: Yaw: 19.8251, Pitch: 9.0400, Roll: 8.1506. I don't know why.
During testing, I replaced x = x.mean([2, 3]) with x = x.mean(3).mean(2) in stable_hopenetlite.py because my torch version is 0.4.1. I don't know whether that is the reason for the bad result.
I hope to see your test results.
Thank you very much!
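For what it's worth, the two reductions are mathematically equivalent, so that substitution alone should not change the results. A quick NumPy analogue of the check (illustrative only; the actual code operates on torch tensors):

```python
import numpy as np

# A dummy feature map shaped like a (batch, channels, H, W) activation
x = np.random.rand(2, 4, 7, 7)

# torch's x.mean([2, 3]) averages over both spatial axes at once ...
pooled_joint = x.mean(axis=(2, 3))
# ... while x.mean(3).mean(2) averages them one at a time
pooled_chained = x.mean(axis=3).mean(axis=2)

assert pooled_joint.shape == (2, 4)
assert np.allclose(pooled_joint, pooled_chained)
```

So the bad accuracy most likely comes from something else, e.g. preprocessing or weight loading.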
Hi, can you send the demo code for the yaw, pitch, and roll part? My email is [email protected]
Dear author, thanks for sharing such wonderful work. I want to do head pose estimation in the wild. Hopenet runs too slowly, so I found your work. I tested it and it runs fast, but the accuracy is poor, so I want to train my own model based on your work. I can't find training details on your GitHub page. Can you provide some help on training? How do I train the model? Thanks!
Hello. I am currently planning to use your model in my research project, and I have successfully run it. However, I am unsure how to process the output of the neural network. I referenced the deep-head-pose code to use your model, but the outputs differ significantly from the original deep-head-pose ones, which produce correct angle results. Of course, this could also be due to incorrect image pre-processing on my part.
I would greatly appreciate some guidance in these areas, and it would be even better if you could provide sample code for testing on a dataset or running a single image through this project's models. Additionally, my email address is [email protected] if you would like to reach out.
By the way, the following is the code I use for image pre-processing; could you please help me identify any errors?
import torch
from PIL import Image
from torchvision import transforms

def image_file(transform, path, image_mode='RGB'):
    img = Image.open(path)
    img = img.convert(image_mode)
    if transform is not None:
        img = transform(img)
    return img

def get_image(path):
    transformations = transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    img = image_file(transformations, path, image_mode='RGB')
    img = torch.unsqueeze(img, 0)
    return img

# img.shape = (1, 3, 224, 224)
img = get_image(path_to_image)
# some code ......
yaw, pitch, roll = model(img)
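On processing the raw outputs: in deep-head-pose each head emits 66 bin scores, which are converted to a continuous angle via a softmax expectation over 3-degree-wide bins spanning [-99, 99]. A NumPy sketch of that decoding for a single head (illustrative; the original operates on torch tensors):

```python
import numpy as np

def bins_to_degrees(logits, num_bins=66):
    """Convert one head's 66 bin scores to a continuous angle in degrees."""
    exp = np.exp(logits - logits.max())     # numerically stable softmax
    probs = exp / exp.sum()
    # Expected bin index, mapped to degrees: 3-degree bins starting at -99
    return float(np.sum(probs * np.arange(num_bins)) * 3 - 99)

# Uniform scores decode to the midpoint of the [-99, 99] range
print(bins_to_degrees(np.zeros(66)))  # -1.5
```

Applying this to each of the yaw, pitch, and roll vectors should reproduce the angle extraction in the original deep-head-pose test code.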
Hi @OverEuro, can you point out the learning rates for the different layers in training?
optimizer = torch.optim.Adam([
    {'params': get_ignored_params(model), 'lr': 0},
    {'params': get_non_ignored_params(model), 'lr': args.lr},
    {'params': get_fc_params(model), 'lr': args.lr * 5},
], lr=args.lr)
Could you share more training details: epochs, learning rate, alpha, and batch size (for ShuffleNet)?
Hello @OverEuro,
Thank you for this implementation. I have a few doubts that I wanted to clarify.
Does this model support head pose estimation under dim-lit conditions?
I also fail to understand the need to transform the input image to 224 x 224, as done in the test code:
transformations = transforms.Compose([
    transforms.Scale(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
Is this done because the model was trained on images whose dimensions were in that range or is there some other reason?
Also, are these transformations required when using shuff_epoch_120.pkl? If yes, are the normalization values (mean and standard deviation) the same for shuff_epoch_120.pkl?
Thank you.
Why is the regression loss coefficient 0.001? deep-head-pose sets alpha to 1 or 2.
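For context, the per-angle loss in hopenet combines cross-entropy on the bin classification with alpha times an MSE term on the decoded continuous angle. A NumPy sketch under that assumption, which shows why the choice of alpha matters:

```python
import numpy as np

def head_pose_loss(logits, bin_label, angle_label, alpha=0.001, num_bins=66):
    """Per-angle loss: cross-entropy on bins + alpha * MSE on the decoded angle."""
    exp = np.exp(logits - logits.max())     # numerically stable softmax
    probs = exp / exp.sum()
    cross_entropy = -np.log(probs[bin_label])
    # Decoded angle: softmax expectation over 3-degree bins starting at -99
    pred_angle = np.sum(probs * np.arange(num_bins)) * 3 - 99
    mse = (pred_angle - angle_label) ** 2
    return cross_entropy + alpha * mse
```

With alpha = 0.001, even a 30-degree angular error adds only 0.9 to the loss, so training is dominated by the classification term; with alpha = 1 or 2, as in the original paper, the regression term carries much more weight.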
Is there a working example? Can you please help me with the usage? Can we use the same commands as in regular hopenet, or are they different? Please help.
@OverEuro
Thank you for your great work.
I am a beginner, and when I use your latest hopenet-lite model with shuff_epoch_120.pkl, following the sample stable_hopenetlite.shufflenet_v2_x1_0(), I get the error below:
result = self.forward(*input, **kwargs)
File "/Users/liutianyuan/git/deep-head-pose-lite/stable_hopenetlite.py", line 131, in forward
x = x.mean([2, 3]) # globalpool
TypeError: mean() received an invalid combination of arguments - got (list), but expected one of:
Hoping for your help. Thank you!
I found that pre_yaw, pre_pitch, and pre_roll each have 66 numbers, and num_bins = 66. Why not three output values?
Hi,
I ran the original hopenet model and the hopenet-lite model one after another. Though the results were different (as expected), I couldn't find any significant speed-up in the lite model. Did you do any speed comparison on your end?
@OverEuro Thanks for your great work. When I run the code with your pretrained model shuff_epoch_120.pkl, it takes 616 ms on average per image, including dlib face detection. That is far slower than the speed you reported. Why?
I measured the dlib face detection time alone: about 480 ms per image. Since you must detect the face before estimating pose, does your 35 FPS figure exclude face detection?
Hi, thank you for your work. I have a question about the model input; can you give me a simple example?
When I get an input named img, its shape is (224, 224, 3) as a Tensor.
What pre-processing should I apply to the input?
Looking forward to your reply! Have a nice day.
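To answer the shape question in general terms: the model expects a (1, 3, 224, 224) batch normalized with ImageNet statistics, so an HWC image must be normalized, transposed, and given a batch axis. A NumPy sketch (assuming pixel values are already scaled to [0, 1]; the real pipeline would do this on a torch tensor):

```python
import numpy as np

# ImageNet normalization constants used by the repo's test transform
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def hwc_to_model_input(img):
    """(224, 224, 3) float array in [0, 1] -> (1, 3, 224, 224) model input."""
    img = (img - MEAN) / STD          # per-channel normalization
    img = img.transpose(2, 0, 1)      # HWC -> CHW
    return img[np.newaxis].astype(np.float32)  # add batch dimension
```

Equivalently, on a torch tensor of shape (224, 224, 3) one would normalize, then permute(2, 0, 1), then unsqueeze(0).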
Really a good job! I would like to know which trained model belongs to your project. And could you provide the model from the original paper via Baidu Yun instead of Google Drive? Thanks!
I randomly selected several pictures from the 300W_LP dataset for testing and found the angle calculations were all wrong; also, the CPU time for a single picture was 150 ms. Have you tested it yourself?