
Code for ACCV2018 paper 'Believe It or Not, We Know What You Are Looking at!'

License: MIT License

Language: Python 100.00%
Topics: pytorch, accv2018, gaze-follow

gazefollowing's Introduction

Gaze following

PyTorch implementation of our ACCV2018 paper:

'Believe It or Not, We Know What You Are Looking at!' [paper] [poster]

Dongze Lian*, Zehao Yu*, Shenghua Gao

(* Equal Contribution)

Prepare training data

The GazeFollow dataset was proposed in [1]; please download it from http://gazefollow.csail.mit.edu/download.html. Note that the downloaded test data may have wrong labels, so we requested the corrected test2 split from the authors. We do not know whether the authors have since updated their test set; if not, it is better to e-mail the authors of [1]. For your convenience, we also paste the test-set link here that the authors provided when we asked. (Note that the license is in [1].)

Download our dataset

OurData is hosted on OneDrive. Please download and unzip it.

OurData contains the data described in our paper.

OurData/tools/extract_frame.py

extracts frames from the clip videos at 2 fps. Different versions of ffmpeg may produce different results, so we also provide our extracted images.
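For reference, a minimal sketch of 2 fps extraction with ffmpeg; the paths and naming pattern are placeholders, and the shipped script may differ:

# Hypothetical sketch of 2 fps frame extraction; OurData/tools/extract_frame.py may differ.
import os
import subprocess

def extract_frames(video_path, out_dir, fps=2):
    os.makedirs(out_dir, exist_ok=True)
    # -vf fps=2 samples the video at 2 frames per second
    subprocess.run(
        ["ffmpeg", "-i", video_path, "-vf", "fps=%d" % fps,
         os.path.join(out_dir, "%06d.jpg")],
        check=True)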

OurData/tools/create_video_image_list.py

extracts the annotations to JSON.

Testing on GazeFollow data

Please download the pretrained model manually and save it to model/.

cd code
python test_gazefollow.py

Evaluation metrics

cd code
python cal_min_dis.py
python cal_auc.py

Test on our data

cd code
python test_ourdata.py

Training from scratch

cd code
python train.py

Inference

Simply run python inference.py image_path eye_x eye_y to infer the gaze. Note that eye_x and eye_y are the normalized coordinates (from 0 to 1) of the eye position. The script saves the inference result to tmp.png.

cd code
python inference.py ../images/00000003.jpg 0.52 0.14

Reference:

[1] Recasens*, A., Khosla*, A., Vondrick, C., Torralba, A.: Where are they looking? In: Advances in Neural Information Processing Systems (NIPS) (2015).

Citation

If this project is helpful for you, you can cite our paper:

@InProceedings{Lian_2018_ACCV,
author = {Lian, Dongze and Yu, Zehao and Gao, Shenghua},
title = {Believe It or Not, We Know What You Are Looking at!},
booktitle = {ACCV},
year = {2018}
}

gazefollowing's People

Contributors

dongzelian, niujinshuchong


gazefollowing's Issues

How can I detect the gaze of multiple people?

Hello, looking at your inference code, each detection requires the eye position of one person as input. How should I handle detecting several people in the same image?
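One hedged workaround, assuming the released inference.py handles a single person per call: detect each person's eyes with any face detector, then loop over the eye positions. The positions below are made-up examples.

# Hypothetical sketch: run inference.py once per person and keep each result.
import shutil
import subprocess

# normalized (eye_x, eye_y) per person, e.g. from a face/landmark detector
eye_positions = [(0.52, 0.14), (0.30, 0.20)]

for i, (ex, ey) in enumerate(eye_positions):
    subprocess.run(
        ["python", "inference.py", "../images/00000003.jpg", str(ex), str(ey)],
        check=True)
    # inference.py writes its visualization to tmp.png; rename it per person
    shutil.move("tmp.png", "result_%d.png" % i)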

Code visualization problem

Hello, how can I load an example after training is completed and detect its gaze? I tried all of the .py files and could not achieve this.

Face position question

Hello, another question. In the data-processing code, I see that your strategy is to crop the face out from the eye position. The dataset contains face bounding-box data; why don't you use it? Is there another consideration? Since the face scale differs across images, a fixed crop may cut off part of the face or include too much. Thanks!

# crop face: a fixed 0.15 margin around the eye center, clamped to [0, 1]
x_c, y_c = eye
x_0 = max(x_c - 0.15, 0)
y_0 = max(y_c - 0.15, 0)
x_1 = min(x_c + 0.15, 1)
y_1 = min(y_c + 0.15, 1)

What is the distance measurement unit?

L2 distance (Dist): the Euclidean distance between the predicted gaze point and the average of the ground-truth annotations. The original image size is normalized to 1 × 1.

So what unit is this distance measured in?
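For what it's worth, Dist is unitless: both points live in the unit square after dividing pixel coordinates by the image width and height. A minimal worked example with made-up points:

# Dist in a 1 x 1 normalized image: plain Euclidean distance, no pixels involved
import math

pred = (0.60, 0.40)      # predicted gaze point, normalized
gt_mean = (0.52, 0.30)   # mean of the ground-truth annotations, normalized
dist = math.hypot(pred[0] - gt_mean[0], pred[1] - gt_mean[1])
print(round(dist, 4))    # 0.1281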

../GazeFollowData/test2_annotations.mat

Hello, you set mat_file='../GazeFollowData/test2_annotations.mat' in train.py, but the GazeFollow dataset I downloaded does not contain that file; it only has test_annotations_release.txt. I replaced the .mat path with the .txt file and it didn't work. How can I solve this?

Determining the head position and eye position

Hello, your paper says that the input to the first stage is the face image and the head position, and the inputs in inference.py are indeed a face image plus an (x, y) coordinate. Does this coordinate refer to the head position? How do you represent the head position; do you use the midpoint of the two eyes? Since you built your own dataset, you must know how the head-position data was obtained. I want to try running your model on my own dataset, from which I have extracted face bounding boxes and the coordinates of both eyes, but I don't know how your head position is determined. I tried using the midpoint of the two eyes as the head position, but the predicted gaze deviates a lot. My result looks like the attached image; most of the images in my dataset have the face fairly close to the camera, and I don't know whether that has an effect. I would be very grateful for a quick reply, thank you!

[result image attached]
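For anyone in the same situation, here is a hedged sketch of the midpoint approach this question describes: converting pixel-space eye centers into the normalized (eye_x, eye_y) that inference.py expects. Whether this matches the authors' head-position definition is exactly what is being asked.

# Hypothetical helper: normalized eye position from two pixel-space eye centers.
def normalized_eye_position(left_eye, right_eye, img_w, img_h):
    # left_eye / right_eye are (x, y) in pixels; result is in [0, 1]
    x = (left_eye[0] + right_eye[0]) / 2.0 / img_w
    y = (left_eye[1] + right_eye[1]) / 2.0 / img_h
    return x, y

# e.g. a 640x480 frame with eye centers at (300, 180) and (340, 182)
print(normalized_eye_position((300, 180), (340, 182), 640, 480))  # (0.5, 0.377...)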

Dataset download

Hello, is there another way to download the dataset you made? The OneDrive download never succeeds for me; it keeps pausing automatically. Is there a Baidu Netdisk link? Thanks!

model accuracy

Hi, I want to know how this model performs on the GazeFollow dataset. The best accuracy I can get when training is 0.24 L2 distance and 34 degrees angular error, but these two metrics are much worse than those reported in the paper "Where are they looking?", which gives 0.19 L2 distance and 24 degrees angular error. What accuracy did you get while training and testing? I would be very grateful if you could share the training parameters of your best model.

Eyes coordinates and strange results

Hello and thanks for your code!

I've tried running your code on webcam video and received some strange results. After that, I downloaded several pictures, ran inference on them, and also received strange results on a couple of them. Here they are. Two of the images are from your repo (the guy staring at the phone is from the original model's folder, and the guy on the skateboard is from your repo).

[12 result images attached]

On some images the gaze was detected well, but on most of them the result is quite bad. Could eye_x and eye_y be the problem? As I understand it, they should be different for each image; if so, how do I obtain them?
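A hedged sketch of one way to obtain eye_x and eye_y automatically, using OpenCV's bundled Haar cascades. This is a stand-in choice, not something the repo ships; any face/landmark detector works the same way, and cascade quality varies.

# Hypothetical sketch: estimate normalized eye coordinates for inference.py.
import cv2

img = cv2.imread("photo.jpg")
h, w = img.shape[:2]
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")
eyes = eye_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
if len(eyes) > 0:
    # centroid of the detected eye boxes, normalized to [0, 1]
    eye_x = sum(x + ew / 2.0 for x, y, ew, eh in eyes) / len(eyes) / w
    eye_y = sum(y + eh / 2.0 for x, y, ew, eh in eyes) / len(eyes) / h
    print("python inference.py photo.jpg %.2f %.2f" % (eye_x, eye_y))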

What is the value of γ?

Hi, I'm reimplementing your ablation study.
There are three gaze-mask fields in your code (gazenet.py):
γ=1, γ=2, γ=5
When you train the "one-scale" method, which value of γ do you use for the gaze field? Since the bigger one is the default setting, I think γ is 5, but I'm not sure.

Also, could I know at around which epoch training converges?
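For context, a hedged sketch of a cosine-power gaze field. The max(cos θ, 0)**γ form is my assumption about what gazenet.py's three fields compute, with θ the angle between the gaze direction and the eye-to-pixel direction; a larger γ concentrates the field around the gaze direction, which would explain the γ = 1, 2, 5 multi-scale setting.

# Hypothetical cosine-power gaze field; the actual gazenet.py code may differ.
import numpy as np

def gaze_field(eye, direction, size=64, gamma=5):
    # eye and direction are in normalized image coordinates; direction is unit-length
    ys, xs = np.mgrid[0:size, 0:size] / float(size)
    dx, dy = xs - eye[0], ys - eye[1]
    norm = np.sqrt(dx ** 2 + dy ** 2) + 1e-8
    cos = (dx * direction[0] + dy * direction[1]) / norm
    return np.clip(cos, 0, None) ** gamma

field = gaze_field(eye=(0.52, 0.14), direction=(0.6, 0.8), gamma=5)
print(field.shape)  # (64, 64)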

What is the Ang score?

Thank you for your work!

I wonder which of these is the "angular error":
average loss : [0.05870105 0.1139822 0.17268325]
average error [mean dist, angle, mean angle]: [ 0.14515077 19.66974045 18.10261505]

The Ang in your paper is 17.6, but I got 18.10; am I misunderstanding something?
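For reference, a hedged sketch of how angular error is commonly computed in gaze following (not verified against this repo's evaluation code): the angle between the eye-to-prediction vector and the eye-to-ground-truth vector.

# Hypothetical angular-error computation; the repo's own metric may differ.
import numpy as np

def angular_error_deg(eye, pred, gt):
    v1 = np.asarray(pred, dtype=float) - np.asarray(eye, dtype=float)
    v2 = np.asarray(gt, dtype=float) - np.asarray(eye, dtype=float)
    cos = v1.dot(v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# made-up eye position, prediction, and ground truth (all normalized)
print(angular_error_deg((0.52, 0.14), (0.60, 0.40), (0.58, 0.45)))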

Model evaluation and metrics code

I would like to thank and congratulate you on the research work and the advances achieved.
I have run the code with my own changes, and now I would like to obtain the same measures presented in the paper: AUC, Dist, MDist, Ang, MAng. In the original code folder there is a test_gazefollow.py file; is that the one to use for evaluation? After processing the test dataset (GazeFollow, Recasens 2015), the result was:

average loss : [tensor(0.0587, device='cuda:0') tensor(0.1137, device='cuda:0') tensor(0.1724, device='cuda:0')]
average error: [ 0.14532723 19.64980047 18.1135517 ]

How can I extract the complete metrics?

Thank you

RuntimeError: Cannot initialize CUDA without ATen_cuda library.

(pytorch) D:\Projects\Pycharm2019Projects\GazeFollow\code>python inference.py ../images/00000003.jpg 0.52 0.14
Traceback (most recent call last):
  File "inference.py", line 152, in <module>
    main()
  File "inference.py", line 134, in main
    net.cuda()
  File "D:\IDE\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "D:\IDE\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 185, in _apply
    module._apply(fn)
  File "D:\IDE\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 185, in _apply
    module._apply(fn)
  File "D:\IDE\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 185, in _apply
    module._apply(fn)
  File "D:\IDE\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 191, in _apply
    param.data = fn(param.data)
  File "D:\IDE\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: Cannot initialize CUDA without ATen_cuda library. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason. The CUDA library MUST be loaded, EVEN IF you don't directly use any symbols from the CUDA library! One common culprit is a lack of -Wl,--no-as-needed in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don't depend on any of their symbols. You can check if this has occurred by using ldd on your binary to see if there is a dependency on *_cuda.so library.
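This error usually means a CPU-only PyTorch build. Besides installing a CUDA-enabled build, a hedged workaround is to make the device selection conditional instead of the unconditional net.cuda() call in inference.py; a minimal sketch with a stand-in model:

# Hypothetical sketch: fall back to CPU when no usable CUDA backend exists.
import torch
import torch.nn as nn

net = nn.Linear(4, 2)  # stand-in for the GazeNet model loaded by inference.py
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = net.to(device)   # replaces the unconditional net.cuda()
print("running on", device)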
