shahroudy / nturgb-d
Info and sample codes for "NTU RGB+D Action Recognition Dataset"
Hi,
Thank you very much for your wonderful work.
I am curious to know what NTU stands for.
Hi Respected community of GitHub,
I am a research student working on human action recognition and need help with the following:
Hi, could you provide “FloorClipPlane” data for each camera in each setup so that we can transform body joints from camera coordinates to world coordinates? Thanks.
Hello,
Thanks for your valuable work.
I cannot send a request for access to the dataset on your website,
because it shows '500 - Internal server error'.
Could you help me get the Action Recognition Dataset (3D skeletons / body joints), which is 5.8 GB?
I am an undergraduate student in the UK and would like to use it for research.
My email address is [email protected]
Some of the joint coordinates are negative. Can you please explain how these coordinates were calculated/ normalized?
I want to use my own data with a model pre-trained on NTU RGB+D, so I need to get it into the same format.
It would be great to know how to recover the original projectable coordinates from the normalized x, y coordinates. In other words, it would be helpful if you could provide the code used to normalize the x and y coordinates, respectively.
Hello, I ran into a problem using the four orientation values (orientationW, orientationX, orientationY, orientationZ) in the skeleton data. I want to compute the Euler rotation of each joint.
Am I right that these four values form each joint's orientation quaternion?
Or are they the relative quaternion between the current joint and the previous joint in the chain?
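For what it's worth, the Kinect v2 SDK reports joint orientations as absolute quaternions in camera space rather than deltas relative to the previous joint, though it is worth sanity-checking this against the data. Assuming unit quaternions in (W, X, Y, Z) order, a minimal conversion to Euler angles could look like:

```python
import math

def quaternion_to_euler(w, x, y, z):
    """Convert a unit quaternion (w, x, y, z) to (roll, pitch, yaw) in radians."""
    # roll: rotation about the x-axis
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # pitch: rotation about the y-axis (clamped to avoid asin domain errors)
    pitch = math.asin(max(-1.0, min(1.0, 2.0 * (w * y - z * x))))
    # yaw: rotation about the z-axis
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return roll, pitch, yaw
```

Note that joints whose four orientation values are all zero (as for some leaf joints in the skeleton files) carry no orientation and should be skipped.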
In the referenced paper, "NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis", a baseline model called P-LSTM is proposed. How does the proposed model handle the data when two persons are present in a single frame?
I couldn't find this information in the paper.
Thanks in advance.
I would like the following intrinsic parameters for each camera used.
If they are not available, how can I get the default intrinsic parameters?
{
'id': '',
'center': [],
'focal_length': [],
'radial_distortion': [],
'tangential_distortion': [],
'res_w': 0,
'res_h': 0,
'azimuth': 0, # Only used for visualization
}
When I applied to acquire the dataset through the website, I got the following error. Does anyone know how to solve it?
Microsoft OLE DB Provider for ODBC Drivers error '80040e14'
[Microsoft][ODBC Microsoft Access Driver] Syntax error in INSERT INTO statement.
/datasets/requesterAddProc.asp, line 134
Hello, I'm a graduate student collecting a dataset for skeleton and pose estimation. I want to label the true 3D joint positions in world coordinates, and I would like to know what method your team used to label the 3D joint positions. Did your team use the Kinect SDK? Thanks.
Hi, thanks for your great work. I want to visualize the NTU 3D joints, but I don't know the format of the dataset's 3D joints. Are they in world coordinates or camera coordinates?
I sincerely hope you can reply. Thank you in advance.
I want to know where I can download the code for the APSR framework mentioned in the NTU RGB+D 120 paper. Thank you very much.
Hi shahroudy, I want to know how to set the average RGB value when preprocessing the training data before feeding it to a network, and how to split the training and testing data. Thank you; I look forward to your reply.
Dear Amir,
Great work on the NTU RGB+D dataset! Your papers do not mention how the skeletons were extracted from the RGB-D data. Could you please shed some light on that?
Thanks and best regards,
Rahul
Hi,
Great dataset! It helped me a lot.
My current research direction is keypoint detection from IR images and depth maps. Could you provide the original 16-bit IR maps?
Sample: "S001C001P001R001A027"
Action: Jump Up
In the RGB video, a single person performs the action, but the skeleton file (S001C001P001R001A027.skeleton) contains joint values for two bodies. This is also the case for a few other samples, such as S001C001P001R001A028, S001C001P001R001A024, S001C001P001R001A026, and S001C001P001R001A029.
P.S. These are the cases I have found so far while examining the data.
Is this an error? If yes, is there any way to eliminate it?
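A follow-up note: the Kinect tracker is known to emit phantom skeletons (from furniture, reflections, and so on), which is most likely what these extra bodies are. A heuristic many preprocessing pipelines use, though it is not an official fix, is to keep only the body whose joints move the most across the clip. A sketch, assuming the skeletons have been parsed into per-body (T, 25, 3) arrays:

```python
import numpy as np

def select_main_body(bodies):
    """Given a dict {tracking_id: (T, 25, 3) joint array}, return the
    tracking ID of the body whose joints move the most, a common
    heuristic for dropping phantom skeletons in single-person actions."""
    def motion(joints):
        # total variance of every joint coordinate over time
        return joints.var(axis=0).sum()
    return max(bodies, key=lambda tid: motion(bodies[tid]))
```

For two-person actions (A050 onwards) you would keep the two bodies with the highest motion instead.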
Dear all,
a sample *.skeleton file contains:
52 1 72057594037932358 0 0 0 0 0 0 0.33916 -0.5899565 2 25 0.2410551 -0.171228 4.294691 283.0309 224.2974 1027.245 581.2726 -0.2574076 0.123495 0.9539005 -0.09254229 2 0.2948349 0.1012649 4.225985 288.0122 200.99 1041.686 513.5756 -0.2665294 0.1243754 0.9513921 -0.09135558 2 0.3484591 0.3697312 4.144117 293.2654 177.1354 1056.9 444.3341 -0.2775609 0.1362276 0.9421903 -0.1291483 2 0.3180007 0.4567953 4.108379 290.831 169.0964 1049.855 421.0554 0 0 0 0 2 0.1822655 0.321388 4.28747 278.0624 182.3637 1012.409 459.6171 -0.281578 -0.6474488 0.683909 0.1838272 2 0.07589981 0.1205466 4.396867 268.8376 199.7327 985.4812 510.1192 0.06034974 -0.5202456 0.3134047 0.7921364 2 -0.05407938 0.00915651 4.244675 257.8868 208.9526 954.2135 537.0081 -0.2676778 0.8216244 -0.430342 0.2609363 1 -0.07004753 -0.036596 4.183693 256.4259 212.9325 950.2018 548.5809 -0.34924 0.8069586 -0.2904053 0.3775105 1 0.466928 0.2580718 4.100758 304.1543 186.7378 1088.714 472.0902 -0.09214124 0.7651293 0.5880391 -0.2455546 2 0.5106769 0.02274908 4.083334 308.2432 207.7038 1100.851 532.8464 -0.06122642 0.9532829 0.07727568 0.2855372 2 0.431269 -0.1834366 3.992133 302.0117 226.5302 1083.29 587.5076 0.1165461 0.2440035 -0.23478 0.9336796 2 0.3857652 -0.2527936 4.01325 297.6592 232.7558 1070.653 605.611 0.3048452 0.2731479 0.0439797 0.9113317 2 0.1892584 -0.1588427 4.28756 278.6519 223.2652 1014.547 578.3298 -0.05109064 -0.6076694 0.7827132 -0.1244497 2 -0.08986153 -0.09259083 4.157754 254.648 217.8683 945.1703 562.9314 0.7761323 -0.2254343 -0.05312065 0.5864948 2 -0.02291749 -0.4356488 4.294297 260.5873 246.7992 962.2934 646.7827 -0.1665024 -0.450032 0.1231817 0.8686624 2 -0.08422279 -0.5146305 4.21392 255.2325 254.3715 947.0784 668.7665 0 0 0 0 2 0.287944 -0.1807353 4.230647 287.3916 225.3406 1040.106 584.2456 -0.3058956 0.6702745 0.5890362 -0.3319583 2 0.1274008 -0.2041289 3.941398 274.338 228.6485 1003.222 594.0005 0.3023287 -0.6019208 0.6110057 -0.4158854 2 0.1858304 -0.4719123 4.131247 278.9752 
251.4849 1016.218 660.1216 0.2207259 0.8369746 0.222939 0.4483881 2 0.1228096 -0.5504664 4.050951 273.6197 259.4163 1001.016 683.1406 0 0 0 0 2 0.3351774 0.303002 4.167017 291.9237 183.174 1053.013 461.8513 -0.2765343 0.1310133 0.9459615 -0.1073365 2 -0.1119047 -0.07854194 4.138204 252.6663 216.6677 939.4634 559.4663 0 0 0 0 2 -0.04821506 -0.02453206 4.158201 258.3048 211.8932 955.7338 545.5428 0 0 0 0 2 0.336823 -0.3180895 4.01796 293.1675 238.667 1057.662 622.8026 0 0 0 0 2 0.4231775 -0.249493 4.055667 300.6671 232.2204 1079.216 604.0177 0 0 0 0 2 1
I would appreciate it if you could add a description of each field.
I do not know whether the data represents joint position coordinates or orientations (or both).
Best Regards
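For reference, the numbers above follow the layout decoded by the repository's read_skeleton_file.m: the first value is the frame count; each frame starts with a body count; each body starts with a header (tracking ID, clipedEdges, hand confidences/states, isRestricted, leanX/leanY, trackingState), followed by a joint count (normally 25) and one 12-value record per joint: 3D camera coordinates (x, y, z), depth-map pixel coordinates (depthX, depthY), RGB-frame pixel coordinates (colorX, colorY), an orientation quaternion (orientationW/X/Y/Z), and a joint tracking state. So the data contains both positions and orientations. A minimal Python reader along these lines (field names taken from the Matlab reader):

```python
def parse_skeleton(path):
    """Parse an NTU RGB+D *.skeleton file into a list of frames,
    each frame being a list of body dicts. Field order follows the
    repository's read_skeleton_file.m."""
    with open(path) as f:
        tokens = iter(f.read().split())
    nxt = lambda: next(tokens)
    frames = []
    for _ in range(int(nxt())):                    # frame count
        bodies = []
        for _ in range(int(nxt())):                # body count in this frame
            body = {
                'bodyID': nxt(),                   # keep as string (huge int)
                'clipedEdges': int(nxt()),
                'handLeftConfidence': int(nxt()),
                'handLeftState': int(nxt()),
                'handRightConfidence': int(nxt()),
                'handRightState': int(nxt()),
                'isRestricted': int(nxt()),
                'leanX': float(nxt()),
                'leanY': float(nxt()),
                'trackingState': int(nxt()),
                'joints': [],
            }
            for _ in range(int(nxt())):            # joint count, normally 25
                vals = [float(nxt()) for _ in range(11)] + [int(nxt())]
                keys = ('x', 'y', 'z',             # 3D camera coordinates
                        'depthX', 'depthY',        # pixel position in depth map
                        'colorX', 'colorY',        # pixel position in RGB frame
                        'orientationW', 'orientationX',
                        'orientationY', 'orientationZ',
                        'trackingState')
                body['joints'].append(dict(zip(keys, vals)))
            bodies.append(body)
        frames.append(bodies)
    return frames
```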
Is there any information at all on when the action begins and ends? Or has anyone come up with a heuristic to determine which part of the clip contains the action?
I ask because, from what I have observed, in most clips the action does not begin immediately, but rather about 1/3 to 1/2 of the way into the clip (these numbers come from about 10 clips, so I cannot say anything about their significance). Counting the idle frames towards the action might add noise to the data.
Does the dataset provide any information on when the action happens, or do I need to resort to heuristics along the lines of what I mentioned above?
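To my knowledge the dataset does not ship frame-level timing, so a heuristic is indeed the usual route. One simple sketch along the lines described above trims leading and trailing low-motion frames by thresholding the mean per-frame joint displacement (the threshold here is a made-up starting value to tune, not anything official):

```python
import numpy as np

def trim_idle_frames(joints, threshold=0.002):
    """joints: (T, 25, 3) skeleton sequence. Returns (start, end) frame
    indices bounding the frames whose inter-frame joint motion exceeds
    the threshold; a simple energy heuristic, not ground-truth timing."""
    # mean Euclidean displacement of the joints between consecutive frames
    motion = np.linalg.norm(np.diff(joints, axis=0), axis=2).mean(axis=1)
    active = np.where(motion > threshold)[0]
    if active.size == 0:
        return 0, len(joints)          # nothing above threshold: keep all
    # +1 to convert diff index to frame index, +1 to make end exclusive
    return int(active[0]), int(active[-1]) + 2
```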
I have seen similar queries but will try to be more specific.
I want to align the RGB and depth images for the purposes of RGB-D training.
At first look, the depth image seems to correspond to a rectangular region of the RGB image that is consistent across an entire setup, i.e. the RGB image can simply be cropped (and resized) to match the region for which depth is shared.
Can you provide the parameters of such a crop?
If the RGB-to-depth correspondence is known, the dataset becomes very useful for testing depth fusion, single-view reconstruction, tracking, etc.
I understand that sharing full-frame depth maps is not memory efficient, but the authors could instead crop the RGB images to align with the depth maps and share an even smaller dataset.
R
Hi, thank you for providing this awesome dataset!
I want to re-project the depth image to the 3D point cloud.
How do I get the information of intrinsic parameters of each Kinect?
Thanks in advance for considering!
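Since no official per-camera calibration was released with the dataset, a common workaround is to use factory-default Kinect v2 depth intrinsics, e.g. the values shipped with libfreenect2 (fx = fy ≈ 365.456, cx ≈ 254.878, cy ≈ 205.395 for the 512×424 depth camera). Each physical device deviates slightly from these, so treat the result as approximate. A pinhole back-projection sketch:

```python
import numpy as np

# Approximate Kinect v2 depth-camera intrinsics (libfreenect2 defaults).
# These are a stand-in, NOT calibration values released with the dataset.
FX, FY, CX, CY = 365.456, 365.456, 254.878, 205.395

def depth_to_pointcloud(depth_mm):
    """Back-project a (424, 512) depth map in millimetres to an (N, 3)
    point cloud in metres using the pinhole camera model."""
    h, w = depth_mm.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_mm / 1000.0
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]          # drop invalid (zero-depth) pixels
```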
I only care about a subset of this dataset (e.g. falling down). The full dataset is too big and most of it is not useful to me, so to save disk space and download time, is there any way to download only a subset of the dataset?
Dear friend, I can't get the dataset from the given website because I can't submit the information; I get a '500 - Internal server error'. Could you give me a login ID and password to download the dataset? My email is [email protected]. I can send you my information to prove that I will use the dataset for academic research. Thank you very much!
Dear friend, I have a question about preprocessing the RGB data, which has size (1920, 1080, 3). I want to get a (224, 224, 3) input, but for frames containing two people I don't know how to handle the crop.
If you have used the RGB data directly, how did you preprocess it?
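The authors' exact RGB preprocessing is not documented here, but one common recipe (not an official one) is to crop a square region around the union of all bodies' joint pixel coordinates (colorX/colorY from the skeleton files) so that both performers stay in frame, then resize:

```python
import numpy as np

def crop_around_bodies(frame, color_xy, margin=50):
    """frame: (H, W, 3) RGB image; color_xy: (N, 2) colorX/colorY pixel
    coordinates of all joints of all bodies in this frame. Returns a
    square crop around the joints (covering both people when two are
    present); a common recipe, not the authors' official preprocessing."""
    h, w = frame.shape[:2]
    x0, y0 = np.maximum(color_xy.min(axis=0) - margin, 0).astype(int)
    x1, y1 = np.minimum(color_xy.max(axis=0) + margin, [w, h]).astype(int)
    # make the crop square so the later resize does not distort the bodies
    side = max(x1 - x0, y1 - y0)
    cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
    x0 = max(0, min(cx - side // 2, w - side))
    y0 = max(0, min(cy - side // 2, h - side))
    return frame[y0:y0 + side, x0:x0 + side]
```

The returned square crop can then be resized to (224, 224) with e.g. cv2.resize or PIL before being fed to the network.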
(2) I want to align the RGB and depth frames. Was any camera calibration data recorded?
Unfortunately, no camera calibration info was recorded. However, one applicable solution is to use the skeletal data. For each video sample, the skeletal data includes a large number of body joints and their precise locations in both the RGB and depth frames, so each sample gives you a large number of mappings. Keep in mind that the cameras were fixed during each setup (Sxxx in the file names means the sample is from setup xxx). So for each camera at each setup you have a huge number of correspondences between the RGB and depth cameras (and also between the three sensors!). Finding a transformation between the cameras is then as easy as solving a linear system with a lot of known points!
Is the transformation a linear transformation?
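To the linearity question: the true depth-to-color mapping also depends on scene depth, so it is not exactly linear in pixel coordinates alone, but since subjects stay within a limited depth range, a least-squares affine fit over the paired (depthX, depthY) / (colorX, colorY) joint coordinates is a workable first approximation of the linear-system idea above. A sketch:

```python
import numpy as np

def fit_affine(depth_xy, color_xy):
    """Least-squares 2D affine map (colorX, colorY) ~ A @ (depthX, depthY) + b,
    fitted from the paired joint pixel coordinates in the skeleton files.
    The exact mapping also depends on depth, so this is an approximation."""
    X = np.hstack([depth_xy, np.ones((len(depth_xy), 1))])   # homogeneous coords
    params, *_ = np.linalg.lstsq(X, color_xy, rcond=None)
    A, b = params[:2].T, params[2]
    return A, b

def apply_affine(A, b, depth_xy):
    """Map (N, 2) depth pixel coordinates into the color frame."""
    return depth_xy @ A.T + b
```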
Hi Shahroudy
Could you please provide me with the Action-Part Semantic Relevance-aware (APSR) framework code for experimentation?
Thanks
Rama
Hi, thanks for the converter to the .npy extension. It looks like the logical check on line 123 should be checking whether each + '.npy' is in alread_exist_dict, rather than each + '.skeleton.npy'. For me it was always seeing (e.g.) 'S001C001P001R001A003.skeleton.skeleton.npy' and was always overwriting the numpy files.
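For readers hitting the same issue, the gist of the fix is to strip the '.skeleton' extension before building or checking the .npy name. A hypothetical helper illustrating the idea (not the converter's actual code):

```python
import os

def npy_name(skeleton_filename):
    """Build the .npy output name from a .skeleton file name, dropping the
    '.skeleton' extension first so the result is 'Sxxx...Axxx.npy' rather
    than 'Sxxx...Axxx.skeleton.npy' (illustrative helper, not the
    converter's real code)."""
    base, _ = os.path.splitext(skeleton_filename)   # drop '.skeleton'
    return base + '.npy'
```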
I want to use data from only 5 classes, but I don't know how to do it.
For example, I need the data for classes 0, 1, 2, 3, and 4. How could I implement that? Thanks.
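Assuming your class indices 0-4 map to action labels A001-A005 (the Axxx field in each file name is 1-indexed), you can select the subset simply by parsing the file names, e.g.:

```python
import re

def action_class(filename):
    """Extract the zero-based action label from an NTU sample name,
    e.g. 'S001C001P001R001A005.skeleton' -> 4 (Axxx is 1-indexed)."""
    return int(re.search(r'A(\d{3})', filename).group(1)) - 1

def filter_classes(filenames, wanted=frozenset({0, 1, 2, 3, 4})):
    """Keep only the samples whose action class is in `wanted`."""
    return [f for f in filenames if action_class(f) in wanted]
```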
I appreciate your work very much. There are some questions that puzzle me.
In the raw data and in read_skeleton_file.m, there are some parameters whose meaning I can't really understand, as follows:
body.clipedEdges
body.handLeftConfidence
body.handLeftState
body.handRightConfidence
body.handRightState
body.isResticted
body.leanX
body.leanY
joint.orientationW
joint.orientationX
joint.orientationY
joint.orientationZ
What does each parameter mean? And how can we obtain these parameters from the sensor?
Looking forward to your reply!
Hello, the design of your dataset is very clever.
I appreciate your work very much; I would like to ask two questions.
Looking forward to your answer. Thank you.
Hi, I'm interested in human action recognition using skeleton data.
When I look into the data I see some misleading joint positions due to the camera's side view.
As in the image below, the right hand is occluded, but the Kinect estimates a woman drinking water with two hands.
So I'm curious whether it is okay to feed this kind of misleading joint data into an LSTM model.
Hi, I really appreciate your work, but I would like to know: is there any way I can download the skeleton data only?
Hi, is there any code to register the RGB and depth images? They have different aspect ratios. Is there any code to extract the common area of both images?
Thanks for your amazing work.
I am now working on NTU RGB+D, but I don't know its baseline.
As far as I know, the baseline is about 85.5% (CVPR 2018), but I am not quite sure.
Does anyone know?
Hi, thank you for providing this awesome dataset!
I want to re-project the 3D skeletons to 2D.
How do I get the information of intrinsic parameters of each Kinect?
Thanks in advance for considering!
Hello, I have a confusion about the tracking ID of the skeleton.
For a subject in a video, is the tracking ID of that subject's skeleton constant for the entire video?
E.g., in the file S010C003P025R001A060.skeleton, there is initially one subject whose tracking ID is "72057594037930227". When I jump to line 1000, I see skeleton coordinates for two subjects, whose tracking IDs are "72057594037930243" and "72057594037930244".
What does this mean?
Are there three subjects in the video, or was tracking stopped in between and restarted, giving the subjects new tracking IDs?
If the latter is the case, which new ID corresponds to the subject who was initially in the video?
Please help me out with this, as it is difficult to choose which subject should be considered for the action if I can't match each set of coordinates to its subject.
Dear friend, I submitted a request for the NTU RGB+D dataset online on February 1st, 2019, but I have not received any email as of today (February 7th, 2019). My requester ID is A1941. Please let me know if I did not pass the verification; maybe the information I filled in was not complete enough. I want to reapply online, but
I can't resubmit the information due to a '500 - Internal server error'. Could you give me a login ID and password to download the dataset? My email is [email protected].
I can send you my information to prove that I will use the dataset for academic research. Thank you very much!
Thank you for your great work on these two datasets. I have a question: how do you ensure the accuracy of the 25 key points? Does the performer need to wear markers while performing the action? If possible, could you please share some details of capturing the dataset, or a simple walkthrough of how to use Kinect v2 to capture data the same way as NTU RGB+D?
Hi,
After downloading the new extended dataset, I found that the subjects from the previous 60 classes also performed the new 60 action classes, but the new subjects did not perform the previous 60 classes. So when the dataset is split (e.g., for cross-setup evaluation), is there any gap when making predictions for the new subjects?
Hi,
Is it possible to add new data (a new activity) to the NTU dataset? If yes, is there any sample code to generate new skeleton data from videos?
How can I get the camera extrinsic parameters for each subject, in a format similar to the following?
As you can see, there is an entry for each performer, covering the three cameras:
{
'P001': [
{
'orientation': [0.1407056450843811, -0.1500701755285263, -0.755240797996521, 0.6223280429840088],
'translation': [1841.1070556640625, 4955.28466796875, 1563.4454345703125],
},
{
'orientation': [0.6157187819480896, -0.764836311340332, -0.14833825826644897, 0.11794740706682205],
'translation': [1761.278564453125, -5078.0068359375, 1606.2650146484375],
},
{
'orientation': [0.14651472866535187, -0.14647851884365082, 0.7653023600578308, -0.6094175577163696],
'translation': [-1846.7777099609375, 5215.04638671875, 1491.972412109375],
},
],
'P002': [
{
'orientation': [0.1407056450843811, -0.1500701755285263, -0.755240797996521, 0.6223280429840088],
'translation': [1841.1070556640625, 4955.28466796875, 1563.4454345703125],
},
{
'orientation': [0.6157187819480896, -0.764836311340332, -0.14833825826644897, 0.11794740706682205],
'translation': [1761.278564453125, -5078.0068359375, 1606.2650146484375],
},
{
'orientation': [0.14651472866535187, -0.14647851884365082, 0.7653023600578308, -0.6094175577163696],
'translation': [-1846.7777099609375, 5215.04638671875, 1491.972412109375],
},
],
...
}
Hey Amir, are there any plans to upload the code used to get the baseline or the state-of-the-art results from either "NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis" or "Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition"?
Hello, I have a question about the relation between the 3D joint data and color_x, color_y. I want to use the inference results of a simple baseline model, which only outputs color_x and color_y, to replace the ground-truth NTU 3D joints. However, I don't know how to transform color_x, color_y into joint.x, joint.y.