Translation of the camera (SPEC issue, closed)

mkocabas commented on May 29, 2024
Translation of the camera

from spec.

Comments (4)

mkocabas commented on May 29, 2024

Hi @Dene33,

Here is the line where we construct the camera extrinsics transformation for rendering:

camera_pose[:3, 3] = camera_rotation @ camera_translation

You can see the camera translation there.

Basically, CamCalib estimates the camera rotation and focal length, and the SPEC model then estimates the camera translation with respect to the human body. You can equally interpret this as the translation of the body with respect to the camera. The latter interpretation is more useful in multi-person cases, where you want to assume that the camera sits at the origin and each human body instance is translated with respect to it. Hope this helps!
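To make the two interpretations concrete, here is a minimal sketch rather than the actual SPEC rendering code. Only the translation line is taken from the comment above. The function names, the way the rotation block is filled in, and the numeric values are assumptions made for illustration.

```python
import numpy as np


def build_camera_pose(camera_rotation, camera_translation):
    """Assemble a 4x4 homogeneous camera pose matrix.

    Only the translation column follows the line quoted in the thread.
    Copying the rotation into the upper-left 3x3 block is an assumption.
    """
    camera_pose = np.eye(4)
    camera_pose[:3, :3] = camera_rotation  # assumed, not quoted in the thread
    camera_pose[:3, 3] = camera_rotation @ camera_translation  # quoted line
    return camera_pose


def body_in_camera_frame(vertices, camera_translation):
    """Second interpretation: keep the camera at the origin and shift the body."""
    # (N, 3) vertices plus a (3,) translation broadcasts over all vertices.
    return vertices + camera_translation


# Made-up values, for illustration only.
rotation = np.eye(3)
translation = np.array([0.0, 0.0, 5.0])

pose = build_camera_pose(rotation, translation)
vertices = np.zeros((2, 3))
shifted = body_in_camera_frame(vertices, translation)
```

With several people in one image, the second helper would be called once per person with that person's own translation, while the camera stays fixed at the origin.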

RichardChen20 commented on May 29, 2024

Hi! @mkocabas

I noticed that CamCalib predicts several camera parameters: cam_rotmat, cam_int, cam_vfov, cam_pitch, cam_roll, cam_focal_length.
Q1: Is cam_rotmat calculated from cam_pitch and cam_roll?
Q2: Why is the yaw angle not needed?
Q3: What does cam_int mean? Why is it a 3x3 matrix?

Also, in the SMPL results from SPEC, I noticed pred_cam and pred_cam_t.
Q4: What do pred_cam and pred_cam_t represent? Are they Euler angles that define how to rotate the mesh towards the camera's x-axis?

I hope you can help me solve these questions. Thanks a lot!

mkocabas commented on May 29, 2024

Q1: Yes, it is. And here is the function where we convert pitch, roll -> cam_rotmat: SPEC/cam_params.py at d2fe2c264c72c98a5f479fc36f74bdd5f45427da · mkocabas/SPEC (github.com)
Q2: Estimating yaw from a single image is ill-posed. The horizon gives us a common reference for estimating roll and pitch, but there is no comparable reference for yaw.
Q3: It means the camera intrinsic parameters constructed as such: https://github.com/mkocabas/SPEC/blob/d2fe2c264c72c98a5f479fc36f74bdd5f45427da/spec/utils/cam_params.py#L39-43
Q4: They are the estimated camera translation, pred_cam is [s, tx, ty], pred_cam_t is [tx, ty, tz]. And here is how we convert from pred_cam to pred_cam_t: PARE/smpl_cam_head.py at master · mkocabas/PARE (github.com). Hence they are not related to camera rotation.
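The answers above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the code from the linked files. The axis convention and multiplication order in `pitch_roll_to_rotmat`, the principal point at the image center, the default `focal_length=5000.0` and `img_res=224`, and all function names are assumptions. The pred_cam conversion uses the weak-perspective formula common in HMR-style models.

```python
import numpy as np


def pitch_roll_to_rotmat(pitch, roll):
    """Rotation matrix from pitch and roll in radians, with yaw fixed to zero.

    Assumes pitch rotates about the x-axis and roll about the z-axis.
    The actual convention lives in the linked cam_params.py.
    """
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    rot_z = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])
    return rot_x @ rot_z


def vfov_to_focal_length(vfov, img_h):
    """Pinhole relation between vertical field of view (radians) and focal length (pixels)."""
    return img_h / (2.0 * np.tan(vfov / 2.0))


def build_intrinsics(focal_length, img_w, img_h):
    """3x3 intrinsic matrix with the principal point assumed at the image center."""
    cam_int = np.eye(3)
    cam_int[0, 0] = focal_length
    cam_int[1, 1] = focal_length
    cam_int[0, 2] = img_w / 2.0
    cam_int[1, 2] = img_h / 2.0
    return cam_int


def weak_perspective_to_translation(pred_cam, focal_length=5000.0, img_res=224):
    """Convert pred_cam = [s, tx, ty] into pred_cam_t = [tx, ty, tz].

    Depth is recovered from the scale: tz = 2 * f / (img_res * s).
    The default focal_length and img_res are assumed values.
    """
    s, tx, ty = pred_cam
    tz = 2.0 * focal_length / (img_res * s + 1e-9)  # epsilon avoids division by zero
    return np.array([tx, ty, tz])


# Made-up inputs, for illustration only.
cam_rotmat = pitch_roll_to_rotmat(np.deg2rad(-11.5), np.deg2rad(-1.1))
focal = vfov_to_focal_length(np.pi / 2.0, img_h=2.0)
cam_int = build_intrinsics(focal, img_w=4.0, img_h=2.0)
pred_cam_t = weak_perspective_to_translation(np.array([1.0, 0.1, 0.2]))
```

The intrinsic matrix is 3x3 because it maps a 3D point in camera coordinates to homogeneous pixel coordinates. None of the pred_cam entries is an angle, which matches the answer to Q4.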

RichardChen20 commented on May 29, 2024

@mkocabas Thank you for your answer. I think I'm very close to understanding everything, but a few details are still unclear. I hope you can help me further!
[Image: QQ screenshot 20220907161028]
Here is one example. I manually set the bbox to cover the whole input image. The estimates are pitch=-11.5, roll=-1.1, pred_cam_t = [1.768, 0.031, 0.1115]. I'm not sure how these values relate to one another.
I visualized the estimated 3D mesh and noticed that the root joint is very close to the origin. Which coordinate system is the 3D mesh in? How do I convert it into camera coordinates? Given 3D joints X, does RX + t convert them into camera coordinates?

[Image: QQ screenshot 20220907170103]

Besides, how are the pitch and roll calculated? Do they follow the same convention as in the picture shown? I still haven't figured out the camera coordinate setup: does the Zc axis point towards the bbox center? SPEC estimates a horizon line. Do you mean that the camera center is placed on this horizontal plane, and that pitch and roll are then used to rotate the camera coordinate frame so that the Zc axis faces the human body or the bbox center?

I hope you can help me, thanks a lot!
