Extracting 3D information from 2D images and reconstructing it in 3D is called "Structure from Motion" (structure = 3D structure, motion = camera pose). Structure from Motion (SfM) reconstructs the 3D structure of a scene from a sequential set of images, as shown in the picture. The problem is how to reconstruct the 3D scene structure accurately and how to recover the camera pose (where each picture was taken, i.e. the extrinsic parameters) together with the camera intrinsic parameters.
This project is a Python implementation of SfM with only two views. The code is based on OpenCV and NumPy.
git clone https://github.com/sunwoochan/sfm.git
Download the example dataset and run sfm.ipynb
38 images of Nutella, plus the camera intrinsic parameters
- Start from 2 views
- Feature extraction with SIFT
- Camera initialization : Find the essential matrix with RANSAC
- Assume a canonical camera setup, where the first camera matrix is P = [ I | 0 ] (a 3x4 matrix)
- From the derivation, there are 4 possible candidates for the second camera matrix P'
- Triangulate the 2D point correspondences from the first two views and keep the physically valid P' (the one that places the points in front of both cameras)
- Triangulation : Lift 2D points to 3D space using camera matrices
- Given a set of (noisy) matched points : {xi, x'i}
- Camera matrices : P, P'
- Estimate the 3D point : X
- Structure and motion adjustment (a.k.a. Bundle adjustment)
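
The camera initialization and triangulation steps above can be sketched roughly as follows. This is a minimal NumPy-only illustration, not the code from sfm.ipynb: the function names (`candidate_poses`, `triangulate`, `select_pose`) are hypothetical, the matched points are assumed to be in normalized image coordinates (intrinsics already removed), and the second camera is assumed to follow the convention x' ~ R X + t. Feature extraction, RANSAC, and bundle adjustment are left out, and the essential matrix is built from a synthetic pose instead of being estimated.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def candidate_poses(E):
    """Return the 4 (R, t) candidates for P' = [R | t] encoded by E."""
    U, _, Vt = np.linalg.svd(E)
    # E is only defined up to sign, so flip factors to get proper rotations.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]  # translation direction, known only up to sign and scale
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one correspondence x1 <-> x2."""
    A = np.stack([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]  # null vector of A (homogeneous 3D point)
    return X[:3] / X[3]

def select_pose(E, pts1, pts2):
    """Pick the candidate that puts the most points in front of both cameras."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])  # canonical P = [I | 0]
    best = (None, None, None)
    best_count = -1
    for R, t in candidate_poses(E):
        P2 = np.hstack([R, t.reshape(3, 1)])
        X = np.array([triangulate(P1, P2, a, b) for a, b in zip(pts1, pts2)])
        depth1 = X[:, 2]                 # depth in the first camera
        depth2 = (X @ R.T + t)[:, 2]     # depth in the second camera
        count = int(np.sum((depth1 > 0) & (depth2 > 0)))
        if count > best_count:
            best, best_count = (R, t, X), count
    return best

# Synthetic two-view setup (made-up pose and points, for illustration only).
rng = np.random.default_rng(0)
angle = 0.1
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_true = np.array([1.0, 0.0, 0.2])
t_true /= np.linalg.norm(t_true)  # scale is unobservable, so use a unit baseline

X_true = np.column_stack([rng.uniform(-1, 1, 20),
                          rng.uniform(-1, 1, 20),
                          rng.uniform(4, 8, 20)])
X_cam2 = X_true @ R_true.T + t_true
pts1 = X_true[:, :2] / X_true[:, 2:]   # projections in the first view
pts2 = X_cam2[:, :2] / X_cam2[:, 2:]   # projections in the second view

E = skew(t_true) @ R_true
R_est, t_est, X_est = select_pose(E, pts1, pts2)
```

With real images, the matched points would come from SIFT matching and E from a RANSAC estimate, so the correspondences are noisy and the depth test is only a majority vote. That is why the sketch counts points in front of both cameras instead of requiring all of them to pass.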