Camera resectioning: Difference between revisions

Revision as of 09:49, 11 December 2009

Camera resectioning (often called camera calibration) is the process of finding the true parameters of the camera that produced a given photograph or video. Usually, the camera parameters are represented in a 3 × 4 matrix called the camera matrix.

Parameters of camera model

Usually, $[u\,v\,1]^{\top }$ represents a 2D point position in pixel coordinates and $[x_{w}\,y_{w}\,z_{w}\,1]^{\top }$ represents a 3D point position in world coordinates. Note: both are expressed in the augmented notation of homogeneous coordinates, which is the most common notation in robotics and rigid body transforms. Referring to the pinhole camera model, a camera matrix denotes a projective mapping from world coordinates to pixel coordinates, up to the scale factor $z_{c}$, the depth of the point along the camera's optical axis.

z_{c}{\begin{bmatrix}u\\v\\1\end{bmatrix}}=A{\begin{bmatrix}R&T\end{bmatrix}}{\begin{bmatrix}x_{w}\\y_{w}\\z_{w}\\1\end{bmatrix}}

Intrinsic parameters

A={\begin{bmatrix}\alpha _{x}&\gamma &u_{0}\\0&\alpha _{y}&v_{0}\\0&0&1\end{bmatrix}}

The intrinsic matrix $A$ contains 5 intrinsic parameters. These parameters encompass focal length, image format, and principal point. The parameters $\alpha _{x}=f\cdot m_{x}$ and $\alpha _{y}=f\cdot m_{y}$ represent focal length in terms of pixels, where $m_{x}$ and $m_{y}$ are the scale factors relating pixels to distance. $\gamma$ is the skew coefficient between the x and y axes, and $(u_{0},v_{0})$ is the principal point. ^[1]
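As a minimal sketch of how these parameters combine, the following builds $A$ from physical quantities and applies it to a point already expressed in camera coordinates. The function names and the example numbers (a 4 mm lens, 200 pixels per mm, a 640 × 480 image) are illustrative assumptions, not values from any particular camera.

```python
def intrinsic_matrix(f, m_x, m_y, u0, v0, gamma=0.0):
    """Build the 3x3 intrinsic matrix A (hypothetical helper)."""
    alpha_x = f * m_x  # focal length in pixels along x
    alpha_y = f * m_y  # focal length in pixels along y
    return [
        [alpha_x, gamma, u0],
        [0.0, alpha_y, v0],
        [0.0, 0.0, 1.0],
    ]


def to_pixels(A, point_cam):
    """Map a point given in camera coordinates to pixel coordinates."""
    x, y, z = point_cam
    # Homogeneous image point A @ [x, y, z]^T
    hom = [A[i][0] * x + A[i][1] * y + A[i][2] * z for i in range(3)]
    # Divide by the third component to leave homogeneous coordinates.
    return hom[0] / hom[2], hom[1] / hom[2]


# Assumed example camera: f in mm, m_x and m_y in pixels per mm.
A = intrinsic_matrix(f=4.0, m_x=200.0, m_y=200.0, u0=320.0, v0=240.0)
u, v = to_pixels(A, (0.1, -0.05, 2.0))
print("alpha_x, alpha_y:", A[0][0], A[1][1])
print("pixel:", u, v)
```

Keeping $f$ and $m_{x}$, $m_{y}$ as separate arguments mirrors the definition $\alpha _{x}=f\cdot m_{x}$; a calibration routine would normally estimate the products directly, since only they are observable from images.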

Nonlinear intrinsic parameters such as lens distortion are also important, although they cannot be included in the linear camera model described by the intrinsic parameter matrix. Many modern camera calibration algorithms estimate these nonlinear parameters as well.

Extrinsic parameters

${\textbf {R}},T$ are the extrinsic parameters, which denote the coordinate system transformation from 3D world coordinates to 3D camera coordinates. Equivalently, the extrinsic parameters define the position of the camera center and the camera's heading in world coordinates. Note that $T$ is not the position of the camera: it is the position of the world origin expressed in camera coordinates, and the camera center in world coordinates is $C=-{\textbf {R}}^{\top }T$.
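The full mapping from world coordinates to pixel coordinates, and the relation between $T$ and the camera center, can be sketched as follows. This is an illustration under assumed values: the matrices `A`, `R`, `T` and the function names are made up for the example, and the camera center is computed with the standard identity $C=-{\textbf {R}}^{\top }T$.

```python
def world_to_pixel(A, R, T, Xw):
    """Apply z_c [u, v, 1]^T = A [R T] [x_w, y_w, z_w, 1]^T."""
    # Extrinsics: world coordinates -> camera coordinates.
    Xc = [sum(R[i][k] * Xw[k] for k in range(3)) + T[i] for i in range(3)]
    # Intrinsics: camera coordinates -> homogeneous pixel coordinates.
    hom = [sum(A[i][k] * Xc[k] for k in range(3)) for i in range(3)]
    z_c = hom[2]  # equals the depth because the last row of A is [0, 0, 1]
    return hom[0] / z_c, hom[1] / z_c, z_c


def camera_center(R, T):
    """Camera center in world coordinates, C = -R^T T."""
    return [-sum(R[k][i] * T[k] for k in range(3)) for i in range(3)]


# Assumed example values (not from a real calibration).
A = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # 90 degrees about z
T = [1.0, 2.0, 5.0]

u, v, z_c = world_to_pixel(A, R, T, [0.0, 0.0, 0.0])
C = camera_center(R, T)
print("world origin projects to:", u, v, "at depth", z_c)
print("camera center in world coordinates:", C)
```

In this example the world origin lands at camera coordinates equal to $T$, while the camera center `C` differs from $T$, which is the distinction the paragraph above draws.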

Camera calibration is often used as an early stage in computer vision and especially in the field of augmented reality.

When a camera is used, light from the environment is focused on an image plane and captured. This process reduces the dimensions of the data taken in by the camera from three to two (light from a 3D scene is stored on a 2D image). Each pixel on the image plane therefore corresponds to a ray of light from the original scene. Camera resectioning determines which incoming ray is associated with each pixel on the resulting image. In an ideal pinhole camera, a simple projection matrix is enough to do this. With more complex camera systems, errors resulting from misaligned lenses and deformations in their structures can result in more complex distortions in the final image.

The camera projection matrix is derived from the intrinsic and extrinsic parameters of the camera, and is often represented as a series of transformations, e.g., a matrix of camera intrinsic parameters, a 3 × 3 rotation matrix, and a translation vector. The camera projection matrix can be used to associate points in a camera's image space with locations in 3D world space.
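To make the pixel-to-ray association concrete, here is a small sketch for an ideal pinhole camera with no lens distortion. The function name `pixel_ray` and the example matrices are assumptions for illustration; the ray is returned as an origin (the camera center) and a direction, both in world coordinates.

```python
def pixel_ray(A, R, T, u, v):
    """Return (origin, direction) of the world-space ray seen by pixel (u, v)."""
    # Solve A d = [u, v, 1]^T by back substitution (A is upper triangular).
    d_z = 1.0
    d_y = (v - A[1][2] * d_z) / A[1][1]
    d_x = (u - A[0][1] * d_y - A[0][2] * d_z) / A[0][0]
    d_cam = [d_x, d_y, d_z]
    # Rotate the direction back into world coordinates with R^T.
    direction = [sum(R[k][i] * d_cam[k] for k in range(3)) for i in range(3)]
    # The ray starts at the camera center C = -R^T T.
    origin = [-sum(R[k][i] * T[k] for k in range(3)) for i in range(3)]
    return origin, direction


# Assumed example camera (illustrative values only).
A = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
T = [1.0, 2.0, 5.0]

origin, direction = pixel_ray(A, R, T, 480.0, 560.0)
# Because the camera-frame direction has z = 1, the ray parameter is the depth.
depth = 5.0
point = [origin[i] + depth * direction[i] for i in range(3)]
print("ray origin:", origin)
print("ray direction:", direction)
print("point at depth 5:", point)
```

A single pixel only fixes the ray, not the position along it; recovering the depth needs extra information such as a second view or a known scene plane.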

Camera resectioning is often used in stereo vision, where the camera projection matrices of two cameras are used to calculate the 3D world coordinates of a point viewed by both cameras.
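The stereo use case can be sketched with a simple linear triangulation: each view contributes two linear equations in the unknown 3D point, and the stacked system is solved in the least-squares sense. This is one common formulation chosen for illustration, not necessarily the method any given system uses; the helper names and the two example cameras are assumptions.

```python
def projection_matrix(A, R, T):
    """3x4 camera matrix P = A [R T]."""
    Rt = [R[i] + [T[i]] for i in range(3)]
    return [[sum(A[i][k] * Rt[k][j] for k in range(3)) for j in range(4)]
            for i in range(3)]


def project(P, X):
    """Project a 3D world point to pixel coordinates."""
    Xh = list(X) + [1.0]
    hom = [sum(P[i][k] * Xh[k] for k in range(4)) for i in range(3)]
    return hom[0] / hom[2], hom[1] / hom[2]


def solve3(M, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    aug = [M[i][:] + [b[i]] for i in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (aug[i][3] - sum(aug[i][j] * x[j] for j in range(i + 1, 3))) / aug[i][i]
    return x


def triangulate(P1, uv1, P2, uv2):
    """Least-squares 3D point from two pixel observations."""
    rows = []
    for P, (u, v) in ((P1, uv1), (P2, uv2)):
        # u * (row 3 of P) - (row 1 of P) and the same for v, applied to [X, 1].
        rows.append([u * P[2][k] - P[0][k] for k in range(4)])
        rows.append([v * P[2][k] - P[1][k] for k in range(4)])
    # Normal equations for the 4x3 system rows[:, :3] X = -rows[:, 3].
    N = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [-sum(r[i] * r[3] for r in rows) for i in range(3)]
    return solve3(N, rhs)


# Two assumed cameras with identical intrinsics, separated along x.
A = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
P1 = projection_matrix(A, I3, [0.0, 0.0, 0.0])
P2 = projection_matrix(A, I3, [-1.0, 0.0, 0.0])

X_true = [0.5, 0.2, 4.0]
X_est = triangulate(P1, project(P1, X_true), P2, project(P2, X_true))
print("recovered point:", X_est)
```

With noise-free observations the point is recovered up to rounding; with real measurements the two rays generally do not intersect, and the least-squares solution is only an approximation.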

Some people call this camera calibration, but many restrict the term camera calibration to the estimation of internal or intrinsic parameters only.

Algorithms

There are many different approaches to calculating the intrinsic and extrinsic parameters for a specific camera setup.

Direct linear transformation (DLT) method
Roger Y. Tsai's algorithm, a classical approach. It is a two-stage algorithm that calculates the pose (3D orientation, and x-axis and y-axis translation) in the first stage. In the second stage it computes the focal length, distortion coefficients and the z-axis translation.
Zhengyou Zhang's "A flexible new technique for camera calibration", based on a planar chessboard. It relies on constraints on the homography.

Zhang's method

Zhang's camera calibration method^[2] employs abstract concepts like the image of the absolute conic and circular points.

Derivation

Assume we have a homography ${\textbf {H}}$ that maps points $x_{\pi }$ on a "probe plane" $\pi$ to points $x$ on the image.

The circular points $I,J=[1\,\pm j\,0]^{\top }$ lie on both our probe plane $\pi$ and on the absolute conic $\Omega _{\infty }$ . Lying on $\Omega _{\infty }$ of course means they are also projected onto the image of the absolute conic (IAC) $\omega$ , thus $x_{1}^{\top }\omega x_{1}=0$ and $x_{2}^{\top }\omega x_{2}=0$ . The circular points project as

{\begin{aligned}x_{1}&={\textbf {H}}I={\begin{bmatrix}h_{1}&h_{2}&h_{3}\end{bmatrix}}{\begin{bmatrix}1\\j\\0\end{bmatrix}}=h_{1}+jh_{2}\\x_{2}&={\textbf {H}}J={\begin{bmatrix}h_{1}&h_{2}&h_{3}\end{bmatrix}}{\begin{bmatrix}1\\-j\\0\end{bmatrix}}=h_{1}-jh_{2}\end{aligned}}


We can actually ignore $x_{2}$, since it is the complex conjugate of $x_{1}$ and yields the same constraints, and substitute our new expression for $x_{1}$ as follows:

{\begin{aligned}x_{1}^{\top }\omega x_{1}&=\left(h_{1}+jh_{2}\right)^{\top }\omega \left(h_{1}+jh_{2}\right)\\&=\left(h_{1}^{\top }+jh_{2}^{\top }\right)\omega \left(h_{1}+jh_{2}\right)\\&=h_{1}^{\top }\omega h_{1}-h_{2}^{\top }\omega h_{2}+j\left(h_{1}^{\top }\omega h_{2}+h_{2}^{\top }\omega h_{1}\right)\\&=0\end{aligned}}

which, when separating real and imaginary parts, gives us

{\begin{aligned}h_{1}^{\top }\omega h_{1}-h_{2}^{\top }\omega h_{2}&=0\\h_{1}^{\top }\omega h_{2}+h_{2}^{\top }\omega h_{1}&=0\end{aligned}}

Since conics are symmetric matrices, $\omega =\omega ^{\top }$ and...
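The constraints that the circular points place on the homography can be checked numerically. The sketch below assumes two standard facts that are not spelled out in the text above: the image of the absolute conic is $\omega =A^{-\top }A^{-1}$, and a plane at $z_{w}=0$ induces the homography ${\textbf {H}}=A{\begin{bmatrix}r_{1}&r_{2}&T\end{bmatrix}}$ up to scale, where $r_{1},r_{2}$ are the first two columns of the rotation. Under those assumptions, $h_{1}^{\top }\omega h_{2}=0$ and $h_{1}^{\top }\omega h_{1}=h_{2}^{\top }\omega h_{2}$. All numeric values and helper names are illustrative.

```python
import math


def mat_vec(M, v):
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]


def mat_mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def inverse_intrinsics(A):
    """Closed-form inverse of an upper-triangular intrinsic matrix."""
    ax, g, u0 = A[0]
    ay, v0 = A[1][1], A[1][2]
    return [
        [1.0 / ax, -g / (ax * ay), (g * v0 - u0 * ay) / (ax * ay)],
        [0.0, 1.0 / ay, -v0 / ay],
        [0.0, 0.0, 1.0],
    ]


def bilinear(x, W, y):
    """Evaluate x^T W y."""
    return sum(x[i] * W[i][j] * y[j] for i in range(3) for j in range(3))


# Assumed intrinsics (with a small skew) and an arbitrary rotation.
A = [[800.0, 0.5, 320.0], [0.0, 780.0, 240.0], [0.0, 0.0, 1.0]]
a, b = 0.3, 0.5
Rz = [[math.cos(a), -math.sin(a), 0.0], [math.sin(a), math.cos(a), 0.0], [0.0, 0.0, 1.0]]
Rx = [[1.0, 0.0, 0.0], [0.0, math.cos(b), -math.sin(b)], [0.0, math.sin(b), math.cos(b)]]
R = mat_mul(Rz, Rx)

# First two columns of H = scale * A [r1 r2 T]; the scale is arbitrary.
scale = 2.5
h1 = [scale * c for c in mat_vec(A, [R[i][0] for i in range(3)])]
h2 = [scale * c for c in mat_vec(A, [R[i][1] for i in range(3)])]

A_inv = inverse_intrinsics(A)
omega = mat_mul(transpose(A_inv), A_inv)

orthogonality = bilinear(h1, omega, h2)
equal_norm = bilinear(h1, omega, h1) - bilinear(h2, omega, h2)
print("h1^T omega h2 =", orthogonality)
print("h1^T omega h1 - h2^T omega h2 =", equal_norm)
```

Both printed values should be zero up to rounding. Each view of the plane gives two such linear constraints on the entries of $\omega$, which is why several views are needed to recover all five intrinsic parameters.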

External links

Camera Calibration Toolbox for Matlab
Zhang's Camera Calibration Method with Software
Camera Calibration - Augmented reality lecture at TU Muenchen, Germany
Tsai's Approach
Camera calibration (using ARToolKit)
A Four-step Camera Calibration Procedure with Implicit Image Correction

References

[1] Richard Hartley and Andrew Zisserman (2003). Multiple View Geometry in Computer Vision. Cambridge University Press. pp. 155–157. ISBN 0-521-54051-8.
[2] Z. Zhang, "A flexible new technique for camera calibration", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 11, pages 1330–1334, 2000.


@@ Line 20: / Line 20: @@
 & 0 & 1\end{bmatrix}</math>
-The intrinsic matrix containing 5 intrinsic parameters. These parameters encompass focal length, [[image sensor size|image format]], and [[principal point]].  The parameters <math>\alpha_{x} = f \cdot m_{x}</math> and <math>\alpha_{y} = f \cdot m_{y}</math> represent focal length in terms of pixels, where <math>m_{x}</math> and <math>m_{y}</math> are the [[scale factor]]s relating pixels to distance.
+The intrinsic matrix containing 5 intrinsic parameters. These parameters encompass [[focal length]], [[image sensor size|image format]], and [[principal point]].  The parameters <math>\alpha_{x} = f \cdot m_{x}</math> and <math>\alpha_{y} = f \cdot m_{y}</math> represent focal length in terms of pixels, where <math>m_{x}</math> and <math>m_{y}</math> are the [[scale factor]]s relating pixels to distance.
 <ref>{{cite book |
 author=Richard Hartley and Andrew Zisserman |