Stereo Vision concepts

1. Camera Parameters

Camera parameters fall into two groups: the intrinsic parameters and the extrinsic parameters.

Calibration techniques typically estimate both the intrinsic and extrinsic parameters by minimizing the reprojection error, i.e. the difference between the observed image points and the projections of the corresponding 3D scene points.

Both groups of parameters are needed to compute the projection of a scene point onto an image point.

The intrinsic parameters depend only on the camera and its lens, so for a fixed lens setting they can be determined a priori.

The extrinsic parameters change whenever the camera moves relative to the scene.

1.1 Intrinsic Parameters

The pinhole camera model is the simplest and most widely used model for digital cameras. It describes the relationship between 3D world points and their projections onto a 2D image plane.

Figure: Pinhole camera

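Considering a 3D point P = (X, Y, Z), and using the properties of similar triangles, the projection of P on the image plane is the point p = (𝑥0, 𝑦0), where 𝑥0 and 𝑦0 are given by

$$x_0 = f\,\frac{X}{Z}, \qquad y_0 = f\,\frac{Y}{Z}$$

with f the distance between the image plane and the camera's center of projection (the focal length).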

A single 3D point has a unique projection in the image plane; however, all 3D points lying on the green line in the figure project to the same pixel (𝑥0, 𝑦0). The inverse mapping is therefore not unique, and depth cannot be recovered from a single image.
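The depth ambiguity can be checked numerically. The following is a minimal sketch, assuming an ideal pinhole camera; the function name `project_pinhole`, the focal length and the sample point are illustrative assumptions, not values from the text.

```python
def project_pinhole(point, f):
    """Project a 3D point (X, Y, Z), given in the camera frame, onto the image plane.

    Assumes an ideal pinhole camera with focal length f and a point with Z > 0.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)


f = 2.0               # illustrative focal length (same units as X, Y, Z)
P = (1.0, 0.5, 4.0)   # illustrative 3D point

# Every point k * P with k > 0 lies on the same ray through the camera center.
points_on_ray = [(k * P[0], k * P[1], k * P[2]) for k in (1.0, 2.0, 5.0)]
projections = [project_pinhole(Q, f) for Q in points_on_ray]

for Q, q in zip(points_on_ray, projections):
    print(Q, "->", q)
```

All three points map to the same image coordinates, so the image point alone does not say which of them was observed.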

Figure: The Central-Projection Model

The image plane is located at a distance f from the camera’s reference point (the center of projection). The camera’s coordinate frame is right-handed, with the z-axis defining the depth.

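The relationships for a pinhole camera model can be expressed in matrix form, using homogeneous coordinates, as follows:

$$Z \begin{bmatrix} x_0 \\ y_0 \\ 1 \end{bmatrix} = k_0 \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}, \qquad k_0 = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix}$$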

where:

- 𝑘0 is the matrix of the intrinsic parameters. The simple pinhole form is an approximation; in its general form, 𝑘0 incorporates five intrinsic parameters (𝑓𝑥, 𝑓𝑦, 𝑠, cx and cy).

- cx and cy are the offsets of the principal point from the image origin. The principal point is the point where the optical axis intersects the image plane, and it typically lies close to the center of the picture.

- the focal length f is split into two parameters, fx and fy, which express it in horizontal and vertical pixel units respectively; they differ when the pixels are not square.

- s is the axis skew, which models the shear that results when the pixel axes are not perfectly perpendicular, i.e. when the pixels are not rectangular.
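To make the role of the five intrinsic parameters concrete, here is a minimal sketch that builds the general intrinsic matrix and projects a camera-frame point to pixel coordinates. The layout of the matrix follows the common convention (fx, s, cx in the first row; fy, cy in the second); the function names and the numeric values are illustrative assumptions, not measurements from a real camera.

```python
def intrinsic_matrix(fx, fy, cx, cy, s=0.0):
    """Return the 3x3 intrinsic matrix as nested lists (row-major)."""
    return [
        [fx, s, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ]


def project_to_pixels(K, point):
    """Project a camera-frame point (X, Y, Z) to pixel coordinates (u, v)."""
    # Homogeneous pixel coordinates: K multiplied by the column vector (X, Y, Z).
    u_h, v_h, w_h = (sum(K[r][c] * point[c] for c in range(3)) for r in range(3))
    # Divide by the third component (equal to Z here) to get pixel coordinates.
    return (u_h / w_h, v_h / w_h)


# Illustrative values for a 640x480 image (assumed).
K = intrinsic_matrix(fx=800.0, fy=800.0, cx=320.0, cy=240.0)

on_axis = project_to_pixels(K, (0.0, 0.0, 2.0))     # lands on the principal point
off_axis = project_to_pixels(K, (0.5, -0.25, 2.0))
print(on_axis, off_axis)
```

A point on the optical axis lands exactly on (cx, cy), which is one way to read the meaning of the principal point. Setting `s` to a non-zero value makes the horizontal pixel coordinate depend on Y as well as X.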

Figure: Skew effect

1.2 Extrinsic Parameters

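The extrinsic parameters map 3D points from an arbitrary (world) coordinate system to the coordinate system of the camera.

They consist of two transformations: a rotation and a translation. These are most conveniently expressed in matrix form:

$$P_c = R\,P_w + t$$

where P_w is a point in the arbitrary coordinate system, P_c is the same point in the camera coordinate system, R is a 3×3 rotation matrix and t is a 3×1 translation vector.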

Figure: Camera Coordinate System Vs Arbitrary Coordinate System

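The extrinsic parameters have 12 entries: nine in the 3×3 rotation matrix and three in the translation vector. Because a rotation matrix is orthonormal, only six of them are independent (three for the rotation and three for the translation).

The complete projection equation combines the intrinsic and extrinsic parameters:

$$\lambda \begin{bmatrix} x_0 \\ y_0 \\ 1 \end{bmatrix} = k_0 \,[\, R \mid t \,] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

where (X, Y, Z) is now expressed in the arbitrary coordinate system and λ is a scale factor equal to the depth of the point in the camera frame.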
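The chain from world coordinates to pixels can be sketched in a few lines. This is an illustrative example only: the helper names `mat_vec` and `world_to_pixel`, the intrinsic values and the two camera poses are assumptions chosen so that the results are easy to check by hand.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def world_to_pixel(K, R, t, P_world):
    """Project a world point to pixel coordinates using intrinsics K and extrinsics (R, t)."""
    # Extrinsic step: world frame -> camera frame (rotate, then translate).
    P_cam = [a + b for a, b in zip(mat_vec(R, P_world), t)]
    # Intrinsic step: camera frame -> homogeneous pixel coordinates.
    u_h, v_h, w_h = mat_vec(K, P_cam)
    return (u_h / w_h, v_h / w_h)


K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]

# Pose 1: no rotation, world origin 2 units in front of the camera.
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel_a = world_to_pixel(K, R_identity, [0.0, 0.0, 2.0], [0.5, 0.0, 0.0])

# Pose 2: 90-degree rotation about the z-axis, world origin 4 units in front.
R_z90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
pixel_b = world_to_pixel(K, R_z90, [0.0, 0.0, 4.0], [1.0, 0.0, 0.0])

print(pixel_a, pixel_b)
```

The same world point produces different pixels when only R and t change, which is why the extrinsic parameters have to be re-estimated when the camera moves while the intrinsic matrix stays fixed.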

1.3 Camera Calibration

Camera calibration is the process of estimating the intrinsic and extrinsic parameters of a camera. Stereo rectification generally relies on these parameters.

Calibration requires several pictures, typically of a known calibration pattern, taken from different distances and angles in order to obtain accurate estimates of the camera parameters.

1.4 Lens Distortion

So far we have assumed that the camera has a perfect lens. Real lenses suffer from several kinds of distortion. The first is radial distortion, which displaces image points along the radius from the image center, by an amount that generally grows toward the edges. This leads to incorrect matches, particularly in the outer areas of the picture.

A typical distortion of 0.5-1% is reported to correspond to a positional error of 1.25 to 2.5 pixels in the picture.
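The text does not name a distortion model. A common choice, assumed here for illustration, is the polynomial radial model in which a normalized image point (x, y) is scaled by 1 + k1·r² + k2·r⁴, with r² = x² + y². The sketch below also shows one reading of the quoted figures: if the percentage is taken as radial displacement relative to the distance from the image center, then 0.5-1% gives 1.25-2.5 pixels for a point 250 pixels from the center. That radius is inferred from the numbers, not stated in the source, and all function names and coefficients are hypothetical.

```python
def radial_distort(x, y, k1, k2=0.0):
    """Apply polynomial radial distortion to normalized image coordinates.

    Assumed model: (x, y) -> (x, y) * (1 + k1 * r^2 + k2 * r^4).
    """
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return (x * factor, y * factor)


def radial_error_pixels(radius_px, distortion_percent):
    """Radial displacement in pixels for a given relative distortion."""
    return radius_px * distortion_percent / 100.0


# A point at the center is unaffected; points further out move more.
center = radial_distort(0.0, 0.0, k1=-0.1)
edge = radial_distort(1.0, 0.0, k1=-0.1)

# Assumed radius of 250 pixels from the image center.
error_low = radial_error_pixels(250.0, 0.5)
error_high = radial_error_pixels(250.0, 1.0)
print(center, edge, error_low, error_high)
```

A negative k1 pulls points toward the center (barrel distortion) and a positive k1 pushes them outward (pincushion distortion).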

1.5 Epipolar Geometry

1.6 Image Rectification

Image rectification is a transformation that projects two or more images onto a common image plane. It is used in stereo vision to simplify the correspondence problem: once the images are rectified, the match for a pixel only has to be searched along the same image row.

Rectification transforms the pictures so that the epipolar lines are aligned horizontally, by applying a homography matrix H (one per image) to every image point.
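Applying a homography to an image point amounts to a 3×3 matrix multiplication in homogeneous coordinates followed by a division by the third component. The sketch below shows only that step; how H is computed from the calibration is not covered, and the function name and the example matrices are illustrative assumptions.

```python
def apply_homography(H, point):
    """Map a pixel (x, y) through a 3x3 homography H (nested lists, row-major)."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ZeroDivisionError("point maps to infinity under this homography")
    return (xh / w, yh / w)


# Illustrative homography that only shifts rows up by 3 pixels.
H_shift = [[1.0, 0.0, 0.0],
           [0.0, 1.0, -3.0],
           [0.0, 0.0, 1.0]]

# Illustrative homography with a projective term in the last row.
H_proj = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.001, 0.0, 1.0]]

shifted = apply_homography(H_shift, (10.0, 23.0))
warped = apply_homography(H_proj, (100.0, 50.0))
print(shifted, warped)
```

In a rectified pair, corresponding points end up with the same row coordinate in both images, so a matcher can restrict its search to one scanline.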

Figure: Rectified Stereo Image

1.7 Stereo Vision Flowchart

Stereo vision adds depth perception, which is important in many applications. Its main advantage is the ability to extract 3D information from digital images.