Prof. Sebastiano Battiato http://www.dmi.unict.it/~battiato/ - battiato@dmi.unict.it

In this practical session we will learn:

- How to calibrate a camera;
- How to undistort images using the intrinsic parameters;
- How to draw 3D objects on images using the extrinsic parameters;
- How to calibrate a stereo system;
- How to rectify a stereo image pair.

The goal of this practical session is to guide the reader through the understanding of theoretical and practical concepts. To guide the understanding, the reader will be asked to answer some questions and do some exercises.

This icon indicates when the reader is asked to answer a question.

This icon indicates when the reader is asked to do an exercise.

We will use the data included in the OpenCV samples, which ship with the OpenCV source code. Specifically, we will need the `left01.jpg -- left14.jpg` and `right01.jpg -- right14.jpg` image series located in the `samples/cpp/` directory. For convenience, you can also download the data here. The images have been acquired using a stereo camera, so that each `leftxx.jpg` - `rightxx.jpg` image pair represents the same scene. To start working with the images, extract the archive in the current directory.

Let's start loading and visualizing two corresponding images:

In [1]:

```
import cv2
im_left = cv2.imread('left01.jpg')
im_right = cv2.imread('right01.jpg')
```

Even though the images are actually grayscale, by default they are loaded as color images:

In [2]:

```
print(im_left.shape)
print(im_right.shape)
```

We can now show the two images as follows:

In [3]:

```
from matplotlib import pyplot as plt
plt.subplot(121)
plt.imshow(im_left[...,::-1])
plt.subplot(122)
plt.imshow(im_right[...,::-1])
plt.show()
```

Question 1.1 The two images have been acquired using a stereo camera. What can we say about the baseline of the stereo camera? Is it small? Is it large?

Question 1.2 Each image of the dataset contains the same pattern (a checkerboard) seen from different points of view. Why is this data convenient for camera calibration? Why does the pattern have to be acquired from different viewpoints?

Calibrating the camera basically means finding the two matrices:

\begin{eqnarray} M_{int}=\left( \begin{array}{ccc} f_x & 0 & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{array} \right), M_{ext}=\left( \begin{array}{cccc} r_{11} & r_{12} & r_{13} & t_{1} \\ r_{21} & r_{22} & r_{23} & t_{2}\\ r_{31} & r_{32} & r_{33} & t_{3}\end{array} \right) \end{eqnarray}
Question 2.1 How many parameters do we need to find? Why do we have two matrices? What is the meaning of each parameter? How can we "use" the two matrices once the camera has been calibrated?
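To make the roles of the two matrices concrete, here is a minimal numpy sketch of how a 3D world point is projected to pixel coordinates (all numeric values below are made up for illustration; they are not the parameters of the camera we will calibrate):

```python
import numpy as np

# Hypothetical intrinsic parameters: focal lengths (fx, fy) and principal point (ox, oy)
M_int = np.array([[800.0,   0.0, 320.0],
                  [  0.0, 800.0, 240.0],
                  [  0.0,   0.0,   1.0]])

# Hypothetical extrinsic parameters: identity rotation, translation of 10 units along Z
M_ext = np.hstack((np.eye(3), np.array([[0.0], [0.0], [10.0]])))

# A 3D world point in homogeneous coordinates (X, Y, Z, 1)
P_world = np.array([1.0, 2.0, 0.0, 1.0])

# Projection: p ~ M_int * M_ext * P_world, then divide by the third coordinate
p = M_int @ M_ext @ P_world
u, v = p[0] / p[2], p[1] / p[2]
print(u, v)  # pixel coordinates of the projected point
```

The extrinsic matrix brings the point into the camera reference frame; the intrinsic matrix then maps it to pixel coordinates.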

To estimate these parameters, we need a set of correspondences between points in the 3D world and their projections on the image plane. The 2D points can be easily found using the cv2.findChessboardCorners function:

In [4]:

```
ret, corners = cv2.findChessboardCorners(im_left, (7,6))
```

We passed `(7,6)` as the `patternSize` parameter to specify the number of **inner** corners in the checkerboard. In practice, we are excluding incomplete rows and columns, as well as the first and last complete rows and columns of the checkerboard. This is done to obtain a more reliable detection.

The function returns a `ret` value which is set to `True` if the complete checkerboard was detected and `False` otherwise. The function also returns the list of the coordinates of the detected corners **in the image plane**:

In [5]:

```
print(corners.shape)
```

The `corners` variable contains $42$ vectors of dimension $1\times2$. Each vector represents the coordinates of a detected corner. For example, we can see the values of the first corner as follows:

In [6]:

```
print(corners[0])
```

As can be seen, `corners[0]` is an array containing an array of two values. This explicit shaping is probably derived from the C++ OpenCV API. To handle the `corners` array more easily, we can reshape it as follows:

In [7]:

```
corners = corners.reshape(-1, 2)
print(corners.shape)
print(corners[0])
```

Since `drawChessboardCorners` draws the detected corners directly on the input image, a good idea would be to create a copy for visualization only:

In [8]:

```
im_left_vis=im_left.copy()
cv2.drawChessboardCorners(im_left_vis, (7,6), corners, ret)
plt.imshow(im_left_vis)
plt.show()
```

Note that we passed the `ret` value to notify the `drawChessboardCorners` function whether the checkerboard was completely or only partially detected. Moreover, as can be observed from the image, the $42$ corners are detected from left to right, top to bottom.

Question 2.2 Could we choose another coordinate system? Why?

We will fix the world reference frame on the checkerboard itself, with one unit corresponding to the side of a checkerboard square. Therefore, the X and Y coordinates will vary, while the Z coordinate will be a constant zero. The 3D points will hence be:

```
[0,0,0]
[1,0,0]
[2,0,0]
...
[0,1,0]
[1,1,0]
...
[6,5,0]
```

These coordinates can be conveniently generated using the np.meshgrid function:

In [9]:

```
import numpy as np
x,y=np.meshgrid(range(7),range(6))
print("x:\n", x)
print("y:\n", y)
```

Basically, every pair of values $(x_{ij},y_{ij})$ represents a point in the X-Y space. To build the list of 3D points, we reshape `x` and `y` to obtain column vectors, stack them side by side together with a column of $42$ zeros (the Z coordinates), and finally convert the array into an array of floats:

In [10]:

```
world_points = np.hstack((x.reshape(42,1), y.reshape(42,1), np.zeros((42,1)))).astype(np.float32)
print(world_points)
```

Each row of `world_points` corresponds to a row of `corners`. We can show some of these correspondences:

In [11]:

```
print(corners[0], '->', world_points[0])
print(corners[35], '->', world_points[35])
```

To calibrate the camera, we need to detect the checkerboard in all the available images and collect the 2D-3D correspondences:

In [12]:

```
from glob import glob
_3d_points=[]
_2d_points=[]

img_paths=glob('*.jpg') #get paths of all images

for path in img_paths:
    im=cv2.imread(path)
    ret, corners = cv2.findChessboardCorners(im, (7,6))

    if ret: #add points only if checkerboard was correctly detected
        _2d_points.append(corners) #append current 2D points
        _3d_points.append(world_points) #3D points are always the same
```

To calibrate the camera, we can use the cv2.calibrateCamera function:

In [13]:

```
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(_3d_points, _2d_points, (im.shape[1], im.shape[0]), None, None)
print("Ret:", ret)
print("Mtx:", mtx, " -> [", mtx.shape, "]")
print("Dist:", dist, " -> [", dist.shape, "]")
print("rvecs:", rvecs, " -> [", rvecs[0].shape, "]")
print("tvecs:", tvecs, " -> [", tvecs[0].shape, "]")
```

Note that since the `shape` attribute contains the number of **rows** and **columns**, the two numbers need to be inverted to obtain the image size (rows=height, columns=width), i.e., using `(im.shape[1],im.shape[0])` rather than `(im.shape[0],im.shape[1])`.
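A quick numpy check of this rows/columns convention (the image here is a hypothetical blank 640x480 frame, used only for illustration):

```python
import numpy as np

# A hypothetical 480x640 BGR image: shape is (rows, cols, channels)
im = np.zeros((480, 640, 3), dtype=np.uint8)

# OpenCV functions expect (width, height), so the two values are swapped
size = (im.shape[1], im.shape[0])
print(size)  # (640, 480)
```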

According to the documentation, the function returns the following values:

- `ret`: the mean reprojection error (it should be as close to zero as possible);
- `mtx`: the matrix of intrinsic parameters;
- `dist`: the distortion parameters;
- `rvecs`: the rotation vectors (one per image);
- `tvecs`: the translation vectors (one per image).
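As a sketch of how a mean reprojection error like `ret` can be computed, here is the RMS distance between a set of made-up detected corners and their reprojections (the arrays below are invented for illustration only):

```python
import numpy as np

# Made-up "detected" and "reprojected" corner coordinates, for illustration only
detected    = np.array([[100.0, 50.0], [200.0, 80.0], [300.0, 110.0]])
reprojected = np.array([[101.0, 50.0], [200.0, 82.0], [298.0, 110.0]])

# RMS reprojection error: root mean square of per-point Euclidean distances
errors = np.linalg.norm(detected - reprojected, axis=1)
rms = np.sqrt(np.mean(errors**2))
print(rms)
```

A large value indicates that the estimated parameters do not explain the observed corners well.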

Notes:

- Distortion parameters allow us to define a model to explicitly remove distortion from images (see http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_calib3d/py_calibration/py_calibration.html#calibration);
- The rotation vector is a convenient way to represent a $3\times3$ rotation matrix (which has only 3 Degrees Of Freedom) and can be easily converted back to a matrix using the cv2.Rodrigues function.
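As an illustration of the second note, the Rodrigues formula can be sketched in plain numpy (a stand-in for cv2.Rodrigues written out for clarity; the function name is our own):

```python
import numpy as np

def rodrigues_to_matrix(rvec):
    """Convert a 3-element rotation vector to a 3x3 rotation matrix.
    The direction of rvec is the rotation axis, its norm the angle."""
    rvec = np.asarray(rvec, dtype=float).reshape(3)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)  # zero vector -> identity rotation
    k = rvec / theta  # unit rotation axis
    # Cross-product (skew-symmetric) matrix of the axis
    K = np.array([[    0, -k[2],  k[1]],
                  [ k[2],     0, -k[0]],
                  [-k[1],  k[0],     0]])
    # Rodrigues formula: R = I + sin(theta) K + (1 - cos(theta)) K^2
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

# Example: a rotation of pi/2 around the Z axis maps (1,0,0) to (0,1,0)
R = rodrigues_to_matrix([0, 0, np.pi / 2])
print(R @ np.array([1.0, 0.0, 0.0]))
```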

Exercise 2.1 The output of cv2.calibrateCamera allows us to obtain both the matrix of intrinsic and the matrix of extrinsic parameters. Define two variables `Mint` and `Mext` containing the two matrices and use them to project a point in the 3D space (i.e., a row of `world_points`) to the image plane. Is the result close to the "ground truth" value contained in `corners`?

While cameras are designed to adhere to the pinhole camera model, in practice real cameras tend to deviate from it. The effect of such deviation is that some parts of the image look "distorted", e.g., lines which we know should be straight do not look straight. For instance, let's visualize image `left12.jpg`:

In [14]:

```
plt.imshow(cv2.imread('left12.jpg')[...,::-1])
plt.show()
```

Images tend to be affected by two different "kinds" of distortion:

- Radial distortion: lines far from the principal point look distorted;
- Tangential distortion: occurring because the lens is not perfectly parallel to the image plane.

**Radial distortion** is modeled using the following relationship between the undistorted coordinates $x_{u}$ and $y_{u}$ and the distorted ones $x$ and $y$:

\begin{equation}
x_{u} = x(1+k_1r^2+k_2r^4+k_3r^6) \\
y_{u} = y(1+k_1r^2+k_2r^4+k_3r^6)
\end{equation}

where $r$ is the distance of the distorted point from the principal point $(o_x,o_y)$:

\begin{equation}
r^2 = (x-o_{x})^2+(y-o_{y})^2
\end{equation}

**Tangential distortion** is modeled using the following relationship:

\begin{equation}
x_u = x+[2p_1 xy+p_2(r^2+2x^2)] \\
y_u = y+[p_1(r^2+2y^2)+2p_2 xy]
\end{equation}

The five distortion coefficients $(k_1, k_2, p_1, p_2, k_3)$ are returned by *cv2.calibrateCamera* in the *dist* variable.
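Putting the two models together, the mapping can be sketched as a small numpy function (the function name and the coordinates, taken relative to the principal point, are our own choices for illustration):

```python
import numpy as np

def undistort_point(x, y, k1, k2, k3, p1, p2):
    """Apply the radial + tangential model above to coordinates (x, y)
    expressed relative to the principal point, returning (x_u, y_u)."""
    r2 = x**2 + y**2  # squared distance from the principal point
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_u = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x**2)
    y_u = y * radial + p1 * (r2 + 2 * y**2) + 2 * p2 * x * y
    return x_u, y_u

# With all coefficients set to zero the mapping is the identity
print(undistort_point(0.3, 0.4, 0, 0, 0, 0, 0))
```

Note how the correction grows with the distance from the principal point, which is why distortion is most visible near the image borders.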

The image can be rectified using the cv2.undistort function:

In [15]:

```
im=cv2.imread('left12.jpg')[...,::-1]
im_undistorted=cv2.undistort(im, mtx, dist)
plt.subplot(121)
plt.imshow(im)
plt.subplot(122)
plt.imshow(im_undistorted)
plt.show()
```

Question 2.3 Why is the rectification process important? What is the advantage of working on undistorted images?

Question 2.4 Suppose that a new image is acquired using the same camera we just calibrated. What steps are needed to rectify the image? Do we need the image to contain a checkerboard?

Now that the camera is calibrated for both the extrinsic and intrinsic parameters, we can project points from the 3D world to the 2D image plane. This can be used, for instance, to implement "augmented reality" algorithms which draw 3D objects on the image. Let's see how to draw a cube on the checkerboard. First, define the $8$ corners of a cube of side 3:

In [16]:

```
_3d_corners = np.float32([[0,0,0], [0,3,0], [3,3,0], [3,0,0],
[0,0,-3],[0,3,-3],[3,3,-3],[3,0,-3]])
```

We can now project the $8$ corners to the image plane of one of the calibration images, using the corresponding rotation and translation vectors and the cv2.projectPoints function:

In [17]:

```
image_index=7
cube_corners_2d,_ = cv2.projectPoints(_3d_corners, rvecs[image_index], tvecs[image_index], mtx, dist)
#the underscore allows us to discard the second output parameter (see doc)
print(cube_corners_2d.shape) #the output consists of 8 2-dimensional points
```

We can now draw lines on the image using the cv2.line function:

In [18]:

```
img=cv2.imread(img_paths[image_index]) #load the correct image
red=(0,0,255) #red (in BGR)
blue=(255,0,0) #blue (in BGR)
green=(0,255,0) #green (in BGR)
line_width=5

#cv2.line expects integer pixel coordinates
pts = cube_corners_2d.reshape(-1,2).astype(int)

#first draw the base in red
cv2.line(img, tuple(pts[0]), tuple(pts[1]), red, line_width)
cv2.line(img, tuple(pts[1]), tuple(pts[2]), red, line_width)
cv2.line(img, tuple(pts[2]), tuple(pts[3]), red, line_width)
cv2.line(img, tuple(pts[3]), tuple(pts[0]), red, line_width)

#now draw the pillars in blue
cv2.line(img, tuple(pts[0]), tuple(pts[4]), blue, line_width)
cv2.line(img, tuple(pts[1]), tuple(pts[5]), blue, line_width)
cv2.line(img, tuple(pts[2]), tuple(pts[6]), blue, line_width)
cv2.line(img, tuple(pts[3]), tuple(pts[7]), blue, line_width)

#finally draw the top in green
cv2.line(img, tuple(pts[4]), tuple(pts[5]), green, line_width)
cv2.line(img, tuple(pts[5]), tuple(pts[6]), green, line_width)
cv2.line(img, tuple(pts[6]), tuple(pts[7]), green, line_width)
cv2.line(img, tuple(pts[7]), tuple(pts[4]), green, line_width)

plt.imshow(img[...,::-1])
plt.show()
```