Derivation of Calibration Matrix¶
This is a summary of the lesson on the calibration matrix from the Udacity course Introduction to Computer Vision (https://www.udacity.com/course/introduction-to-computer-vision--ud810).
The geometric relationship between the camera coordinate system and the coordinate system of the image plane gives rise to the following equation (the intrinsic relation):
$$\vec{p'} = K ~^C\vec{p}$$
The geometric relationship between the world coordinate system and the camera coordinate system gives rise to the following equation (the extrinsic relation):
$$~^C\vec{p} = \big ( ~^C_W R \quad ~^C_W\vec{t} \big ) ~^W\vec{p}$$
Combining the extrinsic and intrinsic relations, we have:
$$\vec{p'} = M ~^W\vec{p}$$
The calibration matrix $M$ relates the coordinates of a point in the world system to the coordinates of its projection on the image plane. $M$ is a $3 \times 4$ matrix, so it has 12 entries. Writing the rows of $M$ as $m_1^T, m_2^T, m_3^T$, each pair of corresponding points $(u, v) \leftrightarrow ~^W\vec{p}$ imposes two constraint equations on the entries of $M$:
$$m_1^T ~^W\vec{p} - u \, m_3^T ~^W\vec{p} = 0, \qquad m_2^T ~^W\vec{p} - v \, m_3^T ~^W\vec{p} = 0$$
Since the entries of $M$ only matter up to a multiplicative scalar, there are effectively 11 unknowns, so we need at least 6 pairs of corresponding points to estimate $M$. In practice, the more correspondences we have, the more precisely $M$ can be estimated. Solving for $M$ amounts to solving the following homogeneous system of equations:
$$Ax = 0$$
which has a non-trivial exact solution only when $A$ does not have full column rank. With noisy measurements $A$ generally has full column rank, so instead we minimize $\|Ax\|$ subject to $\|x\| = 1$; the minimizer is any unit eigenvector associated with the smallest eigenvalue of $A^T A$, which is also the right singular vector of $A$ associated with its smallest singular value.
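This null-space recipe can be sanity-checked on a small synthetic system. Everything below (the matrix `A`, the vector `x_true`) is made up for illustration, not taken from the problem set:

```python
import numpy as np

# A toy homogeneous system: every row of A is constructed to be orthogonal
# to x_true, so A is rank-deficient and Ax = 0 has a non-trivial solution.
x_true = np.array([1.0, -2.0, 3.0])
x_true /= np.linalg.norm(x_true)
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 3))
A = B - np.outer(B @ x_true, x_true)  # project rows onto x_true's orthogonal complement

# The right singular vector with the smallest singular value spans the null space.
U, s, Vh = np.linalg.svd(A)
x = Vh[-1]

print(np.linalg.norm(A @ x))        # ~0: x solves Ax = 0
print(abs(abs(x @ x_true) - 1.0))   # ~0: x equals x_true up to sign
```

The same idea applies when the system is only approximately solvable: the last right singular vector minimizes $\|Ax\|$ over all unit vectors.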
Inferring Camera Position via the Calibration Matrix¶
import numpy as np
point_2d = []
point_3d = []
# Read the normalized 2D image points and 3D world points
with open('/home/andy/Desktop/computer_vision/ps3/input/pts2d-norm-pic_a.txt') as file:
    for line in file:
        point_2d.append([float(item) for item in line.split()])
with open('/home/andy/Desktop/computer_vision/ps3/input/pts3d-norm.txt') as file:
    for line in file:
        point_3d.append([float(item) for item in line.split()])

# Construct the A matrix: each correspondence contributes two rows
A = np.zeros((2 * len(point_2d), 12))
for i in range(len(point_2d)):
    temp = np.asarray(point_3d[i] + [1])  # homogeneous 3D point (X, Y, Z, 1)
    u, v = point_2d[i]
    A[2*i]     = np.concatenate([temp, np.zeros(4), -u * temp])
    A[2*i + 1] = np.concatenate([np.zeros(4), temp, -v * temp])
# Find a non-trivial least-squares solution of Am = 0:
# the right singular vector of A with the smallest singular value
U, s, Vh = np.linalg.svd(A, full_matrices=True)
M = np.reshape(Vh[-1], (3, 4))
# Infer the position of the camera from M = (Q | m4): C = -Q^{-1} m4
C = -np.dot(np.linalg.inv(M[:, 0:3]), M[:, -1])
print('Camera Position:', C)
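The script above depends on local problem-set files, but the same pipeline can be verified end to end on synthetic data where the true camera is known. Everything below (the intrinsics `K`, the rotation, the center `C_true`) is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Ground-truth camera: M = K (R | -R C) with a known center C_true
C_true = np.array([1.0, -2.0, 5.0])
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
theta = 0.1  # small rotation about the z-axis
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
M_true = K @ np.hstack([R, (-R @ C_true)[:, None]])

# Project 20 random 3D points placed in front of the camera
pts_3d = rng.uniform(-1.0, 1.0, (20, 3)) + np.array([0.0, 0.0, 20.0])
proj = (M_true @ np.hstack([pts_3d, np.ones((20, 1))]).T).T
pts_2d = proj[:, :2] / proj[:, 2:]

# Build A and solve Am = 0 exactly as in the code above
A = np.zeros((2 * len(pts_2d), 12))
for i, (X, x) in enumerate(zip(pts_3d, pts_2d)):
    Xh = np.append(X, 1.0)
    A[2*i]     = np.concatenate([Xh, np.zeros(4), -x[0] * Xh])
    A[2*i + 1] = np.concatenate([np.zeros(4), Xh, -x[1] * Xh])
M = np.linalg.svd(A)[2][-1].reshape(3, 4)

# The recovered center matches the ground truth (M is only known up to scale,
# but C = -Q^{-1} m4 is invariant to that scale)
C = -np.linalg.inv(M[:, :3]) @ M[:, -1]
print(C)  # close to [1, -2, 5]
```

With noise-free correspondences the recovery is essentially exact; with real measurements, more correspondences and normalized coordinates (as in the problem-set files) keep the estimate stable.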