Characterization of Harris Corner¶
A good way to determine whether there is a corner in the vicinity of $(x,y)$ is to measure how much the intensity of the image changes around $(x,y)$. When we shift an image patch in the horizontal and/or vertical direction, we expect the change in intensity to be large at a corner and small elsewhere. This motivates the definition of $E(u,v)$, a measure of how the intensity of an image patch around $(x,y)$ changes when we shift it by $(u,v)$:
$$ E(u,v) = \sum_{x,y} w(x,y) \, \big[ I(x+u,y+v) - I(x,y)\big]^2 $$Assuming that we are only interested in small shifts around $(x,y)$, $E(u,v)$ can be approximated using a Taylor expansion up to second order.
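To see where the approximation comes from, substitute the first-order expansion $I(x+u,y+v) \approx I(x,y) + I_x u + I_y v$ into the sum:
$$E(u,v) \approx \sum_{x,y} w(x,y)\,\big(I_x u + I_y v\big)^2 = \sum_{x,y} w(x,y)\,\big(I_x^2 u^2 + 2 I_x I_y \, uv + I_y^2 v^2\big)$$Collecting the coefficients of $u^2$, $uv$, and $v^2$ into a matrix gives the compact form: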
$$E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$in which
$$M = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$If there is a corner at $(x,y)$, we expect $E(u,v)$ to change a lot even for small shifts in $u$ and/or $v$. This behavior can be characterized by the two eigenvalues of $M$. Specifically, when both eigenvalues of $M$ are large and of comparable magnitude, there is likely a corner at $(x,y)$. When one eigenvalue is much bigger than the other, it is likely an edge. When both are small, it is likely a flat region. These three cases can be succinctly described by the response function:
$$R = \lambda_1\lambda_2 - \alpha (\lambda_1 + \lambda_2)^2$$Since $\lambda_1\lambda_2 = \det M$ and $\lambda_1 + \lambda_2 = \operatorname{tr} M$, this is equivalent to $R = \det M - \alpha(\operatorname{tr} M)^2$, which is the form computed in the code below: $R$ is large and positive at corners, negative at edges, and small in flat regions.
An implementation of Harris detector¶
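Before walking through the full detector, a quick toy sanity check (purely illustrative, not part of the detector itself) confirms the sign pattern of $R$ for hand-built structure tensors:
import numpy as np

def harris_response(M, alpha=0.06):
    # R = det(M) - alpha * trace(M)^2
    #   = lambda1 * lambda2 - alpha * (lambda1 + lambda2)^2
    return np.linalg.det(M) - alpha * np.trace(M) ** 2

corner = np.diag([100.0, 100.0])  # two large eigenvalues -> R large, positive
edge = np.diag([100.0, 1.0])      # one dominant eigenvalue -> R negative
flat = np.diag([1.0, 1.0])        # two small eigenvalues -> R near zero

for name, M in [("corner", corner), ("edge", edge), ("flat", flat)]:
    print(name, harris_response(M))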
import numpy as np
from matplotlib import pyplot as plt
import cv2
def compute_harris_descriptors(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # cv2.imread loads BGR
    sift = cv2.SIFT_create()  # use cv2.xfeatures2d.SIFT_create() on OpenCV < 4.4
    window = 5
    alpha = 0.06
    kps = []
    thresh = 1e13  # response threshold for the CV_32F Sobel responses
    # compute Harris response R = det(M) - alpha * trace(M)^2
    I_x = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=5)
    I_y = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=5)
    # Gaussian-weighted sums of derivative products: the entries of M
    I_xx = cv2.GaussianBlur(I_x * I_x, (window, window), 1)
    I_yy = cv2.GaussianBlur(I_y * I_y, (window, window), 1)
    I_xy = cv2.GaussianBlur(I_x * I_y, (window, window), 1)
    M_det = I_xx * I_yy - I_xy ** 2
    M_tra = I_xx + I_yy
    resp = M_det - alpha * np.square(M_tra)
    # keep thresholded responses that are local maxima in their
    # window x window neighborhood (non-maximum suppression)
    I_angle = np.arctan2(I_y, I_x)
    half = window // 2
    for row in range(resp.shape[0]):
        for col in range(resp.shape[1]):
            if resp[row, col] > thresh:
                is_max = True
                for h in range(-half, half + 1):
                    for v in range(-half, half + 1):
                        new_h = row + h
                        new_v = col + v
                        if 0 <= new_h < resp.shape[0] and 0 <= new_v < resp.shape[1]:
                            if resp[row, col] < resp[new_h, new_v]:
                                is_max = False
                                break
                    if not is_max:
                        break
                if is_max:
                    # KeyPoint expects its angle in degrees
                    angle = float(np.degrees(I_angle[row, col]))
                    kps.append(cv2.KeyPoint(x=float(col), y=float(row), size=3, angle=angle))
    # compute SIFT descriptors at the Harris keypoints
    kps, descriptors = sift.compute(gray, kps)
    return kps, descriptors
Match two images using Harris detectors¶
plt.figure(figsize = (10, 8))
# find keypoints and associated descriptors
image_a = cv2.imread('images/simA.jpg')
kp_a, d_a = compute_harris_descriptors(image_a)
#cv2.drawKeypoints(image_a, kp_a, image_a)
#plt.subplot(121), plt.imshow(image_a, cmap='gray')
image_b = cv2.imread('images/simB.jpg')
kp_b, d_b = compute_harris_descriptors(image_b)
#cv2.drawKeypoints(image_b, kp_b, image_b)
#plt.subplot(122), plt.imshow(image_b, cmap='gray')
# find matches and draw corresponding lines
bfm = cv2.BFMatcher()  # defaults to the L2 norm, which suits SIFT descriptors
matches = bfm.match(d_a, d_b)
matches = sorted(matches, key=lambda x: x.distance)
image_result = cv2.drawMatches(image_a, kp_a, image_b, kp_b, matches, None, flags=2)
# convert BGR (OpenCV) to RGB (matplotlib) so colors display correctly
plt.subplot(212), plt.imshow(cv2.cvtColor(image_result, cv2.COLOR_BGR2RGB))
plt.show()
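As an optional refinement (not used below), Lowe's ratio test is a common way to discard ambiguous SIFT matches before any geometric verification; a minimal sketch with the commonly used 0.75 ratio:
# keep a match only if it is clearly better than the second-best candidate
raw_matches = bfm.knnMatch(d_a, d_b, k=2)
good_matches = [m for m, n in raw_matches if m.distance < 0.75 * n.distance]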
RANSAC Algorithm to Remove Bad Matches¶
It can be seen that the raw matches produced by the SIFT descriptors are rather good, but there are still a few bad matches here and there. The aim of RANSAC is to get rid of these bad matches. The basic idea of RANSAC is that good matches will produce a transform with which many other matches agree. In this case, we assume that there is a homography that transforms one image to the other (a more general choice would be the fundamental matrix). The homography supported by the most matches is most likely to be the actual transform.
The number of iterations $N$ for which the probability that every sample contains an outlier is less than $1-p$ can be shown to satisfy:
$$N > \frac{\log(1-p)}{\log(1-(1-e)^s)}$$in which $e$ is the fraction of outliers and $s$ is the number of points needed to construct the model.
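For example, with $p = 0.95$, $s = 4$ (the minimal number of correspondences for a homography), and an outlier fraction $e = 0.5$, we get $N > \log(0.05)/\log(1 - 0.5^4) \approx 46.4$, so $47$ iterations suffice. The implementation below starts from a pessimistic $e$ and tightens it (and hence $N$) whenever a better consensus set is found.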
An implementation of RANSAC¶
N = np.inf  # iteration bound; tightened adaptively below
e = 1.0     # pessimistic initial outlier ratio
s = 4       # minimal number of correspondences to fit a homography
p = 0.95
sample_count = 0
max_inliers_cnt = -1
max_inliers_set = []
while N > sample_count:
    # choose a minimal sample of s distinct matches
    sample = np.random.choice(matches, size=s, replace=False)
    sample_count += 1
    # fit a homography to the sample
    src_points = np.asarray([kp_a[match.queryIdx].pt for match in sample])
    dst_points = np.asarray([kp_b[match.trainIdx].pt for match in sample])
    homo, status = cv2.findHomography(src_points, dst_points)
    if homo is None:  # degenerate sample (e.g. collinear points)
        continue
    # count the matches that agree with this homography
    inliers_cnt = 0
    inliers_set = []
    for match in matches:
        src_point = kp_a[match.queryIdx].pt
        dst_point = kp_b[match.trainIdx].pt
        est_point = np.dot(homo, np.append(src_point, 1))
        est_point = est_point[:2] / est_point[2]
        distance = (dst_point - est_point) ** 2
        if distance.sum() < 4:  # squared reprojection error below 4, i.e. 2 px
            inliers_cnt += 1
            inliers_set.append(match)
    # keep the best consensus set; update e (and hence N) if e_0 < e
    if inliers_cnt > max_inliers_cnt:
        max_inliers_cnt = inliers_cnt
        max_inliers_set = inliers_set
        e_0 = 1 - float(inliers_cnt) / len(matches)
        if e_0 < e:
            e = e_0
            # if e reaches 0, every match is an inlier and we can stop
            N = 0 if e == 0 else int(np.log(1 - p) / np.log(1 - (1 - e) ** s))
# refit the homography on the full consensus set
src_points = np.asarray([kp_a[match.queryIdx].pt for match in max_inliers_set])
dst_points = np.asarray([kp_b[match.trainIdx].pt for match in max_inliers_set])
homo, status = cv2.findHomography(src_points, dst_points)
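For reference, OpenCV can do this outlier rejection internally: handing all putative matches to findHomography with the cv2.RANSAC flag returns both a homography and an inlier mask. A minimal sketch (the 2.0 px reprojection threshold mirrors the squared-distance threshold of 4 used above):
all_src = np.asarray([kp_a[m.queryIdx].pt for m in matches])
all_dst = np.asarray([kp_b[m.trainIdx].pt for m in matches])
homo_cv, inlier_mask = cv2.findHomography(all_src, all_dst, cv2.RANSAC, 2.0)
# inlier_mask marks, for each match, whether it survived RANSAC
cv_inliers = [m for m, keep in zip(matches, inlier_mask.ravel()) if keep]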
Remove all bad matches using RANSAC¶
plt.figure(figsize=(10, 10))
# draw inlier lines of correspondence
image_result = cv2.drawMatches(image_a, kp_a, image_b, kp_b, max_inliers_set, None, flags=2)
plt.subplot(211), plt.imshow(cv2.cvtColor(image_result, cv2.COLOR_BGR2RGB))
# transform image_b into image_a's frame on a 740 x 530 canvas
warp_image = cv2.warpPerspective(image_b, np.linalg.inv(homo), (740, 530))
# overlay image_a onto the warped image_b to register the pair
warp_image[:image_a.shape[0], :image_a.shape[1]] = image_a
plt.subplot(212), plt.imshow(cv2.cvtColor(warp_image, cv2.COLOR_BGR2RGB))
plt.show()