Corner Detection

Characterization of Harris Corners¶

A good way to determine whether there is a corner in the vicinity of $(x,y)$ is to measure how much the intensity of the image changes around $(x,y)$. When we shift an image patch in the horizontal and/or vertical direction, we expect the change in intensity to be large at a corner and small elsewhere. This motivates the definition of $E(u,v)$, a measure of how the intensity of the image patch around $(x,y)$ changes when it is shifted by $(u,v)$:

$$ E(u,v) = \sum_{x,y} w(x,y) \, \big[ I(x+u,y+v) - I(x,y)\big]^2 $$
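For intuition, $E(u,v)$ can also be evaluated directly by shifting a patch and summing the squared intensity differences. The sketch below is a minimal illustration, assuming a grayscale image stored as a NumPy array and a uniform window $w(x,y)=1$; the function name and its parameters are made up:

import numpy as np

def intensity_change(img, x, y, u, v, half=2):
    # Sum of squared differences between the (2*half+1) x (2*half+1) window
    # centered at (x, y) and the same window shifted by (u, v).
    # Assumes the window and its shifted copy stay inside the image.
    img = img.astype(np.float64)
    original = img[y - half:y + half + 1, x - half:x + half + 1]
    shifted  = img[y + v - half:y + v + half + 1, x + u - half:x + u + half + 1]
    return np.sum((shifted - original) ** 2)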

Assuming that we are only interested in small shifts around $(x,y)$, $E(u,v)$ can be approximated by its Taylor expansion up to second order. The result is:

$$E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$

in which

$$M = \sum_{x,y} w(x,y)\begin{bmatrix} I_x^2 & I_x I_y\\ I_x I_y & I_y^2 \end{bmatrix}$$
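This follows from the first-order expansion of the shifted intensity, $I(x+u,y+v) \approx I(x,y) + I_x u + I_y v$, where $I_x$ and $I_y$ denote the image derivatives:

$$E(u,v) \approx \sum_{x,y} w(x,y)\,\big(I_x u + I_y v\big)^2 = \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$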

If there is a corner at $(x,y)$, we expect $E(u,v)$ to be large even for small shifts $u$ and/or $v$, regardless of the direction of the shift. This behavior can be characterized by the two eigenvalues of $M$. Specifically, when both eigenvalues of $M$ are large, there is likely a corner at $(x,y)$. When one eigenvalue is much bigger than the other, it is likely an edge. When both of them are small, it is likely just a flat region. These three cases can be succinctly described by the response function:

$$R = \lambda_1\lambda_2 - \alpha (\lambda_1 + \lambda_2)^2$$
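Since $\det(M) = \lambda_1 \lambda_2$ and $\operatorname{trace}(M) = \lambda_1 + \lambda_2$, the response can be computed as $R = \det(M) - \alpha\,\operatorname{trace}(M)^2$ without an explicit eigendecomposition, which is what the implementation below does. $R$ is large and positive at corners, negative at edges, and small in flat regions. As a quick check of the equivalence (a minimal sketch with made-up matrix values):

import numpy as np

alpha = 0.06
M = np.array([[8.0, 2.0],
              [2.0, 6.0]])              # an example structure tensor

lam1, lam2 = np.linalg.eigvalsh(M)      # eigenvalues of the symmetric matrix M
R_eig = lam1 * lam2 - alpha * (lam1 + lam2) ** 2
R_det = np.linalg.det(M) - alpha * np.trace(M) ** 2
print(R_eig, R_det)                     # both give the same response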

An implementation of the Harris detector¶

import numpy as np
from matplotlib import pyplot as plt
import cv2

def compute_harris_descriptors(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)   # cv2.imread loads images in BGR order
    sift = cv2.xfeatures2d.SIFT_create()
    window = 5                  # window size for smoothing and non-maximum suppression
    alpha  = 0.06               # Harris sensitivity parameter
    kps    = []
    thresh = 10000000000000     # response threshold (float gradient responses are large)
    
    # compute Harris response
    I_x = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=5)
    I_y = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=5)
    I_xx = cv2.GaussianBlur(I_x*I_x, (window,window), 1)   # entries of M, weighted by a Gaussian window w(x,y)
    I_yy = cv2.GaussianBlur(I_y*I_y, (window,window), 1)
    I_xy = cv2.GaussianBlur(I_x*I_y, (window,window), 1)
    M_det = I_xx*I_yy - I_xy**2      # det(M) at every pixel
    M_tra = I_xx + I_yy              # trace(M) at every pixel
    
    resp = M_det - alpha*np.square(M_tra)
    
    # compute Harris corners
    I_angle = np.degrees(np.arctan2(I_y, I_x)) % 360   # gradient orientation in degrees (cv2.KeyPoint expects degrees)
    
    for row in range(resp.shape[0]):
        for col in range(resp.shape[1]):
            if resp[row,col] > thresh:
                is_max = True

                for h in range(-(window//2), window//2 + 1):
                    for v in range(-(window//2), window//2 + 1):
                        if is_max:
                            new_h = row + h
                            new_v = col + v

                            if 0 <= new_h < image.shape[0] and 0 <= new_v < image.shape[1]:
                                if resp[row,col] < resp[new_h,new_v]:
                                    is_max = False
                        else:
                            break

                if is_max:
                    point = cv2.KeyPoint(x=col, y=row, _size=3, _angle=I_angle[row,col])
                    kps.append(point)
    
    # compute SIFT descriptors
    kps, descriptors = sift.compute(image, kps)
    
    return kps, descriptors

Match two images using the Harris detector¶

plt.figure(figsize = (10, 8))

# find keypoints and associated descriptors
image_a = cv2.imread('images/simA.jpg')
kp_a, d_a  = compute_harris_descriptors(image_a)
#cv2.drawKeypoints(image_a, kp_a, image_a)
#plt.subplot(121), plt.imshow(image_a, cmap='gray')

image_b = cv2.imread('images/simB.jpg')
kp_b, d_b  = compute_harris_descriptors(image_b)
#cv2.drawKeypoints(image_b, kp_b, image_b)
#plt.subplot(122), plt.imshow(image_b, cmap='gray')

# find matches and draw correspondent lines
bfm = cv2.BFMatcher()

matches = bfm.match(d_a, d_b)
matches = sorted(matches, key = lambda x:x.distance)

image_result = np.zeros((0,0))
image_result = cv2.drawMatches(image_a, kp_a, image_b, kp_b, matches, image_result, flags=2)
plt.subplot(212), plt.imshow(image_result, cmap='gray')

plt.show()

RANSAC Algorithm to Remove Bad Matches¶

It can be seen that the raw matches produced by the SIFT descriptors are rather good, but there are still a few bad matches here and there. The aim of RANSAC is to get rid of these bad matches. The basic idea of RANSAC is that good matches will produce a transform with which many other matches agree. In this case, we assume that there is a homography that transforms one image into the other (a more accurate choice would be to use the fundamental matrix instead). The homography supported by the most matches is most likely the actual transform.
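Concretely, a match between $\mathbf{x} = (x, y)$ in the first image and $\mathbf{x}' = (x', y')$ in the second agrees with a homography $H$ when mapping $\mathbf{x}$ through $H$ in homogeneous coordinates lands close to $\mathbf{x}'$:

$$\begin{bmatrix} \hat{x} \\ \hat{y} \\ \hat{w} \end{bmatrix} = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad \bigg\lVert \Big(\frac{\hat{x}}{\hat{w}}, \frac{\hat{y}}{\hat{w}}\Big) - (x', y') \bigg\rVert < \text{threshold}$$

This is the inlier test applied to every match in the implementation below.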

The number of iterations $N$ required so that the probability of every one of the $N$ samples containing at least one outlier is less than $1-p$ (equivalently, so that at least one outlier-free sample is drawn with probability at least $p$) can be shown to satisfy:

$$N > \frac{\log(1-p)}{\log\big(1-(1-e)^s\big)}$$

in which $e$ is the fraction of outliers and $s$ is the number of points needed to fit the model (four point correspondences for a homography).
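To get a sense of scale, here is the formula evaluated for the parameters used below ($p = 0.95$, $s = 4$) under an assumed outlier ratio of 50%, which gives about 47 iterations (a minimal sketch; the outlier ratio is made up for illustration):

import numpy as np

p = 0.95   # desired probability of drawing at least one outlier-free sample
e = 0.5    # assumed fraction of outliers (illustration only)
s = 4      # point correspondences needed to fit a homography

N = int(np.ceil(np.log(1 - p) / np.log(1 - (1 - e) ** s)))
print(N)   # about 47 iterations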

An implementation of RANSAC¶

N = 100000000000   # iteration bound; starts very large and shrinks as better models are found
e = 1.0            # worst-case assumption for the outlier ratio
s = 4              # minimum number of correspondences needed to fit a homography
p = 0.95           # desired probability of drawing at least one outlier-free sample

sample_count = 0
max_inliers_cnt = -1
max_inliers_set = []

while (N > sample_count):
    # choose a sample
    sample = np.random.choice(matches, size=s, replace=False)   # draw s distinct matches
    
    # count number of inliers
    src_points = np.asarray([kp_a[match.queryIdx].pt for match in sample])
    dst_points = np.asarray([kp_b[match.trainIdx].pt for match in sample])
    
    homo, status = cv2.findHomography(src_points, dst_points)

    if homo is None:           # degenerate sample (e.g. collinear points); try another one
        sample_count += 1
        continue

    inliers_cnt = 0            # the sampled matches are counted again in the loop below
    inliers_set = []
    
    for match in matches:
        src_point = kp_a[match.queryIdx].pt
        dst_point = kp_b[match.trainIdx].pt
        
        est_point = np.dot(homo, np.append(src_point, 1))
        est_point = est_point[:2]/est_point[2]
        
        distance = (dst_point - est_point)**2
        
        if distance.sum() < 4:   # inlier if the squared reprojection error is under 4 (about 2 pixels)
            inliers_cnt += 1
            inliers_set.append(match)
    
    # update N if e_0 < e
    if inliers_cnt > max_inliers_cnt:
        max_inliers_cnt = inliers_cnt
        max_inliers_set = inliers_set
        
        e_0 = 1 - float(inliers_cnt)/len(matches)

        if e_0 < e:
            e = e_0
            N = int(np.log(1 - p)/np.log(1 - (1-e)**s))

    sample_count += 1

src_points = np.asarray([kp_a[match.queryIdx].pt for match in max_inliers_set])
dst_points = np.asarray([kp_b[match.trainIdx].pt for match in max_inliers_set])

homo, status = cv2.findHomography(src_points, dst_points)

Remove all bad matches using RANSAC¶

plt.figure(figsize = (10, 10))

# draw inlier lines of correspondence
image_result = np.zeros((0,0))
image_result = cv2.drawMatches(image_a, kp_a, image_b, kp_b, max_inliers_set, image_result, flags=2)
plt.subplot(211), plt.imshow(image_result, cmap='gray')

# transform and register two images
# homo maps image_a onto image_b, so warp image_b back with the inverse; the canvas size is chosen manually for these images
warp_image = cv2.warpPerspective(image_b, np.linalg.inv(homo), (740, 530))
for row in range(image_a.shape[0]):
    for col in range(image_a.shape[1]):
        warp_image[row,col] = image_a[row,col]
plt.subplot(212), plt.imshow(warp_image, cmap='gray')

plt.show()