[CS5670] Lecture 6: Feature Descriptors and Feature Matching
미남잉 · 2022. 6. 23. 15:57 · Source: CS5670
This part covers the 6th lecture of CS5670, on the topic of Feature Descriptors and Feature Matching. It corresponds to Section 4.1 of the textbook.
If finding local features involves three stages, 1) Detection, 2) Description, and 3) Matching, we are now at the second stage. Following the lecture slides, this part is about extracting a feature vector around each interesting point.
Feature descriptors
Lecture 5 ended on the following slide.
We know how to detect good points
Next question: How to match them?
Answer: Come up with a descriptor for each point, find similar descriptors between the two images
We come up with a descriptor for each point, then find similar descriptors between the two images.
You may be wondering what a descriptor is: a descriptor can be thought of as a set of characteristics that represents a particular object.
This concept is fleshed out below, where the SIFT (Scale Invariant Feature Transform) algorithm is explained.
Lots of possibilities
- Simple option: match square windows around the point
- State of the art approach: SIFT
Invariance vs. discriminability (important ✔)
- Invariance: Descriptor shouldn’t change even if image is transformed
- Discriminability: Descriptor should be highly unique for each point
A descriptor should not change when the image is transformed; this property is called invariance. It should also be highly distinctive for each point; this property is called discriminability.
Image transformation revisited
- Geometric: Rotation, Scale
- Photometric: Intensity change
Invariant descriptors
- We looked at invariant/equivariant detectors
- Most feature descriptors are also designed to be invariant to:
- Translation, 2D rotation, scale
- They can usually also handle
- Limited 3D rotations (SIFT works up to about 60 degrees)
- Limited affine transforms (some are fully affine invariant)
- Limited illumination/contrast changes
How to achieve invariance
Need both of the following:
- Make sure your detector is invariant
- Design an invariant feature descriptor
- Simplest descriptor: a single 0
- What’s this invariant to?
- Next simplest descriptor: a square, axis-aligned 5x5 window of pixels
- What’s this invariant to?
- Let’s look at some better approaches…
Rotation invariance for feature descriptors
- Find dominant orientation of the image patch
- E.g., given by $x_{max}$, the eigenvector of $H$ corresponding to $\lambda_{max}$ (the larger eigenvalue)
- Or (better) simply the orientation of the (smoothed) gradient
- Rotate the patch according to this angle
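The two steps above can be sketched with plain NumPy (a simplified sketch: the slide's smoothed gradient is replaced by averaging the raw gradient vectors over the patch, and `dominant_orientation` is a name made up here):

```python
import numpy as np

def dominant_orientation(patch):
    """Estimate the dominant gradient orientation of a square patch.

    Simplification: instead of a properly smoothed gradient, the raw
    gradient vectors are averaged over the whole patch; the angle of
    that mean gradient (in radians) is the patch orientation.
    """
    gy, gx = np.gradient(patch.astype(float))   # d/dy, d/dx
    return np.arctan2(gy.mean(), gx.mean())

# A patch whose intensity increases along x has orientation 0;
# its transpose increases along y, giving pi/2.
ramp = np.tile(np.arange(9, dtype=float), (9, 1))
theta = dominant_orientation(ramp)              # 0.0
```

One would then rotate the patch by `-theta` (e.g., with `scipy.ndimage.rotate`) before sampling the descriptor, so that all patches are expressed in a canonical orientation.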
Multiscale Oriented PatcheS descriptor
Take 40x40 square window around detected feature
- Scale to 1/5 size (using prefiltering)
- Rotate to horizontal
- Sample 8x8 square window centered at feature
- Intensity-normalize the window by subtracting the mean and dividing by the standard deviation in the window (why? this makes the descriptor invariant to affine intensity changes $I \to aI + b$)
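The steps above can be sketched in NumPy (a simplified sketch of the axis-aligned case that skips the rotation step; `mops_descriptor` is a hypothetical name, and the Gaussian prefilter is approximated by 5x5 block averaging):

```python
import numpy as np

def mops_descriptor(img, x, y):
    """Sketch of a MOPS-style descriptor (no rotation step).

    Steps: take a 40x40 window around (x, y), prefilter + downsample
    by 5 to get an 8x8 patch, then normalize to zero mean / unit
    variance so the descriptor ignores affine intensity changes.
    """
    win = img[y - 20:y + 20, x - 20:x + 20].astype(float)  # 40x40 window
    # Average each 5x5 block (stand-in for Gaussian prefiltering),
    # which simultaneously downsamples 40x40 -> 8x8.
    small = win.reshape(8, 5, 8, 5).mean(axis=(1, 3))
    # Intensity-normalize: subtract mean, divide by std.
    return (small - small.mean()) / (small.std() + 1e-8)

rng = np.random.default_rng(0)
img = rng.random((100, 100))
d = mops_descriptor(img, 50, 50)
```

The normalization answers the slide's "why?": brightening or rescaling the image ($I \to aI + b$) leaves the descriptor unchanged.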
Detections at multiple scales
Scale Invariant Feature Transform
Basic idea:
• Take 16x16 square window around detected feature
• Compute edge orientation (angle of the gradient - 90°) for each pixel
• Throw out weak edges (threshold gradient magnitude)
• Create histogram of surviving edge orientations
SIFT descriptor
Full version
• Divide the 16x16 window into a 4x4 grid of cells (2x2 case shown below)
• Compute an orientation histogram for each cell
• 16 cells * 8 orientations = 128 dimensional descriptor
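The histogram layout above can be sketched in NumPy (a simplified sketch: real SIFT also applies Gaussian weighting, interpolation between bins, and descriptor normalization; here weak edges simply contribute little because votes are magnitude-weighted, and `sift_like_descriptor` is a made-up name):

```python
import numpy as np

def sift_like_descriptor(win):
    """Sketch of the SIFT histogram layout on a 16x16 window.

    Divides the window into a 4x4 grid of 4x4-pixel cells, bins each
    pixel's gradient orientation into 8 bins weighted by gradient
    magnitude, and concatenates: 16 cells * 8 bins = 128 dimensions.
    """
    gy, gx = np.gradient(win.astype(float))
    mag = np.hypot(gx, gy)                         # gradient magnitude
    ang = np.arctan2(gy, gx)                       # orientation in [-pi, pi]
    bins = ((ang + np.pi) / (2 * np.pi) * 8).astype(int) % 8
    desc = np.zeros((4, 4, 8))
    for cy in range(4):
        for cx in range(4):
            cell = np.s_[cy * 4:(cy + 1) * 4, cx * 4:(cx + 1) * 4]
            for b in range(8):
                # Magnitude-weighted vote of every pixel in this cell.
                desc[cy, cx, b] = mag[cell][bins[cell] == b].sum()
    return desc.ravel()                            # 128-dim vector

win = np.random.default_rng(1).random((16, 16))
d = sift_like_descriptor(win)
```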
Properties of SIFT
Extraordinarily robust matching technique
- Can handle changes in viewpoint (up to about 60 degrees of out-of-plane rotation)
- Can handle significant changes in illumination (sometimes even day vs. night (below))
- Pretty fast: hard to make real-time, but can run in <1s for moderate image sizes
- Lots of code available
SIFT Example
Other descriptors
- HOG: Histogram of Oriented Gradients
- Dalal/Triggs
- Sliding window, pedestrian detection
- FREAK: Fast Retina Keypoint
- Perceptually motivated
- Can run in real-time; used in Visual SLAM on-device
- LIFT: Learned Invariant Feature Transform
- Learned via deep learning - along with many other recent features
Summary
- Keypoint detection: repeatable and distinctive
- Corners, blobs, stable regions
- Harris, DoG
- Descriptors: robust and selective
- Spatial histograms of orientation
- SIFT and variants are typically good for stitching and recognition
- But no need to stick to just one
Which features match?
Feature matching
Given a feature in $I_1$, how to find the best match in $I_2$?
- Define distance function that compares two descriptors
- Test all the features in $I_2$, find the one with min distance
Feature distance
How to define the difference between two features $f_1$, $f_2$?
- Simple approach: $L_2$ distance, $\|f_1 - f_2\|$
- can give small distances for ambiguous (incorrect) matches
- Better approach: ratio distance = $\|f_1 - f_2\| / \|f_1 - f_2'\|$
- $f_2$ is the best SSD match to $f_1$ in $I_2$
- $f_2'$ is the 2nd best SSD match to $f_1$ in $I_2$
- gives large values for ambiguous matches
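The ratio test above can be sketched as follows (assumptions: descriptors are rows of a NumPy array, plain $L_2$ distance stands in for SSD since SSD is squared $L_2$ and gives the same ranking, and `ratio_match` is a hypothetical helper):

```python
import numpy as np

def ratio_match(f1, feats2):
    """Nearest-neighbor match with the ratio test (a sketch).

    Returns (index of best match in feats2, ratio distance d1/d2),
    where d1, d2 are distances to the best and second-best
    descriptors. A ratio near 1 flags an ambiguous match.
    """
    d = np.linalg.norm(feats2 - f1, axis=1)   # distance to every f2
    i1, i2 = np.argsort(d)[:2]                # best and 2nd-best
    return i1, d[i1] / d[i2]

# Toy example: f1 is clearly closest to the first descriptor,
# so the ratio is small (unambiguous match).
feats2 = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
idx, ratio = ratio_match(np.array([0.1, 0.0]), feats2)
```

In practice one keeps only matches whose ratio falls below a threshold (Lowe's paper uses 0.8).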
Does the SSD vs. “ratio distance” change the best match to a given feature in image 1? (No: for a fixed feature the nearest neighbor is the same; the ratio only changes the score used for thresholding, i.e., which matches survive.)
Feature matching example
Evaluating the results
How can we measure the performance of a feature matcher?
True/false positives
The distance threshold affects performance
- True positives(TP) = # of detected matches that survive the threshold that are correct
- False positives(FP) = # of detected matches that survive the threshold that are incorrect
- Suppose we want to maximize true positives. How do we set the threshold? (We keep all matches with distance below the threshold, so a high threshold keeps every true match, at the cost of many false ones.)
- Suppose we want to minimize false positives. How do we set the threshold? (A low threshold rejects nearly everything, including most false matches, at the cost of losing true ones.)
Evaluate the results
How can we measure the performance of a feature matcher?
- Single number: Area Under the Curve (AUC)
- E.g. AUC = 0.87
- 1 is the best
ROC curves - summary
- By thresholding the match distances at different thresholds, we can generate sets of matches with different true/false positive rates
- The ROC curve is generated by computing rates at a set of threshold values swept through the full range of possible thresholds
- Area under the ROC curve (AUC) summarizes the performance of a feature pipeline (higher AUC is better)
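The threshold sweep and AUC computation described above can be sketched as follows (a minimal sketch assuming each candidate match comes with a distance and a ground-truth correct/incorrect label; `roc_auc` is a made-up helper):

```python
import numpy as np

def roc_auc(dists, labels):
    """Sweep a threshold over match distances to build an ROC curve.

    labels[i] is True if match i is actually correct. For each
    threshold t we keep matches with dist < t and compute
    TPR = TP / (# correct) and FPR = FP / (# incorrect); AUC is
    the trapezoidal area under the (FPR, TPR) curve.
    """
    dists = np.asarray(dists, float)
    labels = np.asarray(labels, bool)
    # Sweep every distinct distance, plus +inf so the curve ends
    # at (FPR, TPR) = (1, 1).
    ts = np.r_[np.sort(np.unique(dists)), np.inf]
    tpr = np.array([(dists[labels] < t).mean() for t in ts])
    fpr = np.array([(dists[~labels] < t).mean() for t in ts])
    # Trapezoidal area under the curve.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

# Perfectly separable distances (all correct matches closer than
# all incorrect ones) give the ideal AUC of 1.
auc = roc_auc([0.1, 0.2, 0.9, 1.0], [True, True, False, False])
```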
Lots of applications
Features are used for:
- Image alignment (e.g., mosaics)
- 3D reconstruction
- Motion tracking
- Object recognition
- Indexing and database retrieval
- Robot navigation
- … other
Object recognition (David Lowe)
3D Reconstruction
Augmented Reality