[CS5670] Lecture 18: Structure from motion


Suyeon Cha 2022. 8. 27. 21:44


Source: CS5670 Structure from motion

Structure from motion

Structure from motion means extracting 3D information from 2D images. The 3D structure is reconstructed from a sequence of images.

― Multi-view stereo assumes that the cameras are calibrated.

― That is, the extrinsics and intrinsics are known for every view. The process of finding these intrinsic and extrinsic parameters is called calibration.

― If they are unknown, how can we compute the calibration? This is generally called structure from motion.

Large-scale structure from motion

Two-views

― Solve for the fundamental matrix and the essential matrix.

― Factorize the result into intrinsics, rotation, and translation.
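
To make the relationship between these quantities concrete, here is a minimal sketch that goes in the opposite direction: it composes an essential matrix and a fundamental matrix from a known pose and checks the epipolar constraint on one point. The camera values, the convention X2 = R X1 + t, and the helper names are assumptions for illustration, not something taken from the slides.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_from_pose(R, t):
    """E = [t]_x R, assuming camera-2 coordinates are X2 = R @ X1 + t."""
    return skew(t) @ R

def fundamental_from_essential(E, K1, K2):
    """F = K2^{-T} E K1^{-1}, which acts on pixel coordinates."""
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

# Hypothetical intrinsics and relative pose (10 degree rotation about y).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(10.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.0, 0.2])

# One 3D point, expressed in each camera's coordinates and projected to pixels.
X1 = np.array([0.3, -0.2, 5.0])
X2 = R @ X1 + t
p1 = K @ X1
p1 = p1 / p1[2]
p2 = K @ X2
p2 = p2 / p2[2]

F = fundamental_from_essential(essential_from_pose(R, t), K, K)
epipolar_residual = float(p2 @ F @ p1)  # should be numerically zero
print(epipolar_residual)
```

The two-view pipeline in the notes runs this backwards: F is estimated from matches, E follows once the intrinsics are known, and E is then decomposed into R and t.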

What about more than two views?

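― The geometry of three views is described by a 3x3x3 tensor called the trifocal tensor.

― The geometry of four views is described by a 3x3x3x3 tensor called the quadrifocal tensor.

― Beyond that, things start to get complicated.

― Instead, we solve explicitly for the camera poses and the scene geometry.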

Structure from motion

Given many images, how can we

― (a) figure out where each image was taken from, and

― (b) build a 3D model of the scene?

This is called the structure from motion (SfM) problem. Camera poses can also be computed from video (visual SLAM).

What we've seen so far

So far we have seen:

― 2D transformations between images

→ Translations, affine transformations, homographies

― Fundamental matrices

→ Relationships between 2D images, expressed through corresponding 2D (epipolar) lines.

― What's new?

We will now look at how to represent the 3D geometry of cameras and points.

Input

Triangulation & camera calibration

Suppose we know the parameters of every camera that observes a point.

― How do we compute the point's 3D location from those observations?

― This is called triangulation.
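
As a rough illustration of triangulation, the sketch below uses the standard linear (DLT) method with NumPy: every observation contributes two linear equations in the homogeneous 3D point, and the solution is the null vector of the stacked system. The two cameras and the test point are made-up values, and the function names are hypothetical.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate_dlt(cameras, pixels):
    """Linear triangulation from two or more views.

    Each view (P, (u, v)) gives the rows u*P[2] - P[0] and v*P[2] - P[1].
    """
    rows = []
    for P, (u, v) in zip(cameras, pixels):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.stack(rows))
    X_h = Vt[-1]                 # null vector = homogeneous 3D point
    return X_h[:3] / X_h[3]

# Hypothetical calibrated stereo pair: second camera shifted 1 unit along x.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, -0.3, 4.0])
observations = [project(P1, X_true), project(P2, X_true)]
X_est = triangulate_dlt([P1, P2], observations)
print(X_est)
```

With noisy observations the rays no longer meet exactly, so this linear estimate is usually treated as a starting point that is refined by minimizing reprojection error.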

Conversely, suppose we know the 3D points.

― We have matches between these points and an image.

― How do we compute the camera parameters?

― This is called camera calibration, or camera resectioning.
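
The resectioning direction can be sketched the same way. Below is a minimal linear (DLT) estimate of a 3x4 camera matrix from known 3D points and their pixel positions, assuming at least six points in general position and noise-free data. The point coordinates and the ground-truth camera are invented for the example.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def resection_dlt(points_3d, pixels):
    """Estimate a 3x4 camera matrix (up to scale) from 3D-2D matches."""
    rows = []
    zero = np.zeros(4)
    for X, (u, v) in zip(points_3d, pixels):
        X_h = np.append(X, 1.0)
        rows.append(np.concatenate([X_h, zero, -u * X_h]))
        rows.append(np.concatenate([zero, X_h, -v * X_h]))
    _, _, Vt = np.linalg.svd(np.stack(rows))
    return Vt[-1].reshape(3, 4)   # null vector holds the 12 entries of P

# Hypothetical ground-truth camera and eight non-coplanar points.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P_true = K @ np.hstack([np.eye(3), np.array([[0.1], [-0.2], [0.3]])])
points_3d = np.array([[0.0, 0.0, 4.0], [1.0, 0.5, 5.0], [-1.0, 0.8, 6.0],
                      [0.7, -1.2, 4.5], [-0.6, -0.9, 7.0], [1.5, 1.1, 6.5],
                      [-1.3, 0.2, 5.5], [0.2, 1.4, 8.0]])
pixels = np.array([project(P_true, X) for X in points_3d])

P_est = resection_dlt(points_3d, pixels)
reprojected = np.array([project(P_est, X) for X in points_3d])
max_error = float(np.abs(reprojected - pixels).max())
print(max_error)
```

P_est matches P_true only up to an overall scale factor, which is why the check compares reprojected pixels rather than matrix entries.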

Triangulation and camera calibration were both covered earlier. SfM assumes that neither the 3D points nor the camera parameters are known, and it solves both problems at once.

This is said to be a chicken-and-egg problem.

First step: how to get correspondence?

Feature detection and matching

Features can be detected with SIFT.

For each pair of images, the matches are refined by using RANSAC to estimate a fundamental matrix.
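
The refinement step can be sketched roughly as follows, with synthetic matches standing in for real SIFT matches. This is one possible arrangement (normalized eight-point fits inside a plain RANSAC loop, scored by point-to-epipolar-line distance). The threshold, iteration count, and all data are assumptions.

```python
import numpy as np

def _normalize(pts):
    """Hartley normalization: zero mean, average distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0.0, -scale * mean[0]],
                  [0.0, scale, -scale * mean[1]],
                  [0.0, 0.0, 1.0]])
    return np.column_stack([pts, np.ones(len(pts))]) @ T.T, T

def eight_point(x1, x2):
    """Fit F with x2^T F x1 = 0 from at least 8 matches (N x 2 pixel arrays)."""
    n1, T1 = _normalize(x1)
    n2, T2 = _normalize(x2)
    A = np.column_stack([n2[:, [0]] * n1, n2[:, [1]] * n1, n1])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt      # enforce rank 2
    return T2.T @ F @ T1                          # undo normalization

def epipolar_distance(F, x1, x2):
    """Pixel distance from each x2 to the epipolar line F @ x1."""
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    lines = h1 @ F.T
    return np.abs(np.sum(lines * h2, axis=1)) / np.linalg.norm(lines[:, :2], axis=1)

def ransac_fundamental(x1, x2, iters=500, thresh=1.0, seed=0):
    """Keep the 8-match hypothesis with the most inliers, then refit on them."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(x1), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(x1), size=8, replace=False)
        inliers = epipolar_distance(eight_point(x1[idx], x2[idx]), x1, x2) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return eight_point(x1[best], x2[best]), best

def project(K, R, t, X):
    x = (K @ (R @ X.T + t[:, None])).T
    return x[:, :2] / x[:, [2]]

# Synthetic data: 40 correct matches followed by 10 random mismatches.
rng = np.random.default_rng(1)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.2])
pts3d = rng.uniform([-2.0, -2.0, 4.0], [2.0, 2.0, 8.0], size=(40, 3))
x1 = np.vstack([project(K, np.eye(3), np.zeros(3), pts3d), rng.uniform(0, 640, (10, 2))])
x2 = np.vstack([project(K, R, t, pts3d), rng.uniform(0, 480, (10, 2))])

F, mask = ransac_fundamental(x1, x2)
print(mask[:40].sum(), mask[40:].sum())
```

Matches flagged as outliers are dropped before the pairwise matches are linked across images.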

Image connectivity graph

Correspondence estimation

Pairwise matches are linked into connected components that span multiple images.
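
One simple way to do this linking is a union-find over (image, feature) pairs, sketched below. The image names, feature indices, and the `build_tracks` helper are hypothetical. Real pipelines typically also discard components that contain two different features from the same image, which this sketch does not do.

```python
from collections import defaultdict

def build_tracks(pairwise_matches):
    """Group matched features into tracks (connected components).

    pairwise_matches: iterable of ((image_a, feature_a), (image_b, feature_b)).
    Returns a sorted list of tracks, each a sorted list of (image, feature).
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]   # path halving
            node = parent[node]
        return node

    for a, b in pairwise_matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())

# Feature 0 in image A matches feature 3 in B, which matches feature 1 in C.
matches = [
    (("A", 0), ("B", 3)),
    (("B", 3), ("C", 1)),
    (("A", 5), ("C", 2)),
]
tracks = build_tracks(matches)
print(tracks)
# [[('A', 0), ('B', 3), ('C', 1)], [('A', 5), ('C', 2)]]
```

Each resulting track corresponds to one candidate 3D point observed in several images.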

Input to Structure from Motion

Structure from motion

Problem size

Variables: cameras, points
Variables per camera: 6 (if calibrated), more (if uncalibrated)
Variables per point: 3
The Trevi Fountain collection has 466 input photos and more than 100,000 3D points → the optimization problem becomes very large.
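
A quick back-of-the-envelope count for the calibrated case, taking the point count as exactly 100,000 (the notes only say "more than", so this is a lower bound):

```python
# Unknowns for the Trevi Fountain example, calibrated cameras assumed.
n_cameras = 466
n_points = 100_000          # lower bound; the collection has more
params_per_camera = 6       # 3 for rotation + 3 for translation
params_per_point = 3        # x, y, z

n_unknowns = n_cameras * params_per_camera + n_points * params_per_point
print(n_unknowns)           # 302796
```

Almost all of the unknowns come from the points rather than the cameras.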

Structure from motion

Minimize sum of squared reprojection errors:
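
One common way to write this objective is shown here. The notation is an assumption rather than a transcription of the slide: $\mathbf{x}_i$ is the $i$-th 3D point, $(R_j, \mathbf{t}_j)$ is the pose of camera $j$, $P(\cdot)$ is the predicted image location, $(u_{ij}, v_{ij})$ is the observed image location, and $w_{ij}$ is 1 if point $i$ is visible in camera $j$ and 0 otherwise.

```latex
g(X, R, T) = \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij}
  \left\| P(\mathbf{x}_i, R_j, \mathbf{t}_j) -
  \begin{bmatrix} u_{ij} \\ v_{ij} \end{bmatrix} \right\|^2
```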

Minimizing this function is called bundle adjustment.
- Optimized using non-linear least squares, e.g. the Levenberg-Marquardt algorithm
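
To give a feel for the optimization, here is a deliberately small sketch of Levenberg-Marquardt on reprojection error. It refines a single 3D point while keeping two cameras fixed, whereas real bundle adjustment optimizes all cameras and points jointly and exploits sparsity. The finite-difference Jacobian, the damping schedule, and all numeric values are assumptions chosen for brevity.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def numeric_jacobian(f, x, eps=1e-6):
    """Forward-difference Jacobian of the residual function f at x."""
    r0 = f(x)
    J = np.zeros((len(r0), len(x)))
    for k in range(len(x)):
        step = np.zeros(len(x))
        step[k] = eps
        J[:, k] = (f(x + step) - r0) / eps
    return J

def levenberg_marquardt(f, x0, iters=50, lam=1e-3):
    """Minimize sum(f(x)**2) with a basic damped Gauss-Newton loop."""
    x = np.asarray(x0, dtype=float)
    cost = float(np.sum(f(x) ** 2))
    for _ in range(iters):
        r = f(x)
        J = numeric_jacobian(f, x)
        H = J.T @ J
        delta = np.linalg.solve(H + lam * np.diag(np.diag(H)), -J.T @ r)
        new_cost = float(np.sum(f(x + delta) ** 2))
        if new_cost < cost:
            x, cost, lam = x + delta, new_cost, lam * 0.1   # accept, trust more
        else:
            lam *= 10.0                                     # reject, damp more
    return x, cost

# Hypothetical fixed cameras and exact observations of one point.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
cameras = [P1, P2]
X_true = np.array([0.5, -0.3, 4.0])
observations = [project(P, X_true) for P in cameras]

def residuals(X):
    """Stacked (predicted - observed) pixel residuals over all cameras."""
    return np.concatenate([project(P, X) - obs for P, obs in zip(cameras, observations)])

X_refined, final_cost = levenberg_marquardt(residuals, x0=[0.0, 0.0, 3.0])
print(X_refined, final_cost)
```

The damping factor `lam` blends between Gauss-Newton steps (small `lam`) and short gradient-descent-like steps (large `lam`).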

Can SfM always be solved uniquely?

No...
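
One well-known reason is scale ambiguity: multiplying every 3D point and every camera translation by the same factor leaves all image measurements unchanged, so the images alone cannot fix the overall scale. A small check under simplifying assumptions (identity rotations, a made-up focal length, and hypothetical points):

```python
def project(focal, t, X):
    """Pinhole projection with identity rotation: camera coords are X + t."""
    x, y, z = X[0] + t[0], X[1] + t[1], X[2] + t[2]
    return (focal * x / z, focal * y / z)

def scaled(v, s):
    return tuple(s * c for c in v)

points = [(0.5, -0.3, 4.0), (-1.0, 0.2, 6.0)]
translations = [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
scale = 3.0

original = [project(800.0, t, X) for t in translations for X in points]
rescaled = [project(800.0, scaled(t, scale), scaled(X, scale))
            for t in translations for X in points]

max_difference = max(abs(a - b)
                     for p, q in zip(original, rescaled)
                     for a, b in zip(p, q))
print(max_difference)   # effectively zero
```

Both reconstructions explain the images equally well, so a reconstruction is determined at best up to such a transformation.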

Incremental structure from motion

Photo Tourism

SfM - Failure cases

This seems to be what is called Necker reversal: a case where it is hard to get visual cues about orientation.

The same happens with buildings that have many repeated patterns, that is, man-made scenes that are symmetric.

SfM applications

3D modeling
Surveying
Robot navigation and mapmaking
Virtual and augmented reality
Visual effects (match moving)

VR & AR

These notes summarize the lecture slides. If anything here is wrong, please let me know in the comments. Thank you.
