[CS5670] Lecture 0: Intro to Computer Vision


     

    1. What is computer vision?
    2. Why study computer vision?
    3. Course overview
    4. Images & Image filtering [time permitting]

     

    1. What is computer vision?

     

    Goal of computer vision:

    perceive the story behind the picture

     

    Compute properties of the world

    • 3D shape
    • Names of people or objects
    • What happened?

     

    Can computers match human perception?

    • Yes and no (mainly no)
      • computers can be better at 'easy' things
      • humans are better at 'hard' things
    • But huge progress
      • Accelerating in the last five years due to deep learning
      • What is considered 'hard' keeps changing

     

    Human perception has its shortcomings

    But humans can tell a lot from a little information...

     

    The goal of computer vision

     

    • Compute the 3D shape of the world
    • Recognize objects and people
    • 'Enhance' images
    • Forensics
    • Improve photos ('Computational Photography')
      • Super-resolution
      • Low-light photography
      • Depth of field on cell phone camera
      • Inpainting / image completion

     

    2. Why study computer vision?

     

    • Billions of images/videos captured per day
    • Huge number of potential applications
    • The next slides show the current state of the art (SOTA)

     

    • Optical character recognition (OCR)
    • Face detection: nearly all cameras detect faces in real time (Why?)
    • Face analysis and recognition
    • Vision-based biometrics
    • Login without a password
      • Fingerprint scanners on many new smartphones and other devices
      • Face unlock on Apple iPhone X
    • Bird identification
    • Special effects: shape capture
    • Special effects: motion capture
    • 3D face tracking w/ consumer cameras
    • Image synthesis
      • Which face is real?
    • Sports
    • Smart cars
    • Self-driving cars
    • Robotics
    • Medical imaging
    • Virtual & Augmented Reality

     

    Current state of the art

    • You just saw many examples of current systems.
      • Many of these are less than 5 years old
    • Computer vision is an active research area, and rapidly changing
      • Many new apps in the next 5 years
      • Deep learning powering many modern applications
    • Many startups across a dizzying array of areas
      • Deep learning, robotics, autonomous vehicles, medical imaging, construction, inspection, VR/AR, ...

     

    Why is computer vision difficult?

    • Viewpoint variation
    • Illumination
    • Scale
    • Intra-class variation
    • Background clutter
    • Motion
    • Occlusion

     

    Challenges: local ambiguity

    But there are lots of visual cues we can use...

     

    Bottom line

    • Perception is an inherently ambiguous problem
      • Many different 3D scenes could have given rise to a given 2D image
      • We often must use prior knowledge about the world's structure
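
    The ambiguity above can be made concrete with a small sketch. It assumes an idealized pinhole camera at the origin looking along the Z axis; the function name `project`, the focal length of 1.0, and the two sample points are illustrative choices, not taken from the lecture. A small object nearby and an object twice as large and twice as far away land on exactly the same image location, so the image alone cannot tell them apart.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) onto the image plane of an ideal pinhole camera.

    Assumption for illustration: camera at the origin, looking down +Z.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    # Perspective division: image coordinates shrink with depth.
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D scenes (hypothetical sample points):
near_point = (1.0, 0.5, 2.0)  # close to the camera
far_point = (2.0, 1.0, 4.0)   # twice as far away and twice as far off-axis

near_pixel = project(near_point)
far_pixel = project(far_point)

# Both points fall on the same 2D image location.
print(near_pixel, far_pixel)
print("same image location:", near_pixel == far_pixel)
```

    Any point along the same ray through the camera center projects identically, which is why prior knowledge (typical object sizes, scene layout, a second view) is needed to recover depth.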

     

    Project-based course whose goal is to teach you the basics of computer vision (image processing, geometry,
    recognition) in a hands-on way.

     

    Course requirements

    • Prerequisites
      • Data structures
      • Good working knowledge of Python programming
      • Linear algebra
      • Vector calculus
    • Course does not assume prior imaging experience
      • computer vision, image processing, graphics, etc.

     

    3. Course overview (tentative)

    1. Low-level vision
      • Image processing, edge detection, feature detection, cameras, image formation
    2. Geometry and algorithms
      • projective geometry, stereo, structure from motion, optimization
    3. Recognition
      • face detection/recognition, category recognition, segmentation

     

    4. Images & Image filtering [time permitting]

     

    1. Low-level vision

    Basic image processing and image formation

    • Filtering, edge detection
    • Feature extraction
    • Image formation

    2. Geometry

    • Projective geometry
    • Stereo vision
    • Multi-view stereo
    • Structure from motion

     

    3. Recognition

    • Image classification
    • Object detection
    • Convolutional Neural Networks

     

     

     

     

    Reference: CS5670: Intro to Computer Vision (Cornell Tech)
