Polytechnic School of Engineering of New York University
Dept. Electrical & Computer Engineering
EL-GY 6123: Image and Video Processing, Spring 2016
http://eeweb.poly.edu/~yao/EL6123
Course Description: This course introduces fundamentals of image and video
processing, including color image capture and representation; color coordinate
conversion; contrast enhancement; spatial domain filtering (linear convolution,
median and morphological filtering); two-dimensional (2D) Fourier transform and
frequency domain interpretation of linear convolution; 2D Discrete Fourier
Transform (DFT) and DFT domain filtering; image sampling and resizing;
geometric transformation and image registration; video motion characterization
and estimation; video stabilization and panoramic view generation; basic
compression techniques (entropy coding, vector quantization, predictive coding,
transform coding); JPEG image
compression standard; wavelet transform and JPEG2000 standard; video
compression using adaptive spatial and temporal prediction; video coding
standards (MPEG-x/H.26x); stereo and multi-view image
and video processing (depth from disparity, disparity estimation, view
synthesis, compression). Students will learn to implement selected algorithms
in MATLAB or C.
Prerequisites: Graduate status. EL-GY 6113 and EL-GY 6303 preferred but not
required. Undergraduate students must have completed EE-UY 3054 Signals and
Systems and EE-UY 2233 Probability.
Instructor: Professor Yao Wang, MTC2 Room 9.122, (646)-997-3469, Email: yw523 at nyu dot edu. Homepage: http://eeweb.poly.edu/~yao
Course Schedule: Monday 10:30 AM-1:00 PM, MTC 9.011
Office Hours: Wed. 4-6 PM and Thurs. 4-6 PM, or by appointment via email.
Text Book:
1. (Required) Y. Wang, J. Ostermann, and Y.-Q. Zhang, Video Processing and Communications, Prentice Hall, 2002.
2. (Required) R. C. Gonzalez and R. E. Woods, Digital Image Processing, 3rd ed., Prentice Hall, 2008. ISBN 9780131687288.
3. (Optional) J. W. Woods, Multidimensional Signal, Image, and Video Processing and Coding, 2nd ed., Academic Press / Elsevier, 2012.
Grading Policy: Midterm Exam: 40%, Final Exam: 40%, Programming
assignments: 10%, Written homework: 10%.
Homework Policy: Homework problems will be assigned every week and are
due the following week at lecture time. Late submissions are not accepted
except in exceptional circumstances. Students may work in teams but must
submit their homework separately; if you worked with others to derive your
solutions, write the names of those students at the top of your submission.
Solutions will be provided when the graded homework is returned. Please note
that both written and MATLAB assignments may be graded selectively (i.e., not
all problems are graded in each assignment), but solutions will be provided
for all problems.
Middlebury Stereo Image Database
Links to resources (lecture notes and sample exams) in previous offerings:
Other Useful Links
Tentative Course Schedule
· Week 1 (1/25): Color perception and mixing, color image and video capture and representation, color coordinate conversion, concept of histogram, contrast enhancement and other point-wise operations. Lecture note (uploaded 1/24/2016)
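For illustration, the contrast-enhancement topic of Week 1 can be sketched in C (one of the course's implementation languages). The routine below is an illustrative histogram-equalization sketch, not the lecture's reference code:

```c
#include <stdint.h>
#include <stddef.h>

/* Histogram equalization for an 8-bit grayscale image (a sketch):
   each gray level is mapped through the normalized cumulative
   histogram so the output levels spread over the full [0, 255] range. */
void equalize_histogram(const uint8_t *in, uint8_t *out, size_t n)
{
    size_t hist[256] = {0};
    for (size_t i = 0; i < n; i++) hist[in[i]]++;

    /* cumulative distribution function */
    size_t cdf[256];
    size_t acc = 0;
    for (int v = 0; v < 256; v++) { acc += hist[v]; cdf[v] = acc; }

    /* first nonzero cdf value, so the darkest level maps to 0 */
    size_t cdf_min = 0;
    for (int v = 0; v < 256; v++)
        if (cdf[v]) { cdf_min = cdf[v]; break; }

    if (cdf_min == n) {                       /* constant image: copy */
        for (size_t i = 0; i < n; i++) out[i] = in[i];
        return;
    }
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t)((cdf[in[i]] - cdf_min) * 255 / (n - cdf_min));
}
```

A low-contrast image whose pixels occupy only levels 100-101 is stretched to span 0-255.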
· Week 2 (2/1): Review of 1D Fourier transform and convolution. Concept of spatial frequency. Continuous- and discrete-space 2D Fourier transform. 2D convolution and its interpretation in the frequency domain. Implementation of 2D convolution. Frequency response. Lecture note (uploaded 1/29/2016)
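A direct implementation of 2D convolution can be sketched in C as follows; this is an illustrative sketch with zero-padded borders (one common boundary choice), not the course's reference code:

```c
/* Direct 2D convolution with zero padding (a sketch).
   img is H x W row-major; ker is (2K+1) x (2K+1).
   The kernel is flipped, as true convolution requires. */
void conv2d(const double *img, int H, int W,
            const double *ker, int K, double *out)
{
    for (int i = 0; i < H; i++)
        for (int j = 0; j < W; j++) {
            double s = 0.0;
            for (int m = -K; m <= K; m++)
                for (int n = -K; n <= K; n++) {
                    int y = i - m, x = j - n;   /* kernel flip */
                    if (y >= 0 && y < H && x >= 0 && x < W)
                        s += ker[(m + K) * (2 * K + 1) + (n + K)]
                           * img[y * W + x];
                }
            out[i * W + j] = s;
        }
}
```

Convolving a constant image with a 3x3 averaging kernel leaves interior pixels unchanged, while zero padding attenuates the borders.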
· Week 3 (2/8): Linear filtering (2D convolution) for noise removal, image sharpening, and edge detection. Median filtering and morphological filtering. Lecture note (uploaded 2/7/2016)
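As a concrete example of the nonlinear filtering covered in Week 3, a 3x3 median filter can be sketched in C; this sketch copies border pixels unchanged (one common boundary choice) and is not the assignment solution:

```c
#include <stdint.h>
#include <stdlib.h>

static int cmp_u8(const void *a, const void *b)
{
    return (int)*(const uint8_t *)a - (int)*(const uint8_t *)b;
}

/* 3x3 median filter (a sketch): each interior pixel is replaced by
   the median of its 3x3 neighborhood, which removes impulse
   ("salt-and-pepper") noise without blurring edges. */
void median3x3(const uint8_t *in, uint8_t *out, int H, int W)
{
    for (int i = 0; i < H; i++)
        for (int j = 0; j < W; j++) {
            if (i == 0 || j == 0 || i == H - 1 || j == W - 1) {
                out[i * W + j] = in[i * W + j];   /* copy border */
                continue;
            }
            uint8_t win[9];
            int k = 0;
            for (int m = -1; m <= 1; m++)
                for (int n = -1; n <= 1; n++)
                    win[k++] = in[(i + m) * W + (j + n)];
            qsort(win, 9, 1, cmp_u8);
            out[i * W + j] = win[4];   /* median of 9 samples */
        }
}
```

A single noisy pixel of value 255 in a flat region of value 10 is replaced by 10.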
· 2/15: No classes
· Week 4 (2/22): Image sampling and resizing. Design of interpolation filters. Geometric transformation. Image registration and warping. Image morphing. Lecture note (uploaded 2/21/2016)
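The interpolation step underlying resizing and geometric transformation can be sketched in C; the function below is an illustrative bilinear interpolator (border handling by clamping is my choice, not from the notes):

```c
/* Bilinear interpolation at a fractional position (y, x) in an
   H x W row-major image (a sketch). The sample is a weighted
   average of the four surrounding pixels; indices are clamped
   at the right/bottom border. */
double bilinear(const double *img, int H, int W, double y, double x)
{
    int x0 = (int)x, y0 = (int)y;
    int x1 = x0 + 1 < W ? x0 + 1 : x0;
    int y1 = y0 + 1 < H ? y0 + 1 : y0;
    double a = x - x0;   /* horizontal fraction */
    double b = y - y0;   /* vertical fraction */
    return (1 - b) * ((1 - a) * img[y0 * W + x0] + a * img[y0 * W + x1])
         +      b  * ((1 - a) * img[y1 * W + x0] + a * img[y1 * W + x1]);
}
```

Sampling the center of a 2x2 image {0, 10; 20, 30} returns the average of the four pixels, 15.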
· Week 5 (2/29): Basics of digital video: temporal frequency due to motion, frequency response of the human visual system, video sampling, moving object detection and tracking. Lecture note (uploaded 3/7/2016)
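The simplest moving-object detection scheme, thresholded frame differencing, can be sketched in C; this is an illustrative sketch (threshold value and 0/255 mask convention are my choices):

```c
#include <stdint.h>
#include <stdlib.h>

/* Moving-pixel detection by frame differencing (a sketch):
   a pixel is marked as moving (255) when the absolute intensity
   change between consecutive frames exceeds a threshold. */
void frame_difference(const uint8_t *prev, const uint8_t *curr,
                      uint8_t *mask, int n, int thresh)
{
    for (int i = 0; i < n; i++)
        mask[i] = abs((int)curr[i] - (int)prev[i]) > thresh ? 255 : 0;
}
```

A pixel jumping from 10 to 200 is flagged, while an unchanged pixel is not.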
· Week 6 (3/7): Motion estimation: 3D and 2D motion modeling, optical flow equation, block matching, fractional-pel block matching, multi-resolution block matching. Lecture note (uploaded 3/18/2016)
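Exhaustive-search block matching can be sketched in C as follows; the SAD criterion and the choice to skip out-of-frame candidates are common conventions, and this is an illustrative sketch rather than the lecture's code:

```c
#include <stdint.h>
#include <stdlib.h>
#include <limits.h>

/* Exhaustive-search block matching with the sum-of-absolute-
   differences (SAD) criterion (a sketch). Finds the integer motion
   vector (*dy, *dx) for the B x B block whose top-left corner is
   (by, bx) in the current frame, searching +/-R in the reference
   frame; candidates falling outside the frame are skipped. */
void block_match(const uint8_t *ref, const uint8_t *cur, int H, int W,
                 int by, int bx, int B, int R, int *dy, int *dx)
{
    long best = LONG_MAX;
    *dy = *dx = 0;
    for (int my = -R; my <= R; my++)
        for (int mx = -R; mx <= R; mx++) {
            int ry = by + my, rx = bx + mx;
            if (ry < 0 || rx < 0 || ry + B > H || rx + B > W)
                continue;
            long sad = 0;
            for (int i = 0; i < B; i++)
                for (int j = 0; j < B; j++)
                    sad += abs((int)cur[(by + i) * W + (bx + j)] -
                               (int)ref[(ry + i) * W + (rx + j)]);
            if (sad < best) { best = sad; *dy = my; *dx = mx; }
        }
}
```

For a 2x2 bright patch that moves one pixel down-right between frames, the estimated motion vector for the block covering it points back to (-1, -1) in the reference frame.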
· 3/14 – 3/18: Spring break
· Week 7 (3/21): Global motion estimation. Video stabilization, panoramic video generation, image blurring caused by motion, and deblurring. Lecture note (uploaded 3/18/2016)
· Week 8 (3/28), 10:00 AM-12:30 PM: Midterm exam
· Week 9 (4/4): Lossless image compression: the concept of entropy, Huffman coding, arithmetic coding, context-based arithmetic coding of bilevel images. Quantization: scalar and vector quantization, minimal-MSE quantizer design, the LBG algorithm for VQ. Lecture note (updated 4/8/2016)
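The basic scalar quantizer of Week 9 can be sketched in C; the midtread (round-to-nearest) form shown below is one standard variant, presented as an illustration rather than the lecture's definition:

```c
/* Uniform midtread scalar quantizer with step size q (a sketch):
   the quantizer index is round(x / q) and the reconstructed value
   is index * q, so the maximum quantization error is q / 2. */
int quantize(double x, double q)
{
    double v = x / q;
    return (int)(v >= 0 ? v + 0.5 : v - 0.5);   /* round to nearest */
}

double dequantize(int index, double q)
{
    return index * q;
}
```

With step size 2.0, the value 7.6 maps to index 4 and reconstructs to 8.0, an error of 0.4 < q/2.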
· Week 10 (4/11): Image representation using unitary transforms. Transform coding. JPEG image compression standard. Lecture note (updated 4/10/2016)
· Week 11 (4/18): Image representation using the wavelet transform; concept of layered coding. JPEG2000 image compression standard. Lecture note (updated 4/25/2016)
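The simplest wavelet decomposition, the Haar filter bank, can be sketched in C; JPEG2000 itself uses longer (5/3 and 9/7) filters, so this is an illustrative sketch of the general idea only:

```c
/* One level of the 1D Haar analysis filter bank with orthonormal
   scaling (a sketch): scaled pairwise sums form the low band and
   scaled pairwise differences form the high band. Applied
   separably to rows and columns, this yields the LL/LH/HL/HH
   subbands of a wavelet image decomposition. n must be even. */
void haar_analysis(const double *x, int n, double *lo, double *hi)
{
    const double r = 0.7071067811865476;   /* 1/sqrt(2) */
    for (int i = 0; i < n / 2; i++) {
        lo[i] = r * (x[2 * i] + x[2 * i + 1]);
        hi[i] = r * (x[2 * i] - x[2 * i + 1]);
    }
}
```

For a locally constant signal the high band is exactly zero, which is what makes the subband representation easy to compress.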
· Week 12 (4/25): Predictive coding. Video coding: motion-compensated prediction and interpolation, adaptive spatial prediction, block-based hybrid video coding, rate-distortion optimized mode selection, rate control, group-of-pictures (GOP) structure, and the tradeoff between coding efficiency, delay, and complexity. Lecture note (updated 4/25/2016)
· Week 13 (5/2): Overview of video coding standards (AVC/H.264, HEVC/H.265); layered coding: general concept and H.264/SVC. Lecture note (updated 5/7/2016)
· Week 14 (5/9): Stereo and multiview video: depth from disparity, disparity estimation, stereo image and video compression, multiview video compression, view synthesis. Stereo and multiview display. Depth cameras. Lecture note (updated 5/10/2016)
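The depth-from-disparity relation for a rectified, parallel stereo pair is Z = f B / d (focal length f in pixels, baseline B in meters, disparity d in pixels), and can be sketched in C; the numeric values in the usage note are hypothetical:

```c
/* Depth from disparity for a rectified parallel stereo pair
   (a sketch): Z = f * B / d. The caller must ensure d > 0;
   larger disparity means a closer point. */
double depth_from_disparity(double focal_px, double baseline_m,
                            double disparity_px)
{
    return focal_px * baseline_m / disparity_px;   /* meters */
}
```

For example, with a hypothetical f = 1000 px, B = 0.5 m, and d = 50 px, the point lies 10 m from the cameras.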
· Week 15 (5/16): Final exam
Sample exams:
Last updated: 5/20/2016, Yao Wang