Polytechnic School of Engineering of New York University

Dept. Electrical & Computer Engineering
EL-GY 6123 ---- Image and Video Processing, Spring 2015
http://eeweb.poly.edu/~yao/EL6123


Course Description: This course introduces the fundamentals of image and video processing, including color image capture and representation; color coordinate conversion; contrast enhancement; spatial domain filtering (linear convolution, median and morphological filtering); two-dimensional (2D) Fourier transform and frequency domain interpretation of linear convolution; 2D Discrete Fourier Transform (DFT) and DFT domain filtering; image sampling and resizing; geometric transformation and image registration; video motion characterization and estimation; video stabilization and panoramic view generation; basic compression techniques (entropy coding, vector quantization, predictive coding, transform coding); JPEG image compression standard; wavelet transform and JPEG2000 standard; video compression using adaptive spatial and temporal prediction; video coding standards (MPEGx/H26x); stereo and multiview image and video processing (depth from disparity, disparity estimation, view synthesis, compression). Students will learn to implement selected algorithms in MATLAB or C.

Prerequisites: Graduate status. Undergraduate students must have completed EE-UY 3054 Signals and Systems and EE-UY 2233 Probability. EL-GY 6113 and EL-GY 6303 are preferred but not required.

Instructor: Professor Yao Wang, MTC2 Room 9.122, (718)-260-3469, Email: yw523 at nyu dot edu. Homepage: http://eeweb.poly.edu/~yao

Course Schedule: Monday 1:30-4:00 PM

Office Hours: Monday 10 AM-12 PM, Wednesday 4-6 PM, or by appointment via email.

Textbooks:

1.    Y. Wang, J. Ostermann, and Y.-Q. Zhang, Video Processing and Communications. Prentice Hall, 2002.

2.    J. W. Woods, Multidimensional Signal, Image, and Video Processing and Coding, 2nd ed. Academic Press / Elsevier, 2012.

Grading Policy:  Midterm Exam: 40%, Final Exam: 40%, Programming assignments: 10%, Written homework: 10%.

Homework Policy: Homework problems will be assigned every week and are due the following week at lecture time. Late submissions are not accepted except in exceptional cases. Students may work in teams but must submit their homework separately. If you worked with others to derive your solutions, write the names of the students you worked with at the top of the homework you hand in. Solutions will be provided when the graded homework is returned. Please note that both written and MATLAB assignments may be graded selectively (i.e., not all problems are graded in each assignment), but solutions will be provided for all problems.

Sample Video Data

Sample Images

 

Middlebury Stereo Image Database: http://vision.middlebury.edu/stereo/ and http://vision.middlebury.edu/stereo/data/

Links to resources (lecture notes and sample exams) in previous offerings:

·            EL 5123 Image Processing

·            EL 6123 Video Processing

Other Useful Links

 

Tentative Course Schedule

·            Week 1 (1/26): Color perception and mixing, color image and video capture and representation, color coordinate conversion, concept of histogram, contrast enhancement and other point-wise operations. Lecture Note (uploaded 1/26/2015)

 

·            Week 2 (2/2): Review of 1D Fourier transform and convolution. Concept of spatial frequency. Continuous and discrete space 2D Fourier transform. 2D convolution and its interpretation in frequency domain. Implementation of 2D convolution. Frequency response. Lecture note (uploaded 2/2/2015)

 

·            Week 3 (2/9): Linear filtering (2D convolution) for noise removal, image sharpening and edge detection. Median filtering and morphological filtering. Lecture note (uploaded 2/7/2015)

 

·            2/16  No classes

 

·            Week 4 (2/23): Image sampling and resizing. Design of interpolation filters.  Geometric transformation. Image registration and warping. Image morphing. Lecture note (uploaded 2/23/2015)

 

·            Week 5 (3/2): Basics of digital video: temporal frequency due to motion, frequency response of the human visual system, video sampling, moving object detection and tracking. Lecture note (uploaded 3/1/2015)

 

·            Week 6 (3/9): Motion estimation: 3D and 2D motion modeling, optical flow equation, block matching, fractional-pel block matching, multi-resolution block matching, deformable block matching, mesh-based motion estimation. Lecture note (uploaded 3/10/2015)

 

·            3/16 – 3/20  Spring break

 

·            Week 7 (3/23): Midterm

 

·            Week 8 (3/30): Global motion estimation. Video stabilization, panoramic video generation, image blurring caused by motion, and deblurring. Lecture note (uploaded 4/3/2015)

 

·            Week 9 (4/6): Lossless image compression: the concept of entropy, Huffman coding, arithmetic coding, context-based arithmetic coding of bilevel images. Quantization: scalar and vector quantization, minimum-MSE quantizer design, LBG algorithm for VQ. Lecture note (uploaded 4/7/2015)

 

·            Week 10 (4/13): Image representation using unitary transforms. Transform coding. JPEG image compression standard. Lecture note (uploaded 4/13/2015)

 

·            Week 11 (4/20): Image representation using wavelet transform; concept of layered coding. JPEG2000 image compression standard. Lecture note (uploaded 4/17/2015)

 

·            Week 12 (4/27): Predictive coding. Video coding: motion-compensated prediction and interpolation, adaptive spatial prediction, block-based hybrid video coding, rate-distortion optimized mode selection, rate control, group of pictures (GoP) structure, tradeoff between coding efficiency, delay, and complexity. Lecture note (uploaded 4/25/2015)

 

·            Week 13 (5/4): Overview of video coding standards (AVC/H.264, HEVC/H.265); layered coding: general concept and H.264/SVC. Lecture note (uploaded 5/4/2015)

 

·            Week 14 (5/11): Stereo and multiview video: depth from disparity, disparity estimation, stereo image and video compression, multiview video compression, view synthesis. Stereo and multiview display. Depth camera. Lecture note (uploaded 5/9/2015)

 

·            Week 15 (5/18): Final Exam

 


Sample exams:

 

Last updated: 4/25/2015, Yao Wang