Deep Learning for Computer Vision

Deep Learning for Computer Vision is a course on Convolutional Neural Networks for Visual Recognition. It was taught by Prof. Justin Johnson at the University of Michigan in Winter 2022. The course materials are available on the Course Website and YouTube.

I chose this lecture series over Stanford University's popular CS231n course for two reasons. First, it includes most of the material covered in CS231n and adds more advanced topics such as attention and 3D vision. Second, it is taught by Prof. Justin Johnson, who previously lectured for CS231n. I have watched his lectures on YouTube and found them very informative and easy to understand.

I also referred to the YouTube lectures on deep learning by Prof. Seungsang Oh of Korea University. His lectures give a different perspective on the same topics, which I found very helpful. I will use his content and materials to supplement my understanding of the course.

Topics

The first lecture is an introduction to deep learning for computer vision.

The remaining lectures cover the following topics:

Lecture 2: Image Classification
Lecture 3: Linear Classifiers
Lecture 4: Regularization + Optimization
Lecture 5: Neural Networks
Lecture 6: Backpropagation
Lecture 7: Convolutional Networks
Lecture 8: CNN Architectures I
Lecture 9: Training Neural Networks I
Lecture 10: Training Neural Networks II
Lecture 11: CNN Architectures II
Lecture 12: Deep Learning Software
Lecture 13: Object Detection
Lecture 14: Object Detectors
Lecture 15: Image Segmentation
Lecture 16: Recurrent Networks
Lecture 17: Attention
Lecture 18: Vision Transformers
Lecture 19: Generative Models I
Lecture 20: Generative Models II
Lecture 21: Visualizing Models and Generating Images
Lecture 22: Self-Supervised Learning
Lecture 23: 3D Vision
Lecture 24: Videos

I will also cover some lectures from past semesters, which cover topics such as hardware and software and reinforcement learning.

Assignments

There are six assignments in the course:

PyTorch 101, k-Nearest Neighbor classifier [Instructions] [Code]
Linear Classifiers, Two-layer Neural Network, MNIST Challenge [Instructions]
Fully-Connected Neural Network, Convolutional Neural Network [Instructions]
One-Stage Detector, Two-Stage Detector [Instructions]
Image Captioning with Recurrent Neural Networks, Transformer model for simple arithmetic operations [Instructions]
Variational Autoencoder, Generative Adversarial Networks, Network Visualization, Style Transfer [Instructions]

I will post my solutions to the assignments in the future. Stay tuned for more updates on the Deep Learning for Computer Vision series!