Course ID 학수번호 | Course Title 교과목명 | Classification 과목구분 | Credits-Lecture-Lab 학점-강의-실습 | Prerequisites 선수과목 | Semester 수업학기 |
AUE8070 | Applied Mathematics for Deep Learning 딥러닝수학 | Elective 전공선택 | 3-3-0 | None 없음 | 1st Semester 1학기 |
Applied Mathematics for Deep Learning/딥러닝수학
Course Information 교과목
Lecturer 교강사
Affiliation 소속 | Name 성명 | Contact 연락처 |
Department of Automotive Engineering 미래자동차공학과 | Seungjae Min 민승재 | (02) 2220-0457 / seungjae@hanyang.ac.kr |
Course Description 교과목 개요
Deep Learning theory is a field at the intersection of linear algebra, probability theory and statistics, multivariate calculus, and algorithms, together with the complex optimization problems that arise from learning iteratively from data and finding hidden insights that can be used to build intelligent applications. Despite the immense possibilities of Machine and Deep Learning, a thorough mathematical understanding of these techniques is necessary to grasp the inner workings of the algorithms and to obtain good results.
Course Objective 수업 목표
The goal of the course is to understand the applied mathematics that identifies the key quantities in Deep Learning. Students learn the mathematics and understand the techniques needed to:
(1) select the right algorithm, including considerations of accuracy, training time, model complexity, and the number of parameters and features; (2) choose parameter settings and validation strategies; (3) identify underfitting and overfitting based on an understanding of the bias-variance tradeoff; (4) obtain correct confidence intervals and uncertainty estimates.
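As a minimal illustration of objective (3), the sketch below (plain Python with hypothetical toy data, not taken from the course materials) shows how an underfit model leaves a large training error that a better-matched model removes: a constant model is compared with a least-squares line on roughly linear data.

```python
# Toy data lying close to y = 2x + 1 (illustrative values).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Constant model (underfit): always predict the mean of y.
mse_const = sum((y - mean_y) ** 2 for y in ys) / n

# Least-squares line y = a + b*x (closed form for a single feature).
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x
mse_line = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / n

print(mse_const, mse_line)  # the line's training error is far smaller
```

The constant model cannot express the trend in the data (high bias), so its training error stays large; the line matches the data-generating structure and drives the error near zero.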
Textbook 교재
G. Strang, Linear Algebra and Learning from Data, Wellesley-Cambridge Press, 2019 (Book website: https://math.mit.edu/~gs/learningfromdata/)
References 부교재
N/A
Evaluation 평가방법
Midterm 중간 | Final 기말 | Attendance 출석 | Homework 과제 | Participation 수업참여도 | Project 프로젝트 | Total 총 |
40 | 40 | 0 | 20 | 0 | 0 | 100 |
Schedule 강의계획
Week 주 | Topics 주제 |
1 | Linear Algebra: Matrix, Space |
2 | Linear Algebra: SVD, PCA |
3 | Linear Algebra: Least Squares |
4 | Probability and Statistics |
5 | Optimization: Minimum Problems |
6 | Optimization: Linear Programming |
7 | Optimization: Stochastic Gradient Descent |
8 | Midterm Exam |
9 | Learning from Data: Structure of Neural Nets for Deep Learning |
10 | Learning from Data: Convolutional Neural Network (CNN) |
11 | Learning from Data: Distance Matrices, Clustering |
12 | Linear Regression Model |
13 | Logistic Regression Model: Two-value Classification |
14 | Logistic Regression Model: Multi-value Classification |
15 | Deep Learning Model |
16 | Final Exam |
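The gradient descent of weeks 5-7 can be sketched in a few lines; the objective function, step size, and starting point below are illustrative choices, not taken from the course materials. The idea is simply to step against the gradient until the iterates settle at the minimizer.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly move x a small step opposite the gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is f'(x) = 2*(x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # converges to the true minimizer x = 3
```

With step size 0.1 each iteration contracts the error by a factor of 0.8, so 100 steps bring the iterate within numerical noise of x = 3; too large a step size would instead make the iterates diverge, which is why step-size choice is a recurring theme in the optimization weeks.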
Week | Date | Lecture | Textbook | Note | Homework ftp://cdl.hanyang.ac.kr | Due |
1 | 3/16 | Overview, Linear Algebra in a Nutshell | overview Linear Algebra in a Nutshell | MATLAB Tutorial | ||
3/19 | [Applied Linear Algebra] Four fundamental subspaces | I.1~I.3 | in-class note | |||
2 | 3/23 | Summary of Previous Lecture Orthogonal matrices | I.4~I.5 | review (Part I-1) in-class note | ||
3/25 | Eigenvalues and Eigenvectors Positive Definite and Semidefinite Matrices | I.6~I.7 | in-class note | HW#1 Problems | 4/1 | |
3 | 3/30 | Positive Definite and Semidefinite Matrices Singular Value Decomposition (SVD) | I.7~I.8 | review (Part I-2) in-class note | ||
4/1 | Singular Value Decomposition (SVD) Principal Components Norms of Vectors and Matrices | I.8~I.9 I.11 | in-class note | HW#2 Problems | 4/8 | |
4 | 4/6 | Norms Principal Components | I.11, I.9 | in-class note | ||
4/8 | Principal Components Ax=b | review (Part I-3) / review-note in-class note | ||||
5 | 4/13 | [Large Matrices] Least Squares | II.2 | in-class note | HW#3 Problems | 4/20 |
4/15 | Election Day (no class) | |||||
6 | 4/20 | Computing Eigenvalues and Singular Values | II.1, II.2 | in-class note | ||
4/22 | Randomized Linear Algebra | II.4 | review (Part II) / review (Part II) note in-class note | |||
7 | 4/27 | [Low Rank] Low Rank Change in A and Its Inverse | III.1 | in-class note WolframAlpha: Computational Intelligence | ||
4/29 | Derivatives of Inverse, Eigenvalues and Singular Values | III.2 | in-class note | |||
8 | 5/4 | Interlacing Eigenvalues and Low Rank Signals Rapidly Decaying Singular Values | III.2, III.3 | in-class note | ||
5/6 | Rapidly Decaying Singular Values | III.3 | in-class note | |||
9 | 5/11 | [Statistics] Mean, Variance, Covariance Optimization | V.1, V.3 | in-class note | ||
5/13 | [Optimization] Gradient Descent | VI.4 | in-class note Related Link: Gradient Descent | |||
10 | 5/18 | Accelerating Gradient Descent Linear Programming (LP) | VI.4 VI.3 | in-class note Related Link: Momentum | ||
5/20 | Stochastic Gradient Descent (SGD) | VI.5 | in-class note Related Link: Stochastic Gradient Descent | HW#4 Problems HW#5 Computations | 5/27 6/3 |
11 | 5/25 | [Learning from Data] Construction of Deep Neural Network | VI.5 VII.1 | An overview of gradient descent optimization algorithms Neural Network Playground / Problems in-class note | ||
5/27 | Construction of Deep Neural Network | VII.1 | The Functions of Deep Learning Deep Learning: An Introduction for Applied Mathematicians in-class note | HW#6 Problems | 6/3 | |
12 | 6/1 | Backpropagation and the Chain Rule | VII.3 | Calculus on Computational Graphs: Backpropagation in-class note | ||
6/3 | Loss Function | Teaching Calculus to a Deep Learner in-class note | ||||
13 | 6/8 | Convolution | ImageNet classification with deep convolutional neural networks in-class note | |||
6/10 | Convolution Circulant Matrix, Fourier Matrix | ImageNet classification with deep convolutional neural networks in-class note | ||||
14 | 6/15 | CNN, ConvNet | VII.2 | Convolutional Neural Network | ||
6/17 | Final Exam (10:30~12:00 @ Rm 102) | |||||
15 | 6/22 | Training Neural Networks | Training Neural Network |
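The two-value logistic regression model of weeks 13-14 can be sketched as follows; the toy 1-D data, learning rate, and iteration count are illustrative assumptions. The model p(y=1|x) = sigmoid(w*x + b) is fitted by gradient descent on the mean log loss, whose gradients take the simple form (p - y)*x for w and (p - y) for b.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy two-class data: labels switch from 0 to 1 around x = 0.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):
    # Gradients of the mean log loss over the training set.
    gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / len(xs)
    gb = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * gw
    b -= lr * gb

# Classify by thresholding the predicted probability at 0.5.
preds = [1 if sigmoid(w * x + b) > 0.5 else 0 for x in xs]
print(preds)  # the learned boundary separates the two classes
```

Because the data are linearly separable and symmetric about x = 0, the bias stays near zero and the learned boundary sits at x = 0, so every training point is classified correctly; the multi-value case of week 14 replaces the sigmoid with a softmax over one weight vector per class.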
PAST EXAM
2020 | Midterm | Final exam |
[DATA]
MNIST database of handwritten digits (Yann LeCun, Courant Institute, NYU)
DELVE (University of Toronto, Data for Evaluating Learning in Valid Experiments)
UC Irvine, Machine Learning Repository
[Codes for Machine Learning]
Caffe: Convolutional Architecture for Fast Feature Embedding (2014)
Theano: A Python framework for fast computation of mathematical expressions (2016)