MIT 6.S191 – Introduction to Deep Learning (Summary Notes)
Lecture 1: Foundations of Deep Learning
1. What is deep learning? A subset of machine learning that extracts patterns and features from raw data using neural networks.
2. Why deep learning? Hand-engineered features are time-consuming, brittle, and not scalable in practice; deep learning learns the underlying features directly from data.
3. Perceptron: the simplest neural unit. It takes inputs x_i, multiplies each by a weight w_i, sums them (together with a bias), and applies an activation function.
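A minimal sketch of a single perceptron in NumPy; the sigmoid activation and the example weights, bias, and inputs are illustrative assumptions:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def perceptron(x, w, b):
        # weighted sum of the inputs plus a bias, passed through the activation
        z = np.dot(w, x) + b
        return sigmoid(z)

    # hypothetical example with 3 inputs
    x = np.array([1.0, 2.0, -1.0])
    w = np.array([0.5, -0.3, 0.8])
    b = 0.1
    print(perceptron(x, w, b))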
4. Common Activation Functions:
   Sigmoid: g(z) = 1 / (1 + e^(-z))
   Hyperbolic tangent: g(z) = (e^z - e^(-z)) / (e^z + e^(-z))
   Rectified Linear Unit (ReLU): g(z) = max(0, z)
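The three activations above as NumPy one-liners (a sketch for reference, not the lecture's code):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))   # squashes z into (0, 1)

    def tanh(z):
        return np.tanh(z)                 # squashes z into (-1, 1)

    def relu(z):
        return np.maximum(0.0, z)         # zero for negative z, identity otherwise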
5. Neural Networks: multiple layers of perceptrons (input, hidden, and output layers).
6. Forward Propagation: y = g(Wx + b), computed layer by layer from input to output.
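A sketch of a forward pass through one hidden layer, assuming ReLU in the hidden layer; the layer sizes and random weights are illustrative:

    import numpy as np

    def forward(x, W1, b1, W2, b2):
        h = np.maximum(0.0, W1 @ x + b1)   # hidden layer: g(Wx + b) with ReLU
        return W2 @ h + b2                 # output layer: another affine transform

    # hypothetical shapes: 4 inputs -> 3 hidden units -> 1 output
    rng = np.random.default_rng(0)
    x = rng.normal(size=4)
    W1 = rng.normal(size=(3, 4)); b1 = np.zeros(3)
    W2 = rng.normal(size=(1, 3)); b2 = np.zeros(1)
    print(forward(x, W1, b1, W2, b2))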
7. Loss Functions: quantify the network's error (e.g., MSE, cross-entropy).
   Empirical loss: measures the total loss over the entire dataset.
   Binary cross-entropy loss: used with models that output a probability between 0 and 1.
   Mean squared error loss: used with regression models that output continuous real numbers.
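Minimal NumPy versions of the two losses; the epsilon clipping in the cross-entropy is an added assumption for numerical safety:

    import numpy as np

    def binary_cross_entropy(y_true, y_pred, eps=1e-12):
        p = np.clip(y_pred, eps, 1 - eps)   # avoid log(0)
        return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

    def mean_squared_error(y_true, y_pred):
        return np.mean((y_true - y_pred) ** 2)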
8. Loss Optimization: find the network weights that achieve the lowest loss:
   W* = argmin_W (1/n) Σ_{i=1}^{n} L(f(x_i; W), y_i)
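In code the empirical loss is just the mean of the per-example losses; a sketch assuming some model f(x, W) and per-example loss L:

    import numpy as np

    def empirical_loss(W, xs, ys, f, L):
        # average L(f(x_i; W), y_i) over the whole dataset
        return np.mean([L(f(x, W), y) for x, y in zip(xs, ys)])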
9. Gradient Descent: iteratively updates the weights to minimize the loss.
   Process:
   1) Initialize weights randomly
   2) Loop until convergence:
   3)   Compute the gradient of the loss with respect to the weights
   4)   Update the weights in the opposite direction of the gradient
   5) Return weights
   Gradient descent algorithms (optimizers):
   1) SGD
   2) Adam
   3) Adadelta
   4) Adagrad
   5) RMSProp
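A sketch of the loop above on a tiny one-parameter problem, minimizing the MSE of a linear model; the learning rate, data, and convergence test are illustrative assumptions:

    import numpy as np

    def gradient_descent(xs, ys, lr=0.1, tol=1e-8, max_iters=1000):
        w = np.random.randn()                       # 1) initialize weights randomly
        for _ in range(max_iters):                  # 2) loop until convergence
            grad = np.mean(2 * xs * (xs * w - ys))  # 3) gradient of MSE w.r.t. w
            w_new = w - lr * grad                   # 4) step against the gradient
            if abs(w_new - w) < tol:
                return w_new
            w = w_new
        return w                                    # 5) return weights

    xs = np.array([1.0, 2.0, 3.0])
    ys = 2.0 * xs                                   # the true weight is 2
    print(gradient_descent(xs, ys))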
10. Backpropagation: applies the chain rule to compute the gradient of the loss with respect to each weight, working backward from the output layer.
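A sketch of backpropagation on a tiny network ŷ = w2 · sigmoid(w1 · x) with a squared-error loss, applying the chain rule by hand; the network shape and the example values are illustrative assumptions:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def backprop_example(x, y, w1, w2):
        # forward pass
        z1 = w1 * x
        h = sigmoid(z1)
        y_hat = w2 * h
        loss = 0.5 * (y_hat - y) ** 2

        # backward pass: chain rule from the loss back to each weight
        dloss_dyhat = y_hat - y
        dh_dz1 = h * (1 - h)                        # derivative of sigmoid
        dloss_dw2 = dloss_dyhat * h                 # dL/dw2 = dL/dy_hat * dy_hat/dw2
        dloss_dw1 = dloss_dyhat * w2 * dh_dz1 * x   # dL/dw1 = dL/dy_hat * dy_hat/dh * dh/dz1 * dz1/dw1
        return loss, dloss_dw1, dloss_dw2

    print(backprop_example(x=1.0, y=1.0, w1=0.5, w2=-0.4))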
11. Learning rate
    1) Setting the learning rate:
       a) Learning rates that are too large overshoot, become unstable, and diverge
       b) Stable learning rates converge smoothly and avoid local minima
    2) Adaptive learning rates: no longer fixed; the rate is adapted as training progresses (e.g., depending on how large the gradient is or how fast learning is happening)
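A tiny illustration of the trade-off on f(w) = w^2, comparing a large and a stable fixed step size; the specific values are assumptions for illustration:

    def descend(lr, steps=10, w=1.0):
        # gradient of f(w) = w^2 is 2w
        for _ in range(steps):
            w = w - lr * 2 * w
        return w

    print(descend(lr=1.2))   # too large: w oscillates and grows, i.e. diverges
    print(descend(lr=0.1))   # stable: w shrinks smoothly toward the minimum at 0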
12. The problem of underfitting and overfitting: an underfit model is too simple to capture the structure of the data, while an overfit model fits the training data (including its noise) so closely that it fails to generalize.
13. Regularization
    A technique that constrains the optimization problem to discourage overly complex models and improve generalization to unseen data. A sketch of both techniques follows this list.
    1) Regularization 1: Dropout
       During training, randomly set some activations to 0 so the network cannot rely on any single node.
    2) Regularization 2: Early stopping
       Stop training before the model has a chance to overfit (e.g., when the validation loss stops improving).
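A sketch of both techniques: an inverted-dropout mask applied during training, and an early-stopping check on validation loss; the drop probability, patience, and the surrounding training loop are illustrative assumptions:

    import numpy as np

    def dropout(activations, p_drop=0.5, training=True):
        # during training, randomly zero activations and rescale the survivors
        if not training:
            return activations
        mask = np.random.rand(*activations.shape) >= p_drop
        return activations * mask / (1.0 - p_drop)

    def train_with_early_stopping(train_step, val_loss_fn, patience=5, max_epochs=100):
        best, waited = float("inf"), 0
        for epoch in range(max_epochs):
            train_step()             # one pass over the training data
            v = val_loss_fn()        # monitor loss on held-out data
            if v < best:
                best, waited = v, 0
            else:
                waited += 1
                if waited >= patience:   # stop before the model starts to overfit
                    break
        return best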