Introduction to Deep Learning

What is Deep Learning?

Deep learning is a subset of machine learning that uses multi-layered neural networks to learn hierarchical representations directly from raw data.

Why Deep Learning?

Hand-crafted features are time-consuming, brittle, and not scalable in practice. Deep learning allows us to learn the underlying features directly from the data.

The Perceptron

The structural building block of deep learning.

$$ \overbrace{\hat{y}}^{\text{Output}} = \overbrace{g\left(\underbrace{w_0}_{\text{Bias}} + \sum_{i=1}^m \underbrace{x_i}_{\text{Input}} \underbrace{w_i}_{\text{Weight}}\right)}^{\text{Non-Linear Activation Function}} $$
$$ \hat{y}=g\left(w_0+\boldsymbol{X}^T \boldsymbol{W}\right) $$ $$ \text{where:} \quad \boldsymbol{X}=\left[\begin{array}{c}x_1 \\ \vdots \\ x_m\end{array}\right] \quad \text{and} \quad \boldsymbol{W}=\left[\begin{array}{c}w_1 \\ \vdots \\ w_m\end{array}\right] $$
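To make this concrete, here is a minimal NumPy sketch of the perceptron forward pass; the sigmoid choice for $g$ and the example values are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    # One common choice for the non-linear activation g
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(X, W, w0):
    # y_hat = g(w0 + X^T W): weighted sum of inputs, plus bias, through g
    return sigmoid(w0 + X @ W)

# Example with m = 3 inputs (values are illustrative)
X = np.array([0.5, -1.2, 3.0])
W = np.array([0.4, 0.1, -0.7])
w0 = 2.0
print(perceptron(X, W, w0))
```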

Activation Functions

Activation functions control the signaling between neurons and introduce nonlinearity, which allows the network to detect complex patterns in data.

$$ \hat{y}=\textcolor{DarkGoldenrod}{g}\left(w_0+\boldsymbol{X}^T \boldsymbol{W}\right) $$

Types of Activation Function
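Three common choices are the sigmoid, the hyperbolic tangent, and the rectified linear unit (ReLU):

$$ g(z)=\frac{1}{1+e^{-z}} \qquad g(z)=\tanh(z)=\frac{e^z-e^{-z}}{e^z+e^{-z}} \qquad g(z)=\max(0, z) $$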

Building Neural Networks with a Perceptron

Simplified Version of a Perceptron

$$ z=w_0+\sum_{j=1}^m x_j w_j $$

Simplified Version of Multi-Output Perceptron

All inputs are connected to all outputs; such layers are called Dense layers.

$$ z_\textcolor{DarkGoldenrod}{i}=w_{0, \textcolor{DarkGoldenrod}{i}}+\sum_{j=1}^m x_j w_{j, \textcolor{DarkGoldenrod}{i}} $$
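A minimal NumPy sketch of such a dense layer, computing every output $z_i$ at once in matrix form (the sizes and random initialization are illustrative assumptions):

```python
import numpy as np

def dense(X, W, w0):
    # z_i = w0_i + sum_j x_j * w_{j,i}, for all outputs i simultaneously
    return w0 + X @ W  # X: (m,), W: (m, n_out), w0: (n_out,)

rng = np.random.default_rng(0)
m, n_out = 3, 2                   # illustrative layer sizes
X = rng.normal(size=m)            # inputs x_1 .. x_m
W = rng.normal(size=(m, n_out))   # weights w_{j,i}
w0 = np.zeros(n_out)              # biases w_{0,i}
print(dense(X, W, w0))            # outputs z_1, z_2
```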

Single Layer Neural Network
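As a sketch with one hidden layer (the layer superscripts are a notational convention added here), the hidden units $z_i$ feed a second dense transformation that produces the outputs:

$$ z_i=w_{0, i}^{(1)}+\sum_{j=1}^m x_j w_{j, i}^{(1)} \qquad \hat{y}_i=g\left(w_{0, i}^{(2)}+\sum_{j=1}^{d_1} g\left(z_j\right) w_{j, i}^{(2)}\right) $$

where $d_1$ denotes the number of hidden units.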

Deep Neural Network
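A deep network stacks many such layers; with the same assumed superscript notation, layer $k$'s pre-activations are computed from the activated outputs of layer $k-1$:

$$ z_{k, i}=w_{0, i}^{(k)}+\sum_{j=1}^{n_{k-1}} g\left(z_{k-1, j}\right) w_{j, i}^{(k)} $$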

Loss Functions

The loss function measures the cost incurred by the network's prediction errors.

$$ \mathcal{L}\left(\underbrace{f\left(x^{(i)} ; \boldsymbol{W}\right)}_{\text{Prediction}}, \underbrace{y^{(i)}}_{\text{Actual}}\right) $$

Types of Loss Function
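Two standard instances, writing the empirical cost over $n$ examples as $J(\boldsymbol{W})=\frac{1}{n} \sum_{i=1}^n \mathcal{L}\left(f\left(x^{(i)} ; \boldsymbol{W}\right), y^{(i)}\right)$, are the binary cross-entropy loss for classification and the mean squared error loss for regression:

$$ J(\boldsymbol{W})=-\frac{1}{n} \sum_{i=1}^n\left[y^{(i)} \log f\left(x^{(i)} ; \boldsymbol{W}\right)+\left(1-y^{(i)}\right) \log \left(1-f\left(x^{(i)} ; \boldsymbol{W}\right)\right)\right] $$

$$ J(\boldsymbol{W})=\frac{1}{n} \sum_{i=1}^n\left(y^{(i)}-f\left(x^{(i)} ; \boldsymbol{W}\right)\right)^2 $$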

Optimization Algorithms

Optimization algorithms adjust the model's parameters $\boldsymbol{W}$ to minimize the loss function.
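The canonical method is gradient descent, which repeatedly moves the weights opposite the gradient of the loss; writing the learning rate as $\eta$ (a symbol introduced here for illustration):

$$ \boldsymbol{W} \leftarrow \boldsymbol{W}-\eta \frac{\partial J(\boldsymbol{W})}{\partial \boldsymbol{W}} $$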

Types of Optimization Algorithms
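Variants such as stochastic gradient descent (SGD), momentum, and Adam all refine the same basic loop. A minimal sketch of that loop, assuming a fixed learning rate and a caller-supplied `grad_fn` (a hypothetical helper returning the gradient of the loss with respect to the weights):

```python
import numpy as np

def gradient_descent(W, grad_fn, lr=0.01, steps=1000):
    # Repeatedly apply the update W <- W - lr * dJ/dW
    for _ in range(steps):
        W = W - lr * grad_fn(W)
    return W

# Toy example: minimize J(W) = ||W||^2, whose gradient is 2W
W = np.array([3.0, -2.0])
print(gradient_descent(W, lambda W: 2 * W))  # approaches [0, 0]
```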



