Guillaume Pelletier-Auger - Small neural network

Small neural network

Notes taken while learning how to build a neural network for artistic purposes.

January 27, 2019

Below are some notes that I took while following Daniel Shiffman’s video tutorials on neural networks, which are heavily inspired by Make Your Own Neural Network, a book written by Tariq Rashid. The concepts and formulas here are not my original material; I just wrote them down in order to better understand and remember them.

Feedforward algorithm

The calculations made by one of the network’s layers, which take into account its synaptic “weights”, can be represented by the matrix product written below, in which $h$ represents an intermediary layer (or “hidden layer”) of the network, $w$ represents the weights and $x$ represents the inputs. In this inverted notation, $w_{ij}$ indicates the weight from $j$ to $i$.

$$
\begin{bmatrix} h_{1} \\ h_{2} \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \end{bmatrix} \cdot \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}
$$

This product can also be represented thusly:

$$
\begin{aligned}
h_{1} &= (w_{11} \times x_{1}) + (w_{12} \times x_{2}) + (w_{13} \times x_{3}) \\
h_{2} &= (w_{21} \times x_{1}) + (w_{22} \times x_{2}) + (w_{23} \times x_{3})
\end{aligned}
$$
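The weighted sums above can be sketched in a few lines of plain Python. This is only an illustration: the helper name `mat_vec` and the numeric values are made up for the example and do not come from the tutorials.

```python
def mat_vec(w, x):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers).

    Each output element is the weighted sum of one row of w with x,
    i.e. h_i = w_i1 * x_1 + w_i2 * x_2 + ...
    """
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


# Hypothetical values: 2 hidden nodes, 3 inputs.
w = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [1.0, 0.0, -1.0]

h = mat_vec(w, x)
print(h)  # [-2.0, -2.0]
```

Each row of `w` holds the weights going into one hidden node, which matches the "weight from $j$ to $i$" convention used here.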

And it’s also possible to simplify even more:

$$
H_{i} = W_{ij} \cdot X_{j}
$$

We must also add the bias $B$, which acts as an extra input whose value is always $1$, with its own weights $b_{1}$ and $b_{2}$.

$$
\begin{aligned}
\begin{bmatrix} h_{1} \\ h_{2} \end{bmatrix} &= \begin{bmatrix} w_{11} & w_{12} & w_{13} & b_{1} \\ w_{21} & w_{22} & w_{23} & b_{2} \end{bmatrix} \cdot \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \\ 1 \end{bmatrix} \\
h_{1} &= (w_{11} \times x_{1}) + (w_{12} \times x_{2}) + (w_{13} \times x_{3}) + b_{1} \\
h_{2} &= (w_{21} \times x_{1}) + (w_{22} \times x_{2}) + (w_{23} \times x_{3}) + b_{2} \\
H_{i} &= \sigma (W_{ij}^{IH} \cdot X_{j} + B_{i}^{H})
\end{aligned}
$$
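As a quick sanity check, here is a small sketch in plain Python showing that folding the bias into the weights matrix (an extra column of $b$ values, and an extra input fixed at $1$) gives the same weighted sums as adding the bias separately. It covers only the sums, before any activation function is applied. The helper `mat_vec` and all the numbers are assumptions made for the example.

```python
def mat_vec(w, x):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


# Hypothetical weights, biases and inputs (2 hidden nodes, 3 inputs).
w = [[0.5, -1.0, 2.0],
     [1.5, 0.0, -0.5]]
b = [0.25, -0.75]
x = [2.0, 1.0, 0.5]

# Form 1: compute W . x, then add the bias vector.
separate = [s + b_i for s, b_i in zip(mat_vec(w, x), b)]

# Form 2: append each bias to its row of weights and append 1 to the inputs.
w_aug = [row + [b_i] for row, b_i in zip(w, b)]
x_aug = x + [1.0]
augmented = mat_vec(w_aug, x_aug)

print(separate)   # [1.25, 2.0]
print(augmented)  # [1.25, 2.0]
```

Both forms are equivalent; keeping the bias as its own vector is often a little easier to read in code, while the augmented matrix keeps the math to a single product.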

The sigmoid function will be used as the activation function:

$$
\sigma (x) = \frac{1}{1 + e^{-x}}
$$
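A direct transcription of this formula in Python might look like the sketch below. It uses only the standard `math` module. Note that this naive version can overflow for very large negative inputs, which is acceptable for a small illustrative network but is an assumption worth keeping in mind.

```python
import math


def sigmoid(x):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


print(sigmoid(0.0))  # 0.5
# Large positive inputs approach 1, large negative inputs approach 0.
print(sigmoid(5.0) > 0.99, sigmoid(-5.0) < 0.01)  # True True
```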

The calculation of the output layer $Y$ will finally be done this way:

$$
Y_{i} = \sigma (W_{ij}^{HO} \cdot H_{j} + B_{i}^{Y})
$$
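Putting the two layers together, the whole feedforward pass can be sketched as follows. This is a minimal illustration in plain Python, not the code from the tutorials: the function names (`layer`, `feedforward`), the network size (3 inputs, 2 hidden nodes, 1 output) and all the weights are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(w, b, inputs):
    """Compute sigmoid(W . inputs + b) for one layer.

    w[i][j] is the weight from node j of the previous layer to node i.
    """
    return [
        sigmoid(sum(w_ij * x_j for w_ij, x_j in zip(row, inputs)) + b_i)
        for row, b_i in zip(w, b)
    ]


def feedforward(x, w_ih, b_h, w_ho, b_o):
    """Return the hidden layer H and the output layer Y for the inputs x."""
    h = layer(w_ih, b_h, x)   # input  -> hidden
    y = layer(w_ho, b_o, h)   # hidden -> output
    return h, y


# Hypothetical parameters, chosen only for the example.
w_ih = [[0.2, -0.4, 0.1],
        [0.7, 0.3, -0.6]]
b_h = [0.1, -0.2]
w_ho = [[0.5, -0.3]]
b_o = [0.05]

hidden, output = feedforward([1.0, 0.5, -1.0], w_ih, b_h, w_ho, b_o)
print(hidden, output)
```

Because every value goes through the sigmoid, all the entries of `hidden` and `output` fall strictly between 0 and 1.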

Backpropagation

Once the feedforward is done, we are able to calculate the error $e$, which must then be sent from the output layer back to the preceding layers by backpropagation. Here, $w_{ij}$ represents the weight $w$ between the hidden node $j$ and the output node $i$.

$$
\begin{aligned}
e_{h_{1}} &= \left(\frac{w_{11}}{w_{11} + w_{12}} \times e_{1}\right) + \left(\frac{w_{21}}{w_{21} + w_{22}} \times e_{2}\right) \\
e_{h_{2}} &= \left(\frac{w_{12}}{w_{11} + w_{12}} \times e_{1}\right) + \left(\frac{w_{22}}{w_{21} + w_{22}} \times e_{2}\right)
\end{aligned}
$$
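To make the proportional split concrete, here is a small sketch in plain Python of the normalized version of the formula. The function name `hidden_errors_normalized` and the numbers are invented for the example. It assumes that the weights going into each output node do not sum to zero (otherwise the division fails).

```python
def hidden_errors_normalized(w, e):
    """Distribute each output error among the hidden nodes.

    w[i][j] is the weight from hidden node j to output node i,
    e[i] is the error of output node i.
    Each hidden node receives a share of e[i] proportional to its weight.
    Assumes sum(w[i]) != 0 for every output node i.
    """
    result = [0.0] * len(w[0])
    for row, e_i in zip(w, e):
        total = sum(row)  # w_i1 + w_i2 + ...
        for j, w_ij in enumerate(row):
            result[j] += (w_ij / total) * e_i
    return result


# Hypothetical weights and output errors (2 hidden nodes, 2 outputs).
w = [[3.0, 1.0],
     [1.0, 3.0]]
e = [1.0, 2.0]

e_h = hidden_errors_normalized(w, e)
print(e_h)  # [1.25, 1.75]
```

With this normalization, the hidden errors add up to the same total as the output errors (here $1.25 + 1.75 = 3$), since each output error is simply split into shares.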

We will also simplify this calculation by not normalizing the weights before multiplying them with the error:

$$
\begin{aligned}
e_{h_{1}} &= w_{11} \times e_{1} + w_{21} \times e_{2} \\
e_{h_{2}} &= w_{12} \times e_{1} + w_{22} \times e_{2}
\end{aligned}
$$

Which is equal to this matrix product:

$$
\begin{bmatrix} e_{h_{1}} \\ e_{h_{2}} \end{bmatrix} = \begin{bmatrix} w_{11} & w_{21} \\ w_{12} & w_{22} \end{bmatrix} \cdot \begin{bmatrix} e_{1} \\ e_{2} \end{bmatrix}
$$

It should be noted that the weights matrix that was used during the feedforward was transposed to be used for the backpropagation.

$$
\begin{aligned}
W &= \begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{bmatrix} \\
W^{T} &= \begin{bmatrix} w_{11} & w_{21} \\ w_{12} & w_{22} \end{bmatrix}
\end{aligned}
$$
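The simplified (non-normalized) version can be sketched in plain Python as a product of the transposed weights matrix with the output errors. The helpers `transpose` and `mat_vec` and the numeric values are assumptions made for the example.

```python
def transpose(m):
    """Swap the rows and columns of a matrix (a list of rows)."""
    return [list(col) for col in zip(*m)]


def mat_vec(w, x):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


# Hypothetical hidden -> output weights and output errors.
# w[i][j] is the weight from hidden node j to output node i.
w = [[1.0, 2.0],
     [3.0, 4.0]]
e = [0.5, -1.0]

w_t = transpose(w)
e_h = mat_vec(w_t, e)

print(w_t)  # [[1.0, 3.0], [2.0, 4.0]]
print(e_h)  # [-2.5, -3.0]
```

The first result matches the formula by hand: $e_{h_{1}} = w_{11} \times e_{1} + w_{21} \times e_{2} = 1 \times 0.5 + 3 \times (-1) = -2.5$.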

Other resources

— Machine Learning for the Web, a course by Yining Shi at ITP (NYU).
— How to build a Teachable Machine with TensorFlow.js, a tutorial by Nikhil Thorat (one of the developers of TensorFlow.js).
— Make Your Own Neural Network, by Tariq Rashid.
— Essence of Linear Algebra, a video series on the 3Blue1Brown YouTube channel.
— Neural Networks, another series by 3Blue1Brown.
— Essence of calculus, another series by 3Blue1Brown. (Chain rule, product rule.)
— Neural Networks and Deep Learning.
— Essential Math for Machine Learning, a course by Graeme Malcolm on edX.

Context

This blog post is part of my research project Towards an algorithmic cinema, started in April 2018. I invite you to read the first blog post of the project to learn more about it.