{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# A Simple Convolutional Neural Network (using Keras)\n",
"\n",
">## Explaining the code \n",
"> >#### Original Articles : \n",
"* [An Intuitive Guide to CNN](https://www.freecodecamp.org/news/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050/)\n",
"* [CNN : How to Build one in Keras & PyTorch](https://missinglink.ai/guides/neural-network-concepts/convolutional-neural-network-build-one-keras-pytorch/)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import the required libraries:\n",
"1. **Numpy**:\n",
"[NumPy](https://numpy.com.cn) is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.\n",
"\n",
"1. **Keras**:[Keras](https://keras.org.cn/) is a high-level neural networks API, written in Python.\n",
" * Layers\n",
" * **Activation** : This is a method that applies an activation function to the output.\n",
" * **MaxPool2D** : This method creates a Pooling layer to implement Max Pooling operation.\n",
" * **Conv2D** : This method is used to create a convolution layer.\n",
" * **Flatten** : This method flattens (converting into 1-Dimension) the input without affecting the batch size.\n",
" * **Dense** : This method is used to create a fully connected neural network. \n",
" * Models\n",
" * **Sequential**: The [Sequential model](https://keras.org.cn/getting-started/sequential-model-guide/) is a linear stack of layers. We can create a Sequential model by passing a list of layer instances to the constructor or just by simply adding layers via the **.add()** method."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from keras.layers import Conv2D, Activation, MaxPool2D, Flatten, Dense\n",
"from keras.models import Sequential"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Convolutional Neural Network Architecture\n",
"A typical Convolutional Neural Network consists of the following parts:\n",
"\n",
"### 1. The Hidden Layers/ Feature Extraction Part \n",
"#### (The Covolution and Pooling Layer)\n",
"In this part of the network a series of convolution and pooling operations takes place, which are used for detecting different features of the input image.\n",
"\n",
"* In the code snippet below, we initialise a **Sequential model**. At first we dont provide the constructor with the layers,but rather we go on adding the layers to the model by using the **.add()** fuction as can be seen from the following snippets of code. \n",
"* This model lets us build a model by adding on one layer at a time."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Images fed into this model are 28 x 28 pixels with 3 channels\n",
"img_shape = (28,28,1)\n",
"# Set up the model\n",
"model = Sequential()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* In the above line of codes,at first we define the **img_shape**.Here the **img_shape** denotes the input image shape to the model. It is important as the model needs to knpw what input shape it should expect to be fed.**Therefore the first layer in a Sequential model needs to always receive information about the input shape.**\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Add convolutional layer with 3, 3 by 3 filters\n",
"model.add(Conv2D(3,kernel_size=3,input_shape=img_shape))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### [Conv2D()](https://keras.org.cn/layers/convolutional/)\n",
"This function creates a 2D Convolution Layer. \n",
"\n",
"#### What does a Convolution Layer do?\n",
"* A convolution layer scans a source image with a filter to extract features which may be important for classification. (The filter is also called the convolution kernel.)\n",
"* The kernel also contains weights,which are tuned during the training of the model to achieve the most accurate predictions.\n",
"* **Parameters** :\n",
" * The first parameter (3 in this case), defines the number of neural nodes in each layer.\n",
" * **kernel_size** defines the filter size—this is the area in square pixels the model will use to scan the image. Kernel size of 3 means the model looks at a square of 3×3 pixels at a time.\n",
" * **input_shape** is the pixel size of the images.\n"
]
},
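{
"cell_type": "markdown",
"metadata": {},
"source": [
"> *The following cell is a minimal, illustrative NumPy sketch (not part of the Keras model): it shows how a single 3×3 kernel slides over a small 5×5 image to produce a feature map. The image and kernel values are made up purely for illustration.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch only: convolve a toy 5x5 'image' with one 3x3 kernel\n",
"import numpy as np\n",
"\n",
"image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 image (made-up values)\n",
"kernel = np.array([[1, 0, -1],\n",
"                   [1, 0, -1],\n",
"                   [1, 0, -1]], dtype=float)       # example 3x3 kernel\n",
"\n",
"feature_map = np.zeros((3, 3))                     # output is 3x3 (no padding, stride 1)\n",
"for i in range(3):\n",
"    for j in range(3):\n",
"        patch = image[i:i+3, j:j+3]                # 3x3 window of the image\n",
"        feature_map[i, j] = np.sum(patch * kernel) # weighted sum = one output value\n",
"\n",
"print(feature_map)"
]
},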
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### [MaxPool2D()](https://keras.org.cn/layers/pooling/)\n",
"This function creates a 2D Pooling Layer.\n",
"\n",
"#### What does a Pooling Layer do?\n",
"\n",
"* Its function is to **progressively reduce the spatial size of the representation to reduce the amount of parameters and computation in the network**, and hence to also **control overfitting**.\n",
"* The Pooling Layer **operates independently on every depth slice of the input and resizes it spatially**, using the MAX operation.\n",
"\n",
"### Activation()\n",
"The **Activation()** method is used to apply a specific activation function to the output of the Pooling layer.\n",
"Here a **relu** activation function is added to the output layer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Add relu activation to the layer \n",
"model.add(Activation('relu'))\n",
"#Pooling\n",
"model.add(MaxPool2D(2))"
]
},
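{
"cell_type": "markdown",
"metadata": {},
"source": [
"> *The following cell is a minimal NumPy sketch (not part of the Keras model) of what 2×2 max pooling does: every non-overlapping 2×2 block of the input is replaced by its maximum value, halving the spatial size. The input values are made up for illustration.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch only: 2x2 max pooling on a toy 4x4 input\n",
"import numpy as np\n",
"\n",
"x = np.array([[1, 3, 2, 4],\n",
"              [5, 6, 1, 2],\n",
"              [7, 2, 9, 1],\n",
"              [3, 4, 5, 6]], dtype=float)\n",
"\n",
"# Group the 4x4 array into 2x2 blocks and take the max of each block\n",
"pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))\n",
"print(pooled)   # expected: [[6. 4.] [7. 9.]]"
]
},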
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Classification Layer\n",
"After the convolution layer, there is classification layer that consists of fully connected layers, where neurons of one layer are connected with every activations from the previous layer.\n",
"The thing about the fully connected layers is that it can only be fed **1-Dimensional** data. With the output data from the previous layer being **3-Dimensional** , it needs to the **flatten out**(converted into 1-Dimensional) before it can be fed into the Classification layer.\n",
"\n",
"For this a **Flatten** layer is added, which takes the output of the convolution layer and turns it into a format that can be used by the densely connected neural layer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Fully connected layers\n",
"# Use Flatten to convert 3D data to 1D\n",
"model.add(Flatten())\n",
"# Add dense layer with 10 neurons\n",
"model.add(Dense(10))\n",
"# we use the softmax activation function for our last layer\n",
"model.add(Activation('softmax'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Dense()\n",
"The final layer of the model is of type 'Dense', a densely-connected neural layer which will give the final classification/prediction.\n",
"* The parameter(in this case 10), refers to **the number of output nodes.**\n",
"* After that a **softmax** activation function is added to the output layer. It will take the output of the Dense layer and convert it into meaningful probabilities that will ultimately help in making the final prediction.\n"
]
},
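{
"cell_type": "markdown",
"metadata": {},
"source": [
"> *The following cell is a small, standalone NumPy sketch of the softmax function mentioned above: it turns a vector of raw scores into probabilities that sum to 1. The scores are made up for illustration.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch only: softmax on a toy vector of 3 scores\n",
"import numpy as np\n",
"\n",
"scores = np.array([2.0, 1.0, 0.1])\n",
"probs = np.exp(scores) / np.sum(np.exp(scores))\n",
"print(probs)        # roughly [0.659 0.242 0.099]\n",
"print(probs.sum())  # approximately 1.0"
]
},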
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Summary()\n",
"Now just to get an overview of the model we have created ,we will use the **summary() function**. This will list out the details regarding the Layer,the Output shape and the number of parameters."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# give an overview of our model\n",
"model.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Compilation:\n",
"Before the training process, we have to put together a learning process in a particular form.This is done via compile(). It consists of 3 elements: \n",
"* **An Optimiser** : string identifier of an existing optimizer/a call to an optimizer function\n",
"* **A loss function** : string identifier of an existing loss function/a call to a loss function\n",
"* **A list of metric** : string identifier of an existing metric or a call to metric function\n",
" > * The metrics defines how the success of the model is evaluated.\n",
" > * we will use the ‘accuracy’ metric to calculate an accuracy score on the testing/validation set of images."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy'])"
]
},
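{
"cell_type": "markdown",
"metadata": {},
"source": [
"> *As noted above, compile() accepts either string identifiers or actual optimizer/loss objects. The cell below is an optional, equivalent sketch of the same compilation step using objects instead of strings (assuming a standalone Keras 2.x installation; import paths can differ between Keras versions).*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional, equivalent form of the compile() call above, using objects instead of strings\n",
"from keras.optimizers import Adam\n",
"from keras.losses import categorical_crossentropy\n",
"\n",
"model.compile(loss=categorical_crossentropy, optimizer=Adam(), metrics=['accuracy'])"
]
},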
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our model is complete.\n",
"\n",
"### The Training Phase:\n",
"Now we will train the model.\n",
"> We will use the MNIST dataset which is a benchmark deep learning dataset, containing 70,000 handwritten numbers from 0-9."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#importing the datasets\n",
"from keras.datasets import mnist\n",
"#loading the datasets\n",
"(X_train, y_train), (X_test, y_test) = mnist.load_data()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"plt.imshow(X_train[0])\n",
"print(X_train[0].shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### In the code snippet below, reshaping is done.\n",
"## But WHY?\n",
"**Reshaping of the two sets of images, X_train and X_test is done so that their shape matches the shape expected by our CNN model.**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"X_train = X_train.reshape(60000,28,28,1)\n",
"X_test = X_test.reshape(10000,28,28,1)"
]
},
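{
"cell_type": "markdown",
"metadata": {},
"source": [
"> *Optional sanity check: after reshaping, both sets of images should have an explicit channel dimension matching the img_shape of (28, 28, 1) defined earlier.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional sanity check of the shapes after reshaping\n",
"print(X_train.shape)   # expected: (60000, 28, 28, 1)\n",
"print(X_test.shape)    # expected: (10000, 28, 28, 1)"
]
},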
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### One-hot encoding:\n",
"In one-hot encoding we create a column for each classification category, with each column containing binary values indicating if the current image belongs to that category or not."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from keras.utils import to_categorical\n",
"y_train = to_categorical(y_train)\n",
"y_test = to_categorical(y_test)"
]
},
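{
"cell_type": "markdown",
"metadata": {},
"source": [
"> *Optional check of the encoding: each label is now a length-10 row containing a single 1 at the position of the digit it represents.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional check of the one-hot encoded labels\n",
"print(y_train.shape)   # expected: (60000, 10)\n",
"print(y_train[0])      # a row of zeros with a single 1 at the index of the digit"
]
},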
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### [fit( )](https://keras.org.cn/models/sequential/)\n",
">* This method trains the model for a fixed number of epochs.\n",
"* An epoch is an iteration over the entire x and y data provided. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model.fit(X_train, y_train,validation_data=(X_test,y_test) ,epochs=10, verbose=2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Now its time to test our Model!!\n",
"We can test our model against a/some specific inputs with the help of the **predict( )** function as done in the code snippet below.\n",
"> * **Output Format**\n",
" The output will consists of 10 probabilities, each representing the probability of being a particular digit from 0-9."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model.predict(X_test[:4])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Improving the displaying of the results:\n",
"> By directly displaying the predicted number instead of the probabilities.(i.e. just refining the output. )"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def return_Digit(x):\n",
" a=np.where(x==max(x))\n",
" print(int(a[0]))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prediction=model.predict(X_test[:4])\n",
"np.apply_along_axis(return_Digit,axis=1,arr=prediction)"
]
},
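{
"cell_type": "markdown",
"metadata": {},
"source": [
"> *Optionally, we can compare the predicted digits with the true labels. Since y_test was one-hot encoded above, np.argmax recovers the original digit labels.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional: the true labels for the same four test images, recovered from the one-hot encoding\n",
"print(np.argmax(y_test[:4], axis=1))"
]
},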
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### ---------------------------------------------------------------------------"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}