Convolutional Neural Network Using Keras

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# A Simple Convolutional Neural Network (using Keras)\n",
    "\n",
    ">## Explaining the code \n",
    "> >#### Original Articles :  \n",
    "* [An Intuitive Guide to CNN](https://www.freecodecamp.org/news/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050/)\n",
    "* [CNN : How to Build one in Keras & PyTorch](https://missinglink.ai/guides/neural-network-concepts/convolutional-neural-network-build-one-keras-pytorch/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import the required libraries:\n",
    "1. **Numpy**:\n",
    "[NumPy](https://numpy.com.cn) is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.\n",
    "\n",
    "1. **Keras**: [Keras](https://keras.org.cn/) is a high-level neural networks API, written in Python.\n",
    "     * Layers\n",
    "        * **Activation** : A layer that applies an activation function to its input.\n",
    "        * **MaxPool2D** : A layer that performs the max pooling operation on 2D spatial data.\n",
    "        * **Conv2D** : A 2D convolution layer.\n",
    "        * **Flatten** : A layer that flattens its input (converting each sample into 1 dimension) without affecting the batch size.\n",
    "        * **Dense** : A fully connected (densely connected) layer.\n",
    "     * Models\n",
    "        * **Sequential**: The [Sequential model](https://keras.org.cn/getting-started/sequential-model-guide/) is a linear stack of layers. We can create a Sequential model by passing a list of layer instances to the constructor, or by adding layers one at a time via the **.add()** method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from keras.layers import Conv2D, Activation, MaxPool2D, Flatten, Dense\n",
    "from keras.models import Sequential"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The Convolutional Neural Network Architecture\n",
    "A typical Convolutional Neural Network consists of the following parts:\n",
    "\n",
    "### 1. The Hidden Layers / Feature Extraction Part \n",
    "#### (The Convolution and Pooling Layers)\n",
    "In this part of the network, a series of convolution and pooling operations takes place, which detect different features of the input image.\n",
    "\n",
    "* In the code snippet below, we initialise a **Sequential model**. Rather than passing the layers to the constructor, we add them one at a time with the **.add()** method, as the following snippets show.\n",
    "* This lets us build the network one layer at a time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Images fed into this model are 28 x 28 pixels with 1 channel (grayscale)\n",
    "img_shape = (28, 28, 1)\n",
    "# Set up the model\n",
    "model = Sequential()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* In the code above, we first define **img_shape**, the shape of the images fed into the model. This matters because the model needs to know what input shape to expect. **Therefore the first layer in a Sequential model always needs to receive information about the input shape.**\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Add a convolutional layer with 3 filters, each 3 x 3\n",
    "model.add(Conv2D(3, kernel_size=3, input_shape=img_shape))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### [Conv2D()](https://keras.org.cn/layers/convolutional/)\n",
    "This function creates a 2D Convolution Layer. \n",
    "\n",
    "####  What does a Convolution Layer do?\n",
    "* A convolution layer scans a source image with a filter to extract features which may be important for classification. (The filter is also called the convolution kernel.)\n",
    "* The kernel contains weights, which are tuned during the training of the model to achieve the most accurate predictions.\n",
    "* **Parameters** :\n",
    "    * The first parameter (3 in this case) defines the number of filters in the layer, and therefore the number of feature maps it outputs.\n",
    "    * **kernel_size** defines the filter size: the square area of pixels the model uses to scan the image. A kernel size of 3 means the model looks at a square of 3×3 pixels at a time.\n",
    "    * **input_shape** is the shape of each input image, given as (height, width, channels).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### [MaxPool2D()](https://keras.org.cn/layers/pooling/)\n",
    "This function creates a 2D Pooling Layer.\n",
    "\n",
    "#### What does a Pooling Layer do?\n",
    "\n",
    "* Its function is to **progressively reduce the spatial size of the representation to reduce the number of parameters and the amount of computation in the network**, and hence to also **control overfitting**.\n",
    "* The Pooling Layer **operates independently on every depth slice of the input and resizes it spatially**, using the MAX operation.\n",
    "\n",
    "### Activation()\n",
    "The **Activation()** layer applies the given activation function to the output of the layer added before it.\n",
    "Here a **relu** activation is applied to the output of the convolution layer, before the pooling layer is added."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Add relu activation to the layer \n",
    "model.add(Activation('relu'))\n",
    "#Pooling\n",
    "model.add(MaxPool2D(2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Classification Layer\n",
    "After the convolution and pooling layers comes the classification part, which consists of fully connected layers: every neuron in a layer is connected to every activation of the previous layer.\n",
    "A fully connected layer can only be fed **1-Dimensional** data per sample. Because the output of the previous layer is **3-Dimensional**, it needs to be **flattened** (converted into 1-Dimensional) before it can be fed into the classification layer.\n",
    "\n",
    "For this a **Flatten** layer is added, which takes the output of the pooling layer and turns it into a format that the densely connected layer can use."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Fully connected layers\n",
    "# Use Flatten to convert 3D data to 1D\n",
    "model.add(Flatten())\n",
    "# Add dense layer with 10 neurons\n",
    "model.add(Dense(10))\n",
    "# we use the softmax activation function for our last layer\n",
    "model.add(Activation('softmax'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Dense()\n",
    "The final layer of the model is of type 'Dense', a densely-connected neural layer which will give the final classification/prediction.\n",
    "* The parameter (10 in this case) is **the number of output nodes**, one for each digit class from 0 to 9.\n",
    "* After that a **softmax** activation function is applied to the output of the Dense layer. It converts the raw outputs into probabilities that sum to 1, from which the final prediction is made.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Summary()\n",
    "To get an overview of the model we have created, we use the **summary()** method. It lists each layer together with its output shape and number of parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# give an overview of our model\n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compilation:\n",
    "Before training, we have to configure the learning process. This is done via compile(), which takes 3 arguments: \n",
    "* **An optimizer** : the string identifier of an existing optimizer, or an optimizer instance.\n",
    "* **A loss function** : the string identifier of an existing loss function, or a loss function object.\n",
    "* **A list of metrics** : each given as the string identifier of an existing metric or as a metric function.\n",
    " > * The metrics define how the success of the model is evaluated.\n",
    " > * We will use the 'accuracy' metric to calculate an accuracy score on the testing/validation set of images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our model is complete.\n",
    "\n",
    "### The Training Phase:\n",
    "Now we will train the model.\n",
    "> We will use the MNIST dataset, a benchmark deep learning dataset containing 70,000 images of handwritten digits from 0 to 9 (60,000 for training and 10,000 for testing)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#importing the datasets\n",
    "from keras.datasets import mnist\n",
    "#loading the datasets\n",
    "(X_train, y_train), (X_test, y_test) = mnist.load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "plt.imshow(X_train[0])\n",
    "print(X_train[0].shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### In the code snippet below, reshaping is done.\n",
    "## But WHY?\n",
    "**The two sets of images, X_train and X_test, are reshaped so that their shape matches the input shape expected by our CNN model: a trailing channel dimension of size 1 is added to each 28 x 28 image.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train = X_train.reshape(60000,28,28,1)\n",
    "X_test = X_test.reshape(10000,28,28,1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### One-hot encoding:\n",
    "In one-hot encoding we create a column for each classification category, with each column containing binary values indicating if the current image belongs to that category or not."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from keras.utils import to_categorical\n",
    "y_train = to_categorical(y_train)\n",
    "y_test = to_categorical(y_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### [fit( )](https://keras.org.cn/models/sequential/)\n",
    ">* This method trains the model for a fixed number of epochs.\n",
    "* An epoch is an iteration over the entire x and y data provided. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.fit(X_train, y_train,validation_data=(X_test,y_test) ,epochs=10, verbose=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Now it's time to test our Model!!\n",
    "We can test our model against specific inputs with the help of the **predict( )** function, as done in the code snippet below.\n",
    "> * **Output Format**\n",
    "    For each input image, the output consists of 10 probabilities, each representing the probability of the image being a particular digit from 0-9."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.predict(X_test[:4])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Improving the displaying of the results:\n",
    "> We display the predicted digit directly instead of the 10 probabilities (i.e. we just refine the output)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def return_Digit(x):\n",
    "    # The index of the highest probability is the predicted digit\n",
    "    print(int(np.argmax(x)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "prediction=model.predict(X_test[:4])\n",
    "np.apply_along_axis(return_Digit,axis=1,arr=prediction)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### ---------------------------------------------------------------------------"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
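About this algorithm

A Simple Convolutional Neural Network (using Keras)

Explaining the code

Original Articles

  • An Intuitive Guide to CNN: https://www.freecodecamp.org/news/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050/
  • CNN : How to Build one in Keras & PyTorch: https://missinglink.ai/guides/neural-network-concepts/convolutional-neural-network-build-one-keras-pytorch/

Import the required libraries

  1. Numpy: NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

  2. Keras: Keras is a high-level neural networks API, written in Python.

    • Layers
      • Activation : A layer that applies an activation function to its input.
      • MaxPool2D : A layer that performs the max pooling operation on 2D spatial data.
      • Conv2D : A 2D convolution layer.
      • Flatten : A layer that flattens its input (converting each sample into 1 dimension) without affecting the batch size.
      • Dense : A fully connected (densely connected) layer.
    • Models
      • Sequential: The Sequential model is a linear stack of layers. We can create a Sequential model by passing a list of layer instances to the constructor, or by adding layers one at a time via the .add() method.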
import numpy as np
from keras.layers import Conv2D, Activation, MaxPool2D, Flatten, Dense
from keras.models import Sequential

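The Convolutional Neural Network Architecture

A typical Convolutional Neural Network consists of the following parts:

1. The Hidden Layers / Feature Extraction Part

(The Convolution and Pooling Layers)

In this part of the network, a series of convolution and pooling operations takes place, which detect different features of the input image.

  • In the code snippet below, we initialise a Sequential model. Rather than passing the layers to the constructor, we add them one at a time with the .add() method, as the following snippets show.
  • This lets us build the network one layer at a time.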
# Images fed into this model are 28 x 28 pixels with 1 channel (grayscale)
img_shape = (28,28,1)
# Set up the model
model = Sequential()
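  • In the code above, we first define img_shape, the shape of the images fed into the model. This matters because the model needs to know what input shape to expect. Therefore the first layer in a Sequential model always needs to receive information about the input shape.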
# Add a convolutional layer with 3 filters, each 3 x 3
model.add(Conv2D(3,kernel_size=3,input_shape=img_shape))

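Conv2D()

This function creates a 2D Convolution Layer.

What does a Convolution Layer do?

  • A convolution layer scans a source image with a filter to extract features which may be important for classification. (The filter is also called the convolution kernel.)
  • The kernel contains weights, which are tuned during the training of the model to achieve the most accurate predictions.
  • Parameters :
    • The first parameter (3 in this case) defines the number of filters in the layer, and therefore the number of feature maps it outputs.
    • kernel_size defines the filter size: the square area of pixels the model uses to scan the image. A kernel size of 3 means the model looks at a square of 3×3 pixels at a time.
    • input_shape is the shape of each input image, given as (height, width, channels).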
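
To make the effect of these parameters concrete, here is a minimal sketch that computes the output shape and the number of trainable weights of a convolution layer by hand. The helper names are hypothetical, and the formulas assume no padding and a stride of 1, which are the Keras defaults for Conv2D.

```python
def conv2d_output_shape(height, width, kernel_size, filters, stride=1):
    """Output shape of a 2D convolution without padding ('valid')."""
    out_h = (height - kernel_size) // stride + 1
    out_w = (width - kernel_size) // stride + 1
    return (out_h, out_w, filters)


def conv2d_param_count(kernel_size, in_channels, filters):
    """Each filter has kernel_size * kernel_size * in_channels weights plus one bias."""
    return filters * (kernel_size * kernel_size * in_channels + 1)


# Values taken from the layer above: 28 x 28 x 1 input, 3 filters of size 3 x 3
print(conv2d_output_shape(28, 28, kernel_size=3, filters=3))        # (26, 26, 3)
print(conv2d_param_count(kernel_size=3, in_channels=1, filters=3))  # 30
```

So each 28 x 28 image leaves the convolution layer as three 26 x 26 feature maps, one per filter.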

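MaxPool2D()

This function creates a 2D Pooling Layer.

What does a Pooling Layer do?

  • Its function is to progressively reduce the spatial size of the representation to reduce the number of parameters and the amount of computation in the network, and hence to also control overfitting.
  • The Pooling Layer operates independently on every depth slice of the input and resizes it spatially, using the MAX operation.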
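
As an illustration of the MAX operation, the following NumPy sketch applies 2 x 2 max pooling with stride 2 to a small array. It is not how Keras implements the layer; the function name is hypothetical and the input height and width are assumed to be even.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2 on a (height, width, channels) array.

    Assumes height and width are even.
    """
    h, w, c = feature_map.shape
    # Split rows and columns into non-overlapping 2 x 2 blocks
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2, c)
    # Keep the largest value of each block, separately for every channel
    return blocks.max(axis=(1, 3))


x = np.arange(16, dtype=float).reshape(4, 4, 1)
pooled = max_pool_2x2(x)
print(pooled.shape)       # (2, 2, 1)
print(pooled[:, :, 0])
```

Applied to the 26 x 26 x 3 output of the convolution layer, the same operation gives a 13 x 13 x 3 result.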

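Activation()

The Activation() layer applies the given activation function to the output of the layer added before it. Here a relu activation is applied to the output of the convolution layer, before the pooling layer is added.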

# Add relu activation to the layer 
model.add(Activation('relu'))
#Pooling
model.add(MaxPool2D(2))

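2. Classification Layer

After the convolution and pooling layers comes the classification part, which consists of fully connected layers: every neuron in a layer is connected to every activation of the previous layer. A fully connected layer can only be fed 1-Dimensional data per sample. Because the output of the previous layer is 3-Dimensional, it needs to be flattened (converted into 1-Dimensional) before it can be fed into the classification layer.

For this a Flatten layer is added, which takes the output of the pooling layer and turns it into a format that the densely connected layer can use.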

#Fully connected layers
# Use Flatten to convert 3D data to 1D
model.add(Flatten())
# Add dense layer with 10 neurons
model.add(Dense(10))
# we use the softmax activation function for our last layer
model.add(Activation('softmax'))

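Dense()

The final layer of the model is of type 'Dense', a densely-connected neural layer which will give the final classification/prediction.

  • The parameter (10 in this case) is the number of output nodes, one for each digit class from 0 to 9.
  • After that a softmax activation function is applied to the output of the Dense layer. It converts the raw outputs into probabilities that sum to 1, from which the final prediction is made.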
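
The softmax step can be sketched in a few lines of NumPy. This is an illustration rather than the Keras implementation, and the input scores are made-up values standing in for the raw outputs of a Dense layer.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    # Subtracting the maximum does not change the result but avoids overflow
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


scores = np.array([2.0, 1.0, 0.1])  # hypothetical raw outputs
probs = softmax(scores)
print(probs)        # largest score gets the largest probability
print(probs.sum())  # the probabilities sum to 1 (up to rounding)
```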

Summary()

To get an overview of the model we have created, we use the summary() method. It lists each layer together with its output shape and number of parameters.

# give an overview of our model
model.summary()
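
As a cross-check for the summary, the shapes and parameter counts of this model can be worked out by hand. The sketch below assumes the Keras defaults used above (no padding, stride 1 for the convolution, 2 x 2 pooling); the variable names are only illustrative.

```python
# Shapes after each layer for a 28 x 28 x 1 input (batch dimension omitted)
conv_out = (26, 26, 3)   # 3 x 3 kernel, no padding: 28 - 3 + 1 = 26
pool_out = (13, 13, 3)   # 2 x 2 max pooling halves height and width
flat_units = pool_out[0] * pool_out[1] * pool_out[2]

conv_params = 3 * (3 * 3 * 1 + 1)     # 3 filters, 9 weights + 1 bias each
dense_params = flat_units * 10 + 10   # weight matrix plus 10 biases
total_params = conv_params + dense_params

print(flat_units, conv_params, dense_params, total_params)  # 507 30 5080 5110
```

The activation, pooling and flatten layers have no trainable parameters, so only the convolution and dense layers contribute to the total.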

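Compilation

Before training, we have to configure the learning process. This is done via compile(), which takes 3 arguments:

  • An optimizer : the string identifier of an existing optimizer, or an optimizer instance.
  • A loss function : the string identifier of an existing loss function, or a loss function object.
  • A list of metrics : each given as the string identifier of an existing metric or as a metric function.
    • The metrics define how the success of the model is evaluated.
    • We will use the 'accuracy' metric to calculate an accuracy score on the testing/validation set of images.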
model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy'])
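
To show what the chosen loss and metric measure, here is a small NumPy sketch of categorical cross-entropy and accuracy. It is a simplified illustration, not the Keras implementation; the labels and predictions are made-up values.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    # Clip to avoid taking the logarithm of 0
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))


y_true = np.array([[0, 0, 1], [1, 0, 0]])               # hypothetical one-hot labels
y_pred = np.array([[0.1, 0.1, 0.8], [0.5, 0.3, 0.2]])   # hypothetical predictions

loss = categorical_crossentropy(y_true, y_pred)
# Accuracy: fraction of samples whose most probable class matches the label
accuracy = float(np.mean(y_true.argmax(axis=1) == y_pred.argmax(axis=1)))
print(loss, accuracy)
```

The loss shrinks as the probability assigned to the correct class approaches 1, while accuracy only checks whether the correct class has the highest probability.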

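Our model is complete.

The Training Phase

Now we will train the model.

We will use the MNIST dataset, a benchmark deep learning dataset containing 70,000 images of handwritten digits from 0 to 9 (60,000 for training and 10,000 for testing).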

#importing the datasets
from keras.datasets import mnist
#loading the datasets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
import matplotlib.pyplot as plt
plt.imshow(X_train[0])
print(X_train[0].shape)

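In the code snippet below, reshaping is done.

But WHY?

The two sets of images, X_train and X_test, are reshaped so that their shape matches the input shape expected by our CNN model: a trailing channel dimension of size 1 is added to each 28 x 28 image.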

X_train = X_train.reshape(60000,28,28,1)
X_test = X_test.reshape(10000,28,28,1)
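
The same reshaping can be written without hard-coding the number of images. The sketch below uses a small array of zeros as a stand-in for the MNIST data, so the array name and size are assumptions made for illustration.

```python
import numpy as np

# Stand-in for a batch of MNIST images: 5 grayscale images of 28 x 28 pixels
images = np.zeros((5, 28, 28), dtype=np.uint8)

# -1 lets NumPy infer the number of images instead of hard-coding it
with_channel = images.reshape(-1, 28, 28, 1)
# np.expand_dims adds the same trailing channel axis
also_with_channel = np.expand_dims(images, axis=-1)

print(with_channel.shape, also_with_channel.shape)  # (5, 28, 28, 1) (5, 28, 28, 1)
```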

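One-hot encoding

In one-hot encoding we create a column for each classification category, with each column containing binary values indicating whether the current image belongs to that category or not.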

from keras.utils import to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
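
To see what the encoded labels look like, here is a NumPy-only sketch of one-hot encoding. It is an illustration of the idea rather than the to_categorical implementation, and the labels are made-up values.

```python
import numpy as np

labels = np.array([5, 0, 4])     # hypothetical digit labels
# Row i of the 10 x 10 identity matrix is the one-hot vector for class i
one_hot = np.eye(10)[labels]
print(one_hot)

# argmax along each row recovers the original labels
recovered = one_hot.argmax(axis=1)
print(recovered)  # [5 0 4]
```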

fit( )

  • This method trains the model for a fixed number of epochs.
  • An epoch is one iteration over the entire x and y data provided.
model.fit(X_train, y_train,validation_data=(X_test,y_test) ,epochs=10, verbose=2)
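
An epoch is processed in batches. As a rough sketch of what the call above implies, the number of weight updates can be estimated as follows, assuming the Keras default batch size of 32 since no batch_size is passed to fit().

```python
import math

num_samples = 60000   # size of X_train
batch_size = 32       # Keras default when batch_size is not passed to fit()
epochs = 10

steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)  # 1875 18750
```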

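Now it's time to test our Model!!

We can test our model against specific inputs with the help of the predict( ) function, as done in the code snippet below.

  • Output Format: For each input image, the output consists of 10 probabilities, each representing the probability of the image being a particular digit from 0-9.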
model.predict(X_test[:4])

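Improving the displaying of the results

We display the predicted digit directly instead of the 10 probabilities (i.e. we just refine the output).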

def return_Digit(x):
    # The index of the highest probability is the predicted digit
    print(int(np.argmax(x)))
prediction=model.predict(X_test[:4])
np.apply_along_axis(return_Digit,axis=1,arr=prediction)
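
As an alternative to calling a function on every row, the predicted digits can be obtained in one vectorized call. The sketch below uses a made-up prediction array with three classes instead of the real model output.

```python
import numpy as np

# Hypothetical predictions for two images over three classes
prediction = np.array([[0.1, 0.7, 0.2],
                       [0.6, 0.3, 0.1]])

# argmax along axis 1 picks the most probable class for every row at once
digits = np.argmax(prediction, axis=1)
print(digits)  # [1 0]
```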

---------------------------------------------------------------------------