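Convolutional Neural Network (CNN): a type of feedforward neural network that is widely used in image processing.
The figure below shows two filters, each of which is convolved with the input data:
Convolution computation: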
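As a minimal sketch of this computation, the snippet below slides two small filters over a single-channel input and sums the elementwise products inside each window. The 4x4 input, the two 2x2 filters, and the function name `conv2d` are made up for illustration and are not taken from the original figure. Like CNN layers in practice, it computes a cross-correlation (the kernel is not flipped) and uses no padding.

```python
def conv2d(image, kernel, stride=1):
    """Valid (no padding) 2D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the window at (i, j) and sum the products.
            total = 0
            for u in range(kh):
                for v in range(kw):
                    total += image[i * stride + u][j * stride + v] * kernel[u][v]
            row.append(total)
        out.append(row)
    return out


# Hypothetical input and filters, chosen only for illustration.
image = [
    [1, 2, 0, 1],
    [0, 1, 3, 1],
    [2, 1, 0, 0],
    [1, 0, 1, 2],
]
filters = [
    [[1, 0], [0, 1]],    # adds the two main-diagonal values of each window
    [[1, -1], [1, -1]],  # left column minus right column of each window
]

# One feature map per filter: two filters give two 3x3 output maps.
feature_maps = [conv2d(image, f) for f in filters]
for fmap in feature_maps:
    print(fmap)
```

Each filter produces its own output map, so the number of filters determines the depth of the layer's output.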
Characteristics of CNN computation:
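ReLU function: converges quickly and has a simple gradient; it is the more common choice.
sigmoid function: saturates easily, which stops the gradient from propagating, and its output is not zero-centered; it is rarely used in CNNs.
Here we take the first convolutional layer of the AlexNet model as an example: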
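The arithmetic for that layer can be sketched as follows. The figures used here are the commonly quoted ones for AlexNet's first convolutional layer and are assumptions as far as this text is concerned: 96 kernels of size 11x11 with stride 4 and no padding, applied to a 3-channel input taken as 227x227 (the paper states 224x224, but 227 is the size for which the arithmetic works out). The helper names are hypothetical.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Side length of the output feature map of a convolutional layer."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv_param_count(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output channel."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels


# Assumed AlexNet conv1 settings: 227x227x3 input, 96 kernels of 11x11, stride 4.
side = conv_output_size(227, kernel_size=11, stride=4)
params = conv_param_count(in_channels=3, out_channels=96, kernel_size=11)

print(f"output: {side}x{side}x96")  # output: 55x55x96
print(f"parameters: {params}")      # parameters: 34944
```

Because every position of the 55x55 output shares the same 96 kernels, the parameter count does not depend on the input resolution.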
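LeNet5: an efficient convolutional neural network model that American banks used to recognize handwritten digits. It covers the basic building blocks of deep learning (convolutional layers, pooling layers, and fully connected layers), so it makes a good example to analyze in depth:
A detailed explanation of every layer of this network is given in 网络解析(一):LeNet-5详解 (Network Analysis, Part 1: LeNet-5 Explained); refer to it for the details. Only the first layer is interpreted here; the others are similar.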
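C1 is a convolutional layer: 5x5 kernels applied to the 1x32x32 input produce six 28x28 feature maps (6@28x28).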
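The sizes of C1 can be worked out by hand. This is a sketch under the settings that appear in the implementations in this section (32x32 single-channel input, six 5x5 kernels, stride 1, no padding), with one bias per kernel assumed:

```latex
\begin{aligned}
\text{output size} &= \frac{32 - 5}{1} + 1 = 28
  \quad\Rightarrow\quad 6 \text{ feature maps of } 28 \times 28 \\
\text{trainable parameters} &= (5 \times 5 + 1) \times 6 = 156 \\
\text{connections} &= 156 \times 28 \times 28 = 122304
\end{aligned}
```

The gap between 156 parameters and 122304 connections comes from weight sharing: each kernel is reused at all 28 x 28 output positions.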
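Keras implementation of the LeNet5 network:
# https://github.com/TaavishThaman/LeNet-5-with-Keras/blob/master/lenet_5.py
# Imports assume the Keras bundled with TensorFlow 2.x.
from tensorflow import keras
from tensorflow.keras import layers

# X_train (shape (N, 32, 32, 1)) and Y_train (one-hot, shape (N, 10))
# are assumed to be prepared beforehand.
model = keras.Sequential()
# C1: six 5x5 filters, 32x32x1 -> 28x28x6
model.add(layers.Conv2D(filters=6, kernel_size=(5, 5), strides=1, activation='relu', input_shape=(32, 32, 1)))
# S2: 2x2 average pooling -> 14x14x6
model.add(layers.AveragePooling2D())
# C3: sixteen 5x5 filters -> 10x10x16
model.add(layers.Conv2D(filters=16, kernel_size=(5, 5), strides=1, activation='relu'))
# S4: 2x2 average pooling -> 5x5x16
model.add(layers.AveragePooling2D())
model.add(layers.Flatten())
model.add(layers.Dense(units=120, activation='relu'))
model.add(layers.Dense(units=84, activation='relu'))
model.add(layers.Dense(units=10, activation='softmax'))
# categorical_crossentropy expects one-hot encoded labels.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, Y_train, steps_per_epoch=10, epochs=42)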
PyTorch implementation of the LeNet5 network:
# https://github.com/activatedgeek/LeNet-5/blob/master/lenet.py
import torch.nn as nn
from collections import OrderedDict


class LeNet5(nn.Module):
    """
    Input - 1x32x32
    C1 - 6@28x28 (5x5 kernel)
    ReLU
    S2 - 6@14x14 (2x2 kernel, stride 2) max pooling
    C3 - 16@10x10 (5x5 kernel; connects to all S2 maps, unlike the paper's sparse table)
    ReLU
    S4 - 16@5x5 (2x2 kernel, stride 2) max pooling
    C5 - 120@1x1 (5x5 kernel)
    ReLU
    F6 - 84
    ReLU
    F7 - 10 (Output, log-softmax)
    """

    def __init__(self):
        super().__init__()
        # Convolutional feature extractor: 1x32x32 -> 120x1x1
        self.convnet = nn.Sequential(OrderedDict([
            ('c1', nn.Conv2d(1, 6, kernel_size=(5, 5))),
            ('relu1', nn.ReLU()),
            ('s2', nn.MaxPool2d(kernel_size=(2, 2), stride=2)),
            ('c3', nn.Conv2d(6, 16, kernel_size=(5, 5))),
            ('relu3', nn.ReLU()),
            ('s4', nn.MaxPool2d(kernel_size=(2, 2), stride=2)),
            ('c5', nn.Conv2d(16, 120, kernel_size=(5, 5))),
            ('relu5', nn.ReLU())
        ]))
        # Classifier: 120 -> 84 -> 10 log-probabilities
        self.fc = nn.Sequential(OrderedDict([
            ('f6', nn.Linear(120, 84)),
            ('relu6', nn.ReLU()),
            ('f7', nn.Linear(84, 10)),
            # Named 'sig7' in the source, but the layer is a log-softmax.
            ('sig7', nn.LogSoftmax(dim=-1))
        ]))

    def forward(self, img):
        output = self.convnet(img)
        # Flatten (N, 120, 1, 1) to (N, 120) before the fully connected layers.
        output = output.view(img.size(0), -1)
        output = self.fc(output)
        return output
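To see why the first fully connected layer takes 120 inputs, the shapes can be traced through the convolutional part of the model. The sketch below is plain Python and does not call PyTorch. The `LAYERS` table is copied by hand from the kernel sizes and strides in the model definition, and the helper names are hypothetical.

```python
def out_size(size, kernel, stride=1):
    """Output side length of a conv or pooling layer without padding."""
    return (size - kernel) // stride + 1


# (name, kernel, stride, out_channels); pooling layers keep the channel count.
LAYERS = [
    ("c1", 5, 1, 6),
    ("s2", 2, 2, None),
    ("c3", 5, 1, 16),
    ("s4", 2, 2, None),
    ("c5", 5, 1, 120),
]


def trace_shapes(size=32, channels=1):
    """Return {layer name: (channels, height, width)} for a square input."""
    shapes = {}
    for name, kernel, stride, out_channels in LAYERS:
        size = out_size(size, kernel, stride)
        if out_channels is not None:
            channels = out_channels
        shapes[name] = (channels, size, size)
    return shapes


shapes = trace_shapes()
for name, shape in shapes.items():
    print(name, shape)

# Number of features per image after the flattening step in forward().
c, h, w = shapes["c5"]
flat_features = c * h * w
print("flattened features:", flat_features)
```

With a 28x28 input the same trace leaves only a 4x4 map after s4, which is smaller than the 5x5 kernel of c5, so 28x28 images such as MNIST digits need to be padded or resized to 32x32 before they are fed to this model.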