Implementing MnasNet in Keras

Let's explore MnasNet, a platform-aware neural architecture search approach for mobile devices developed by the Google Brain team. Here we will review the paper's main contributions and the method it applies, and finally put together a quick Keras implementation of the resulting model. Before that, let's look at the motivation behind this model.

Motivation

Designing, training, and evaluating convolutional neural networks on large datasets is a hard task: it is time-consuming and requires extensive domain knowledge. To tackle the problem of designing convolutional neural network (CNN) models automatically, the Google Brain team built NASNet (Neural Architecture Search Network), which searches over a space of possible convolutions, pooling operations, and blocks with variable strides, kernel sizes, and so on. However, that approach did not produce models efficient enough to run on mobile platforms, and the search did not account for them. Hence, MnasNet was developed.

Main contributions

The authors incorporate latency information when evaluating models, penalizing large models with expensive operations. This leads to a good trade-off between accuracy and latency.

On the ImageNet classification task, the MnasNet model achieves 74.0% top-1 accuracy with 76 ms latency on a Pixel phone.

On the COCO object detection task, MnasNet achieves both higher mAP quality and lower latency than MobileNets.

Factorized hierarchical search space

The authors use a factorized hierarchical search space. This means they pre-package layers into blocks and then search over those blocks with variable hyperparameters (operation type, kernel size, skip connection, output filter size, number of layers). A visualization of this can be found in the original paper.
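To make this concrete, the per-block choices can be sketched roughly as a Python dictionary. This is only an illustrative sketch based on the paper's description, not the paper's exact search-space definition:

# Rough, illustrative sketch of a per-block search space (not the exact
# definition from the paper); each block picks its own combination of choices.
block_search_space = {
    'conv_op': ['conv', 'depthwise_separable_conv', 'mobile_inverted_bottleneck'],
    'kernel_size': [3, 5],
    'skip_op': ['none', 'identity_residual', 'pooling'],
    'output_filter_ratio': [0.75, 1.0, 1.25],  # relative to a reference architecture
    'num_layers_delta': [-1, 0, 1],            # relative to a reference architecture
}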

Model architecture

Let's jump straight to the model architecture found with their method. The architecture is as follows:

Figure 1: MnasNet architecture. (a) is the main model; (b)-(f) are the corresponding blocks.

Every block except one has the same structure, which is as follows:

Conv2D(1x1) -> BatchNormalization -> ReLU6 -> DepthwiseConv2D -> BatchNormalization -> ReLU6 -> Conv2D(1x1) -> BatchNormalization (the final 1x1 projection is linear, with no activation, as in the code below)

Depending on the block, there may or may not be a skip connection from the block's input to its output. The SepConv block consists only of DepthwiseConv2D, Conv2D(1x1), BatchNormalization, and finally a ReLU6 activation.

We first define the initial Conv3x3 block. The Python code, together with the Keras imports used throughout (the tensorflow.keras API is assumed), is as follows:

# Imports used throughout (tensorflow.keras API assumed).
from tensorflow.keras import layers, models, datasets, utils

def _conv_block(inputs, strides, filters, kernel=3):
    """Adds the initial convolution layer (with batch normalization and ReLU6)."""
    x = layers.Conv2D(filters, kernel, padding='same', use_bias=False,
                      strides=strides, name='Conv1')(inputs)
    x = layers.BatchNormalization(epsilon=1e-3, momentum=0.999, name='Conv1_bn')(x)
    print(x.name, inputs.shape, x.shape)
    return layers.ReLU(6., name='Conv1_relu')(x)
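As a quick sanity check (a hypothetical example, not from the article), applying this block with stride 2 to a 224x224 RGB input halves the spatial resolution:

# Hypothetical shape check for the initial convolution block.
inp = layers.Input(shape=(224, 224, 3))
out = _conv_block(inp, strides=2, filters=32)
# out has shape (None, 112, 112, 32): stride 2 halves the height and width.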

Then, we can define the SepConv3x3 block as:

def _sep_conv_block(inputs, filters, alpha, pointwise_conv_filters,
                    depth_multiplier=1, strides=(1, 1)):
    """Adds a separable convolution block.

    A separable convolution block consists of a depthwise convolution
    followed by a pointwise (1x1) convolution.
    """
    pointwise_conv_filters = int(pointwise_conv_filters * alpha)
    x = layers.DepthwiseConv2D((3, 3),
                               padding='same',
                               depth_multiplier=depth_multiplier,
                               strides=strides,
                               use_bias=False,
                               name='Dw_conv_sep')(inputs)
    # The stride is applied only in the depthwise convolution above;
    # the pointwise projection always uses stride 1.
    x = layers.Conv2D(pointwise_conv_filters, (1, 1), padding='valid',
                      use_bias=False, strides=(1, 1), name='Conv_sep')(x)
    x = layers.BatchNormalization(epsilon=1e-3, momentum=0.999, name='Conv_sep_bn')(x)
    print(x.name, inputs.shape, x.shape)
    return layers.ReLU(6., name='Conv_sep_relu')(x)
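The inverted residual block below, as well as the MnasNet builder later on, calls a _make_divisible helper that is never defined in the article. A standard implementation, following the helper used in the Keras MobileNetV2 reference code, rounds a channel count to the nearest multiple of a divisor without shrinking it by more than 10%:

def _make_divisible(v, divisor, min_value=None):
    """Rounds the channel count v to the nearest multiple of divisor."""
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # Make sure that rounding down does not reduce the value by more than 10%.
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v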

Finally, we define the depthwise convolution block, also known as the inverted residual block (from the MobileNet literature). The Python code is as follows:

def _inverted_res_block(inputs, kernel, expansion, alpha, filters, block_id, stride=1):
    in_channels = inputs.shape[-1]  # was inputs._keras_shape[-1] in older Keras
    pointwise_conv_filters = int(filters * alpha)
    pointwise_filters = _make_divisible(pointwise_conv_filters, 8)
    x = inputs
    prefix = 'block_{}_'.format(block_id)

    if block_id:
        # Expansion: a 1x1 convolution that widens the channel dimension.
        x = layers.Conv2D(expansion * in_channels,
                          kernel_size=1,
                          padding='same',
                          use_bias=False,
                          activation=None,
                          name=prefix + 'expand')(x)
        x = layers.BatchNormalization(epsilon=1e-3,
                                      momentum=0.999,
                                      name=prefix + 'expand_bn')(x)
        x = layers.ReLU(6., name=prefix + 'expand_relu')(x)
    else:
        prefix = 'expanded_conv_'

    # Depthwise convolution.
    x = layers.DepthwiseConv2D(kernel_size=kernel,
                               strides=stride,
                               activation=None,
                               use_bias=False,
                               padding='same',
                               name=prefix + 'depthwise')(x)
    x = layers.BatchNormalization(epsilon=1e-3,
                                  momentum=0.999,
                                  name=prefix + 'depthwise_bn')(x)
    x = layers.ReLU(6., name=prefix + 'depthwise_relu')(x)

    # Projection: a linear 1x1 convolution back down to pointwise_filters channels.
    x = layers.Conv2D(pointwise_filters,
                      kernel_size=1,
                      padding='same',
                      use_bias=False,
                      activation=None,
                      name=prefix + 'project')(x)
    x = layers.BatchNormalization(
        epsilon=1e-3, momentum=0.999, name=prefix + 'project_bn')(x)
    print(x.name, inputs.shape, x.shape)

    # Residual connection when the block keeps the spatial size and channel count.
    if in_channels == pointwise_filters and stride == 1:
        print("Adding %s" % x.name)
        return layers.Add(name=prefix + 'add')([inputs, x])
    return x
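To see when the skip connection is actually used (a hypothetical check, not part of the article's code), note that the Add layer only appears when the stride is 1 and the input channel count equals the projected channel count:

# Hypothetical check of the residual condition.
inp = layers.Input(shape=(56, 56, 24))
# 24 -> 24 channels, stride 1: the block ends with an Add (skip connection).
with_skip = _inverted_res_block(inp, kernel=3, expansion=3, stride=1,
                                alpha=1.0, filters=24, block_id=2)
# 24 -> 40 channels, stride 2: no skip connection is added.
without_skip = _inverted_res_block(inp, kernel=5, expansion=3, stride=2,
                                   alpha=1.0, filters=40, block_id=4)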

Now we can build the architecture from the paper's specification by setting each block's kernel size, stride, number of filters, whether it uses a skip connection, and so on:

def MnasNet(input_shape=None, alpha=1.0, depth_multiplier=1, pooling=None, nb_classes=10):
    img_input = layers.Input(shape=input_shape)

    first_block_filters = _make_divisible(32 * alpha, 8)
    # Conv3x3, stride 2
    x = _conv_block(img_input, strides=2, filters=first_block_filters)
    # SepConv3x3
    x = _sep_conv_block(x, filters=16, alpha=alpha,
                        pointwise_conv_filters=16, depth_multiplier=depth_multiplier)
    # MBConv3, 3x3 kernels
    x = _inverted_res_block(x, kernel=3, expansion=3, stride=2, alpha=alpha, filters=24, block_id=1)
    x = _inverted_res_block(x, kernel=3, expansion=3, stride=1, alpha=alpha, filters=24, block_id=2)
    x = _inverted_res_block(x, kernel=3, expansion=3, stride=1, alpha=alpha, filters=24, block_id=3)
    # MBConv3, 5x5 kernels
    x = _inverted_res_block(x, kernel=5, expansion=3, stride=2, alpha=alpha, filters=40, block_id=4)
    x = _inverted_res_block(x, kernel=5, expansion=3, stride=1, alpha=alpha, filters=40, block_id=5)
    x = _inverted_res_block(x, kernel=5, expansion=3, stride=1, alpha=alpha, filters=40, block_id=6)
    # MBConv6, 5x5 kernels
    x = _inverted_res_block(x, kernel=5, expansion=6, stride=2, alpha=alpha, filters=80, block_id=7)
    x = _inverted_res_block(x, kernel=5, expansion=6, stride=1, alpha=alpha, filters=80, block_id=8)
    x = _inverted_res_block(x, kernel=5, expansion=6, stride=1, alpha=alpha, filters=80, block_id=9)
    # MBConv6, 3x3 kernels
    x = _inverted_res_block(x, kernel=3, expansion=6, stride=1, alpha=alpha, filters=96, block_id=10)
    x = _inverted_res_block(x, kernel=3, expansion=6, stride=1, alpha=alpha, filters=96, block_id=11)
    # MBConv6, 5x5 kernels
    x = _inverted_res_block(x, kernel=5, expansion=6, stride=2, alpha=alpha, filters=192, block_id=12)
    x = _inverted_res_block(x, kernel=5, expansion=6, stride=1, alpha=alpha, filters=192, block_id=13)
    x = _inverted_res_block(x, kernel=5, expansion=6, stride=1, alpha=alpha, filters=192, block_id=14)
    x = _inverted_res_block(x, kernel=5, expansion=6, stride=1, alpha=alpha, filters=192, block_id=15)
    # MBConv6, 3x3 kernels
    x = _inverted_res_block(x, kernel=3, expansion=6, stride=1, alpha=alpha, filters=320, block_id=16)

    # Classification head: global pooling followed by a softmax layer.
    if pooling == 'avg':
        x = layers.GlobalAveragePooling2D()(x)
    else:
        x = layers.GlobalMaxPooling2D()(x)
    x = layers.Dense(nb_classes, activation='softmax', use_bias=True, name='proba')(x)

    model = models.Model(img_input, x, name='mnasnet')
    return model
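With all the pieces in place, we can instantiate an ImageNet-sized model and inspect it; this is a small, hypothetical usage sketch:

# Hypothetical usage: build a 224x224 MnasNet and print its layer and parameter summary.
model = MnasNet(input_shape=(224, 224, 3), pooling='avg', nb_classes=1000)
model.summary()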

Training and validation

Instead of the full ImageNet dataset, we can try our model on the CIFAR-10 dataset:

input_shape = (32, 32)
batch_size = 2048
nb_classes = 10
epochs = 100

model = MnasNet(input_shape=input_shape + (3,), pooling='avg', nb_classes=nb_classes)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Load CIFAR-10 and normalize pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()
y_train = utils.to_categorical(y_train, nb_classes)
y_test = utils.to_categorical(y_test, nb_classes)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test),
          shuffle=True)
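After training, we can re-evaluate the held-out test set and save the trained weights; a minimal sketch (the file name is arbitrary):

# Evaluate on the test split and persist the trained weights.
test_loss, test_acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print('Test accuracy: %.4f' % test_acc)
model.save('mnasnet_cifar10.h5')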

Conclusion

In summary, this approach offers a straightforward way to design convolutional neural networks (CNNs) without much domain knowledge. Incorporating real latency measurements from a Pixel phone also helps optimize the architecture for both latency and accuracy.
