Generating Faces with Keras: A Python Implementation

Training process
Datasets that can be used:
- https://www.kaggle.com/gasgallo/faces-data-new
- https://www.kaggle.com/gasgallo/lag-dataset
Both contain images of human faces. I combined the two into a single folder.
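Combining the two datasets can be done by hand, but a short script does the same job. The following is a minimal sketch, not necessarily how the folders were merged for this article: the function name merge_folders and the folder names in the usage note are hypothetical, and each copied file is prefixed with the name of its source folder so that identically named files do not overwrite each other.

```python
import shutil
from pathlib import Path


def merge_folders(source_folders, target_folder):
    """Copy every regular file from each source folder into target_folder.

    Hypothetical helper: each copy is prefixed with the name of its source
    folder so that files with the same name in both datasets do not collide.
    Returns the number of files copied.
    """
    target = Path(target_folder)
    target.mkdir(parents=True, exist_ok=True)
    copied = 0
    for folder in source_folders:
        folder = Path(folder)
        for path in sorted(folder.iterdir()):
            if path.is_file():
                shutil.copy2(path, target / f"{folder.name}_{path.name}")
                copied += 1
    return copied
```

For example, merge_folders(["faces-data-new", "lag-dataset"], "data") would fill a folder named data, assuming the two datasets were extracted into folders with those names.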
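Choosing the right network for the task
The two image generation techniques you hear about most often are generative adversarial networks (GANs) and LSTM networks.
LSTMs are very slow to train, while GANs train much faster. In my experience, blurry faces start to appear after less than half an hour of training, and the images become more realistic as training continues.
There are many GAN variants. The one I used is called a deep convolutional generative adversarial network (DCGAN). The advantage of a DCGAN is that it uses convolutional layers, and convolutional neural networks are currently among the best-performing approaches to image classification.
Introduction
Generative adversarial networks were introduced in 2014 by the researcher Ian Goodfellow and his colleagues.
GANs are very powerful. With the right data, network architecture, and hyperparameters, you can generate very realistic images.
In the future, more advanced versions of GANs or other content generation algorithms may let us do some cool things:
- Generate realistic video games.
- Generate movies.
- Generate 3D designs for new technology (better cars, spaceships, and so on).
But how does a GAN work?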

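A GAN is actually not one neural network but two. One of them is the generator. It takes random values as input and produces an image.
The second is the discriminator. It tries to determine whether an image is fake or real.
Training a GAN is like a competition. The generator tries to become as good as possible at fooling the discriminator. The discriminator tries to separate fake images from real ones as well as it can.
This forces both of them to improve. Ideally, at some point this leads to the following situation:
- The images produced by the generator are indistinguishable from real images to a human.
- The discriminator's accuracy drops to 50%. In other words, the discriminator cannot separate real from fake and has to guess every time.
In practice, you need to get everything right (data, architecture, hyperparameters). GANs are very sensitive to small changes in hyperparameter values.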
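To make the 50% figure concrete: both networks in this article are trained with binary cross-entropy. If the discriminator can no longer tell real images from generated ones and outputs 0.5 for everything, its loss on either label is ln 2, roughly 0.693. The sketch below checks this in plain Python; the function name is illustrative and is not a Keras API.

```python
import math


def binary_crossentropy(label, prediction, eps=1e-7):
    """Binary cross-entropy for one example; prediction is clipped away from 0 and 1."""
    p = min(max(prediction, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# A discriminator that cannot separate the classes outputs 0.5 for every image.
loss_on_real = binary_crossentropy(1, 0.5)
loss_on_generated = binary_crossentropy(0, 0.5)

# A confident and correct discriminator has a loss close to 0 instead.
loss_confident = binary_crossentropy(1, 0.99)

print(loss_on_real, loss_on_generated, loss_confident)
```

A discriminator loss that hovers around 0.69 is therefore consistent with the guessing regime described above, although on its own it does not prove that the generated images are good.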
Neural network architecture

Python implementation
Importing libraries
The first step is to import all the required Python libraries.
# Import everything that is needed from the Keras library.
from keras.layers import Input, Reshape, Dropout, Dense, Flatten, BatchNormalization, Activation, ZeroPadding2D
# LeakyReLU, UpSampling2D and Conv2D are imported from keras.layers directly,
# which works in both older and newer Keras releases.
from keras.layers import LeakyReLU, UpSampling2D, Conv2D
from keras.models import Sequential, Model, load_model
from keras.optimizers import Adam
# matplotlib will help with displaying the results.
import matplotlib.pyplot as plt
# numpy for some mathematical operations.
import numpy as np
# PIL for opening, resizing and saving images.
from PIL import Image
# tqdm for a progress bar when loading the dataset.
from tqdm import tqdm
# The os library is needed for extracting filenames from the dataset folder.
import os

The FaceGenerator class
class FaceGenerator:
    # RGB images: 3 channels, grayscale: 1 channel, RGBA images: 4 channels.
    def __init__(self, image_width, image_height, channels):
        self.image_width = image_width
        self.image_height = image_height
        self.channels = channels
        self.image_shape = (self.image_width, self.image_height, self.channels)
        # Amount of randomly generated numbers for the first layer of the generator.
        self.random_noise_dimension = 100
        # Just 10 times higher learning rate would result in generator loss being stuck at 0.
        optimizer = Adam(0.0002, 0.5)
        self.discriminator = self.build_discriminator()
        self.discriminator.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["accuracy"])
        self.generator = self.build_generator()
        # A placeholder for the generator input.
        random_input = Input(shape=(self.random_noise_dimension,))
        # Generator generates images from random noise.
        generated_image = self.generator(random_input)
        # For the combined model we will only train the generator.
        self.discriminator.trainable = False
        # Discriminator attempts to determine if image is real or generated.
        validity = self.discriminator(generated_image)
        # Combined model = generator and discriminator combined.
        # 1. Takes random noise as an input.
        # 2. Generates an image.
        # 3. Attempts to determine if image is real or generated.
        self.combined = Model(random_input, validity)
        self.combined.compile(loss="binary_crossentropy", optimizer=optimizer)

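This code initializes several important variables needed for training.
- image_width, image_height = size of the generated images, in pixels
- channels = number of color channels in the generated images
- random_noise_dimension = number of random values the generator takes as input
- optimizer = the optimizer used for backpropagation
- discriminator = a convolutional neural network that tries to determine whether an image is real or fake
- generator = a convolutional neural network that generates images
- random_input = a placeholder for the random values; we use it to feed random values to the generator
- generated_image = the output of the generator
- validity = the discriminator's output for the generated image, that is, how well the generator fooled the discriminator
- combined = the generator and discriminator combined into a single model. The generator is not trained on its own; it is trained through the combined model, which is necessary so that the loss can be backpropagated from the discriminator's output to the generator.
Loading the training data into the model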
    def get_training_data(self, datafolder):
        print("Loading training data...")
        training_data = []
        # Find all files in datafolder.
        filenames = os.listdir(datafolder)
        for filename in tqdm(filenames):
            # Combine folder name and file name.
            path = os.path.join(datafolder, filename)
            # Open an image as an Image object.
            image = Image.open(path)
            # Resize to the desired size. Image.LANCZOS is the same filter that
            # used to be called Image.ANTIALIAS, which was removed in Pillow 10.
            image = image.resize((self.image_width, self.image_height), Image.LANCZOS)
            # Create an array of pixel values from the image.
            pixel_array = np.asarray(image)
            training_data.append(pixel_array)
        # training_data is converted to a numpy array.
        training_data = np.reshape(training_data, (-1, self.image_width, self.image_height, self.channels))
        return training_data
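This function takes the name of a folder as input and returns all images in that folder as a numpy array. All images are resized to the size specified in the __init__ function.
Shape = (number of images, width, height, channels).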
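Note that os.listdir returns every entry in the folder, so a stray non-image file (for example a CSV file or a hidden system file) would make Image.open fail. One possible guard is sketched below, under the assumption that images can be recognized by their file extension; the helper name and the extension list are illustrative and not part of the original code.

```python
import os

# Assumed set of extensions; adjust it to the files in your dataset.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def list_image_files(datafolder):
    """Return sorted paths of files in datafolder whose extension looks like an image."""
    paths = []
    for filename in sorted(os.listdir(datafolder)):
        path = os.path.join(datafolder, filename)
        if os.path.isfile(path) and filename.lower().endswith(IMAGE_EXTENSIONS):
            paths.append(path)
    return paths
```

The loading loop could then iterate over list_image_files(datafolder) instead of the raw os.listdir result.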
The neural networks
    def build_generator(self):
        # Generator attempts to fool discriminator by generating new images.
        model = Sequential()
        model.add(Dense(256*4*4, activation="relu", input_dim=self.random_noise_dimension))
        model.add(Reshape((4, 4, 256)))
        # Four blocks of upsampling, convolution, batch normalization and activation.
        # 1. Upsampling: Input data is repeated. Default is (2,2). In that case a 4x4x256 array becomes an 8x8x256 array.
        # 2. Convolution
        # 3. Normalization normalizes outputs from convolution.
        # 4. Relu activation: f(x) = max(0,x). If x < 0, then f(x) = 0.
        model.add(UpSampling2D())
        model.add(Conv2D(256, kernel_size=3, padding="same"))
        model.add(BatchNormalization(momentum=0.8))
        model.add(Activation("relu"))
        model.add(UpSampling2D())
        model.add(Conv2D(256, kernel_size=3, padding="same"))
        model.add(BatchNormalization(momentum=0.8))
        model.add(Activation("relu"))
        model.add(UpSampling2D())
        model.add(Conv2D(128, kernel_size=3, padding="same"))
        model.add(BatchNormalization(momentum=0.8))
        model.add(Activation("relu"))
        model.add(UpSampling2D())
        model.add(Conv2D(128, kernel_size=3, padding="same"))
        model.add(BatchNormalization(momentum=0.8))
        model.add(Activation("relu"))
        # Last convolutional layer outputs as many feature maps as channels in the final image.
        model.add(Conv2D(self.channels, kernel_size=3, padding="same"))
        # tanh maps everything to a range between -1 and 1.
        model.add(Activation("tanh"))
        # Show the summary of the model architecture.
        model.summary()
        # Placeholder for the random noise input (named noise_input to avoid shadowing the built-in input).
        noise_input = Input(shape=(self.random_noise_dimension,))
        # Model output.
        generated_image = model(noise_input)
        # Change the model type from Sequential to Model (functional API). More at: https://keras.io/models/model/.
        return Model(noise_input, generated_image)
    def build_discriminator(self):
        # Discriminator attempts to classify real and generated images.
        model = Sequential()
        model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=self.image_shape, padding="same"))
        # Leaky relu is similar to usual relu. If x < 0 then f(x) = x * alpha, otherwise f(x) = x.
        model.add(LeakyReLU(alpha=0.2))
        # Dropout randomly sets some of its inputs to zero during training. This helps the model to generalize better.
        # 0.25 means that every input has a 25% chance of being dropped.
        model.add(Dropout(0.25))
        model.add(Conv2D(64, kernel_size=3, strides=2, padding="same"))
        # Zero padding adds additional rows and columns to the image. Those rows and columns are made of zeros.
        model.add(ZeroPadding2D(padding=((0, 1), (0, 1))))
        model.add(BatchNormalization(momentum=0.8))
        model.add(LeakyReLU(alpha=0.2))
        model.add(Dropout(0.25))
        model.add(Conv2D(128, kernel_size=3, strides=2, padding="same"))
        model.add(BatchNormalization(momentum=0.8))
        model.add(LeakyReLU(alpha=0.2))
        model.add(Dropout(0.25))
        model.add(Conv2D(256, kernel_size=3, strides=1, padding="same"))
        model.add(BatchNormalization(momentum=0.8))
        model.add(LeakyReLU(alpha=0.2))
        model.add(Dropout(0.25))
        model.add(Conv2D(512, kernel_size=3, strides=1, padding="same"))
        model.add(BatchNormalization(momentum=0.8))
        model.add(LeakyReLU(alpha=0.2))
        model.add(Dropout(0.25))
        # Flatten layer flattens the output of the previous layer to a single dimension.
        model.add(Flatten())
        # Outputs a value between 0 and 1 that predicts whether image is real or generated. 0 = generated, 1 = real.
        model.add(Dense(1, activation='sigmoid'))
        model.summary()
        input_image = Input(shape=self.image_shape)
        # Model output given an image.
        validity = model(input_image)
        return Model(input_image, validity)

These two functions define the generator and the discriminator.
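The spatial sizes in both networks can be checked without building the models. The sketch below follows one dimension through the layers defined above, assuming 64x64 images as in the main block, the default UpSampling2D factor of 2, and the rule that a convolution with padding="same" and stride s turns n inputs into ceil(n / s) outputs. The function names are illustrative.

```python
import math


def generator_output_size(start_size=4, upsampling_layers=4):
    """Each UpSampling2D layer with the default factor (2, 2) doubles width and height."""
    return start_size * 2 ** upsampling_layers


def conv_same_output_size(size, stride):
    """Spatial output size of a convolution with padding="same"."""
    return math.ceil(size / stride)


def discriminator_feature_map_size(image_size=64):
    """Follow one spatial dimension through the discriminator layers."""
    size = conv_same_output_size(image_size, 2)  # Conv2D(32, strides=2)
    size = conv_same_output_size(size, 2)        # Conv2D(64, strides=2)
    size = size + 1                              # ZeroPadding2D(((0, 1), (0, 1)))
    size = conv_same_output_size(size, 2)        # Conv2D(128, strides=2)
    size = conv_same_output_size(size, 1)        # Conv2D(256, strides=1)
    size = conv_same_output_size(size, 1)        # Conv2D(512, strides=1)
    return size


print(generator_output_size())            # 64
final = discriminator_feature_map_size()
print(final, final * final * 512)         # 9 41472
```

So the generator grows 4 -> 8 -> 16 -> 32 -> 64, and the discriminator shrinks 64 -> 32 -> 16 -> 17 -> 9 before flattening 9 x 9 x 512 = 41472 values into the final Dense layer. The model.summary() calls in the code should report the same shapes.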
Training the model
    def train(self, datafolder, epochs, batch_size, save_images_interval):
        # Get the real images.
        training_data = self.get_training_data(datafolder)
        # Map all values to a range between -1 and 1.
        training_data = training_data / 127.5 - 1.
        # Two arrays of labels. Labels for real images: [1,1,1 ... 1,1,1], labels for generated images: [0,0,0 ... 0,0,0]
        labels_for_real_images = np.ones((batch_size, 1))
        labels_for_generated_images = np.zeros((batch_size, 1))
        for epoch in range(epochs):
            # Select a random batch of real images.
            indices = np.random.randint(0, training_data.shape[0], batch_size)
            real_images = training_data[indices]
            # Generate random noise for a whole batch.
            random_noise = np.random.normal(0, 1, (batch_size, self.random_noise_dimension))
            # Generate a batch of new images.
            generated_images = self.generator.predict(random_noise)
            # Train the discriminator on real images.
            discriminator_loss_real = self.discriminator.train_on_batch(real_images, labels_for_real_images)
            # Train the discriminator on generated images.
            discriminator_loss_generated = self.discriminator.train_on_batch(generated_images, labels_for_generated_images)
            # Calculate the average discriminator loss.
            discriminator_loss = 0.5 * np.add(discriminator_loss_real, discriminator_loss_generated)
            # Train the generator using the combined model. Generator tries to trick discriminator into mistaking generated images as real.
            generator_loss = self.combined.train_on_batch(random_noise, labels_for_real_images)
            print("%d [Discriminator loss: %f, acc.: %.2f%%] [Generator loss: %f]" % (epoch, discriminator_loss[0], 100*discriminator_loss[1], generator_loss))
            if epoch % save_images_interval == 0:
                self.save_images(epoch)
        # Save the model for later use. The folder is created first if it does not exist yet.
        os.makedirs("saved_models", exist_ok=True)
        self.generator.save("saved_models/facegenerator.h5")
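For each epoch (here, one batch update rather than a full pass over the dataset):
- Randomly select a batch of real images to use in this epoch.
- Create an array of random numbers drawn from a normal distribution with mean 0 and standard deviation 1. This will be the generator's input. Shape = (batch_size, self.random_noise_dimension)
- Generate new images. The number of generated images equals the batch size.
- Train the discriminator to tell real images from generated ones.
- Compute the discriminator's average loss.
- Train the generator using the combined model.
- Print the loss values.
- If the epoch number is a multiple of save_images_interval, generate images and save them.
After training:
- Save the trained model for later use.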
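The pixel scaling at the start of training matters because the generator ends in a tanh activation, whose output lies between -1 and 1, so the real images have to be in the same range. The sketch below shows the mapping and its inverse for a single value in plain Python. The function names are illustrative; the training code applies the same arithmetic to whole numpy arrays.

```python
def to_model_range(pixel_value):
    """Map a pixel value from [0, 255] to [-1, 1], as done before training."""
    return pixel_value / 127.5 - 1.0


def to_pixel_range(model_value):
    """Map a generator output from [-1, 1] back to an integer in [0, 255]."""
    value = round((model_value + 1.0) * 127.5)
    return min(max(value, 0), 255)  # clip in case of small numerical overshoot


print(to_model_range(0), to_model_range(127.5), to_model_range(255))
print(to_pixel_range(-1.0), to_pixel_range(1.0))
```

Black (0) maps to -1, white (255) maps to 1, and every 8-bit value survives the round trip unchanged.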
Displaying the results
    def save_images(self, epoch):
        # Save 25 generated images for demonstration purposes using matplotlib.pyplot.
        rows, columns = 5, 5
        noise = np.random.normal(0, 1, (rows * columns, self.random_noise_dimension))
        generated_images = self.generator.predict(noise)
        # Map values from the range (-1, 1) to the range (0, 1) for plotting.
        generated_images = 0.5 * generated_images + 0.5
        figure, axis = plt.subplots(rows, columns)
        image_count = 0
        for row in range(rows):
            for column in range(columns):
                axis[row, column].imshow(generated_images[image_count, :], cmap='spring')
                axis[row, column].axis('off')
                image_count += 1
        # Create the output folder if it does not exist yet, then save the figure.
        os.makedirs("generated_images", exist_ok=True)
        figure.savefig("generated_images/generated_%d.png" % epoch)
        plt.close(figure)
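This function saves a 5x5 grid of generated images. It is called at regular intervals during training and can also be used to generate new images after training.
Miscellaneous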
    def generate_single_image(self, model_path, image_save_path):
        noise = np.random.normal(0, 1, (1, self.random_noise_dimension))
        model = load_model(model_path)
        generated_image = model.predict(noise)
        # Normalized (-1 to 1) pixel values to the real (0 to 255) pixel values.
        generated_image = (generated_image + 1) * 127.5
        print(generated_image)
        # Drop the batch dimension. From (1,w,h,c) to (w,h,c)
        generated_image = np.reshape(generated_image, self.image_shape)
        # PIL expects 8-bit integers for RGB images, so clip and convert the float array first.
        # This assumes 3 channels, as the original "RGB" mode did.
        generated_image = np.clip(generated_image, 0, 255).astype(np.uint8)
        image = Image.fromarray(generated_image)
        image.save(image_save_path)


if __name__ == '__main__':
    facegenerator = FaceGenerator(64, 64, 3)
    facegenerator.train(datafolder="data", epochs=4000, batch_size=32, save_images_interval=100)
    facegenerator.generate_single_image("saved_models/facegenerator.h5", "test.png")
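Conclusion
Training a GAN is hard, but it feels very rewarding when you succeed.
This Python code can easily be reused for other image datasets. Keep in mind that you may need to edit the network architecture and parameters, depending on the images you are trying to generate.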
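As one example of such an edit: the generator in this article starts from a 4x4 feature map and doubles it in every upsampling block, so its output size is 4 * 2^n for n blocks. The helper below is hypothetical and not part of the original code; it computes how many blocks a given square image size would need.

```python
def upsampling_blocks_needed(image_size, start_size=4):
    """Number of 2x upsampling blocks needed to grow start_size to image_size.

    Raises ValueError if image_size is not start_size times a power of two,
    because repeated doubling cannot reach such a size exactly.
    """
    blocks = 0
    size = start_size
    while size < image_size:
        size *= 2
        blocks += 1
    if size != image_size:
        raise ValueError(f"{image_size} is not {start_size} * 2**n")
    return blocks


print(upsampling_blocks_needed(64), upsampling_blocks_needed(128))
```

For 128x128 images this gives 5, that is, one more block of upsampling, convolution, batch normalization and activation than the 64x64 network uses. The discriminator will likely need adjusting as well.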