TF Learn: A Deep Learning Tool Built on Scikit-learn and TensorFlow

2018-09-06

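Scikit-learn is one of the most widely used Python machine learning frameworks; algorithm engineers at the major internet companies rely on it to some degree whenever they implement a single-machine version of an algorithm. TensorFlow is even better known: practically everyone working in deep learning has heard of it.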

Let's start with an example: an implementation of logistic regression, a classic machine learning algorithm.

[Image: logistic regression implemented with Scikit-learn (listing not preserved)]
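The original listing did not survive as text. As a stand-in, here is a minimal sketch of what such a Scikit-learn example might look like. The bundled digits dataset, the train/test split, and `max_iter=1000` are assumptions chosen for illustration; they are not taken from the original article.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Small bundled digits dataset (8x8 images), so nothing has to be downloaded.
X, y = load_digits(return_X_y=True)

# Scale pixel values from 0..16 down to 0..1 and hold out a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, test_size=0.25, random_state=0)

# The three core lines: create the model, fit it, evaluate it.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

print("Accuracy:", accuracy)
```

Everything outside the three marked lines is data preparation; the estimator itself follows the usual Scikit-learn pattern of construct, `fit`, then `score` or `predict`.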

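As you can see, the example needs only three lines of code to cover the core of logistic regression. How many lines does the same task take in TensorFlow? The following code comes from GitHub: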

'''
A logistic regression learning algorithm example using TensorFlow library.
This example is using the MNIST database of handwritten digits
(http://yann.lecun.com/exdb/mnist/)
Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
'''

from __future__ import print_function

import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# Parameters
learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes

# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax

# Minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs,
                                                          y: batch_ys})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
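To make the `pred` and `cost` lines above more concrete, here is a small framework-free sketch of the same softmax and cross-entropy arithmetic. The three-class scores and labels are made-up illustration values (the MNIST model has ten classes), and the helper names `softmax` and `cross_entropy` are hypothetical, not part of TensorFlow.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the largest score avoids overflow without changing the result.
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot, probs):
    """Compute -sum(y * log(pred)) for one example.

    Terms where y is 0 contribute nothing, so they are skipped.
    """
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)

# Hypothetical scores (x*W + b) and one-hot labels for a batch of two examples.
batch_logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
batch_labels = [[1, 0, 0], [0, 1, 0]]

losses = [cross_entropy(y, softmax(z))
          for z, y in zip(batch_logits, batch_labels)]
cost = sum(losses) / len(losses)  # mean over the batch, like tf.reduce_mean

print("per-example loss:", losses)
print("batch cost:", cost)
```

The loss for each example is small when the model assigns high probability to the correct class, which is what gradient descent drives toward during training.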

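Implementing a relatively simple machine learning algorithm in TensorFlow takes a lot of code. Scikit-learn, on the other hand, lacks TensorFlow's rich deep learning functionality. Is there a way to keep Scikit-learn's ease of use while supporting deep learning the way TensorFlow does? There is: the Scikit Flow open-source project. It was later merged into TensorFlow itself and became the TF Learn module (tf.contrib.learn). The examples below use the similarly named, standalone tflearn package, which offers the same kind of concise, high-level API on top of TensorFlow.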

Here is a TF Learn example that implements linear regression:

""" Linear Regression Example """

from __future__ import absolute_import, division, print_function

import tflearn

# Regression data
X = [3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,7.042,10.791,5.313,7.997,5.654,9.27,3.1]
Y = [1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,2.827,3.465,1.65,2.904,2.42,2.94,1.3]

# Linear Regression graph
input_ = tflearn.input_data(shape=[None])
linear = tflearn.single_unit(input_)
regression = tflearn.regression(linear, optimizer='sgd', loss='mean_square',
                                metric='R2', learning_rate=0.01)
m = tflearn.DNN(regression)
m.fit(X, Y, n_epoch=1000, show_metric=True, snapshot_epoch=False)

print("\nRegression result:")
print("Y = " + str(m.get_weights(linear.W)) +
      "*X + " + str(m.get_weights(linear.b)))

print("\nTest prediction for x = 3.2, 3.3, 3.4:")
print(m.predict([3.2, 3.3, 3.4]))
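As a sanity check on the fitted line, the same data can be solved in closed form with ordinary least squares. The sketch below uses only the standard library; it is not part of the TF Learn example, and the SGD result after 1000 epochs should land near these values without necessarily matching them exactly.

```python
# Same regression data as in the TF Learn example.
X = [3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59, 2.167,
     7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1]
Y = [1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53, 1.221,
     2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Closed-form least squares: slope = cov(x, y) / var(x).
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
var_x = sum((x - mean_x) ** 2 for x in X)
w = cov_xy / var_x
b = mean_y - w * mean_x  # the fitted line passes through the mean point

print("Y = %.4f*X + %.4f" % (w, b))
for x in (3.2, 3.3, 3.4):
    print("x = %.1f -> %.4f" % (x, w * x + b))
```

If the weights printed by `m.get_weights` differ noticeably from these, a longer run or a different learning rate may be needed.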

As you can see, TF Learn inherits Scikit-learn's concise programming style and is very convenient for traditional machine learning methods. Next, a TF Learn example that implements a CNN on the MNIST dataset:

""" Convolutional Neural Network for MNIST dataset classification task.
References:
    Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. "Gradient-based
    learning applied to document recognition." Proceedings of the IEEE,
    86(11):2278-2324, November 1998.
Links:
    [MNIST Dataset] http://yann.lecun.com/exdb/mnist/
"""

from __future__ import division, print_function, absolute_import

import tflearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.estimator import regression

# Data loading and preprocessing
import tflearn.datasets.mnist as mnist
X, Y, testX, testY = mnist.load_data(one_hot=True)
X = X.reshape([-1, 28, 28, 1])
testX = testX.reshape([-1, 28, 28, 1])

# Building convolutional network
network = input_data(shape=[None, 28, 28, 1], name='input')
network = conv_2d(network, 32, 3, activation='relu', regularizer="L2")
network = max_pool_2d(network, 2)
network = local_response_normalization(network)
network = conv_2d(network, 64, 3, activation='relu', regularizer="L2")
network = max_pool_2d(network, 2)
network = local_response_normalization(network)
network = fully_connected(network, 128, activation='tanh')
network = dropout(network, 0.8)
network = fully_connected(network, 256, activation='tanh')
network = dropout(network, 0.8)
network = fully_connected(network, 10, activation='softmax')
network = regression(network, optimizer='adam', learning_rate=0.01,
                     loss='categorical_crossentropy', name='target')

# Training
model = tflearn.DNN(network, tensorboard_verbose=0)
model.fit({'input': X}, {'target': Y}, n_epoch=20,
          validation_set=({'input': testX}, {'target': testY}),
          snapshot_step=100, show_metric=True, run_id='convnet_mnist')

As you can see, deep learning code built on TF Learn is just as concise.
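Because the layer calls hide the tensor shapes, it can help to trace them by hand. The sketch below walks the convolution and pooling layers of the network above. It assumes that `conv_2d` uses stride 1 with 'same' padding and that `max_pool_2d` uses a stride equal to its kernel size with 'same' padding, which I believe are the tflearn defaults; verify them against the version you use. The helper `same_padding_out` is a hypothetical name introduced for this illustration.

```python
import math

def same_padding_out(size, stride):
    """Spatial output size under 'same' padding: ceil(size / stride)."""
    return math.ceil(size / stride)

height = width = 28   # MNIST images are 28x28
channels = 1          # single grayscale channel

# Each entry: (layer name, stride, output channels or None to keep them).
# local_response_normalization is omitted because it keeps the shape.
layers = [
    ("conv_2d(32, 3)", 1, 32),
    ("max_pool_2d(2)", 2, None),
    ("conv_2d(64, 3)", 1, 64),
    ("max_pool_2d(2)", 2, None),
]

for name, stride, out_channels in layers:
    height = same_padding_out(height, stride)
    width = same_padding_out(width, stride)
    if out_channels is not None:
        channels = out_channels
    print(name, "->", (height, width, channels))

# fully_connected flattens its input before the dense layer.
flattened = height * width * channels
print("inputs to the first fully_connected layer:", flattened)
```

Under these assumptions the feature map shrinks from 28x28 to 7x7 while the channel count grows to 64, so the first fully connected layer sees 3136 inputs.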

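TF Learn is a high-level, Scikit-learn-style wrapper around TensorFlow, offering another option besides raw TensorFlow and Scikit-learn. For users who are comfortable with Scikit-learn and tired of TensorFlow's verbose code, it is a real boon, and it is well worth careful study by machine learning and data mining practitioners.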
