Stepping into Machine Learning: A TensorFlow.js Quick Start

HappinessSourceL

2019-06-28

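Preface

Over the past two years, terms like artificial intelligence and machine learning have been everywhere. So how do artificial intelligence, machine learning, and deep learning relate to one another?

Picture three concentric circles: artificial intelligence is the largest, machine learning is the middle one, and deep learning is the smallest. More precisely:

Machine learning is one way to achieve artificial intelligence.
Deep learning is one technique for doing machine learning.

TensorFlow.js, the subject of this article, is a machine learning framework released by Google's AI team and built on DeepLearn.js (which is no longer updated). Its distinguishing feature is that it is written for JavaScript: you can use its APIs to build and train models directly in the browser, and it also supports Node.js. For front-end developers, that makes it the most convenient way into machine learning.

There is a small game demo built with TensorFlow.js that you can try to get a feel for what it can do.

This article is based on the official English TensorFlow.js documentation and focuses on getting started with TensorFlow.js. For more background on machine learning itself, see Google's machine learning course.

Let's get started!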

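Installation

Direct include

The first option is to load the library with a <script></script> tag. Open the page below in a browser and the result appears in the console.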

<html>
  <head>
    <!-- Load TensorFlow.js -->
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"> </script>

    <!-- Place your code in the script tag below. You can also use an external .js file -->
    <script>
      // Notice there is no 'import' statement. 'tf' is available on the index-page
      // because of the script tag above.

      // Define a model for linear regression.
      const model = tf.sequential();
      model.add(tf.layers.dense({units: 1, inputShape: [1]}));

      // Prepare the model for training: Specify the loss and the optimizer.
      model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});

      // Generate some synthetic data for training.
      const xs = tf.tensor2d([1, 2, 3, 4], [4, 1]);
      const ys = tf.tensor2d([1, 3, 5, 7], [4, 1]);

      // Train the model using the data.
      model.fit(xs, ys, {epochs: 10}).then(() => {
        // Use the model to do inference on a data point the model hasn't seen before:
        // Open the browser devtools to see the output
        model.predict(tf.tensor2d([5], [1, 1])).print();
      });
    </script>
  </head>

  <body>
  </body>
</html>

npm or yarn

The second option is to add the TensorFlow.js library to your project with npm or yarn.

yarn add @tensorflow/tfjs  
npm install @tensorflow/tfjs

You can then add the following code to your main.js:

import * as tf from '@tensorflow/tfjs';

// Define a model for linear regression.
const model = tf.sequential();
model.add(tf.layers.dense({units: 1, inputShape: [1]}));

// Prepare the model for training: Specify the loss and the optimizer.
model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});

// Generate some synthetic data for training.
const xs = tf.tensor2d([1, 2, 3, 4], [4, 1]);
const ys = tf.tensor2d([1, 3, 5, 7], [4, 1]);

// Train the model using the data.
model.fit(xs, ys, {epochs: 10}).then(() => {
  // Use the model to do inference on a data point the model hasn't seen before:
  model.predict(tf.tensor2d([5], [1, 1])).print();
});

If the code above does not make sense yet, don't worry. The basic concepts and usage are covered in the sections that follow.
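For intuition, here is a rough plain-JavaScript sketch of what training the linear regression example above amounts to. It is not how TensorFlow.js is implemented internally: the zero starting weights, the learning rate, and the number of epochs are assumptions chosen for this sketch, and it uses simple full-batch gradient descent on the mean squared error.

```javascript
// Training data from the example above: the points lie on y = 2x - 1.
const xs = [1, 2, 3, 4];
const ys = [1, 3, 5, 7];

// A dense layer with one unit and one input computes y = w * x + b.
let w = 0; // assumed starting value for this sketch
let b = 0; // assumed starting value for this sketch

const learningRate = 0.05; // assumed value for this sketch
const epochs = 2000;       // assumed value, far more than the 10 epochs used above

for (let epoch = 0; epoch < epochs; epoch++) {
  // Gradient of the mean squared error with respect to w and b.
  let gradW = 0;
  let gradB = 0;
  for (let i = 0; i < xs.length; i++) {
    const error = w * xs[i] + b - ys[i];
    gradW += (2 * error * xs[i]) / xs.length;
    gradB += (2 * error) / xs.length;
  }
  // Step against the gradient to reduce the loss.
  w -= learningRate * gradW;
  b -= learningRate * gradB;
}

const predict = (x) => w * x + b;
console.log(predict(5)); // close to 9
```

With only 10 epochs, as in the TensorFlow.js example, the prediction is usually still some way from 9, which is why the sketch runs many more iterations.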

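Tensor and Variable

Tensor and Variable are the two most basic data structures in TensorFlow.js. What do they actually mean?

"Tensor" is a term borrowed from mathematics and physics. We will not dig into its formal meaning here. Just remember that a Tensor is immutable, similar to a const: once it is created, its values cannot be changed.

Variable is easier to grasp. As the name suggests, its value can be changed.

In short: a Tensor is immutable, a Variable is mutable.

Tensor

A tensor is an array with zero or more dimensions. When you construct one you can pass a shape, which gives the size of each dimension, for example how many rows and columns a two-dimensional array has.
Consider the example below. The shape says the tensor has two rows and three columns, and the output is indeed a two-dimensional array with two rows and three columns.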

// 2x3 Tensor
const shape = [2, 3]; // 2 rows, 3 columns
const a = tf.tensor([1.0, 2.0, 3.0, 10.0, 20.0, 30.0], shape);
a.print(); // print Tensor values
// Output: [[1 , 2 , 3 ],
//          [10, 20, 30]]

You can also pass a nested array directly, in which case the two-row, three-column shape is inferred.

// The shape can also be inferred:
const b = tf.tensor([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]);
b.print();
// Output: [[1 , 2 , 3 ],
//          [10, 20, 30]]
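To make the role of shape concrete, the following plain-JavaScript sketch mimics both examples above without using TensorFlow.js. The helper names reshape2d and inferShape are hypothetical and exist only in this sketch; they assume row-major order and a regular (non-ragged) nested array.

```javascript
// Split a flat array into rows according to a [rows, cols] shape (row-major order).
function reshape2d(flat, [rows, cols]) {
  if (flat.length !== rows * cols) {
    throw new Error(`Cannot reshape ${flat.length} values into [${rows}, ${cols}]`);
  }
  const out = [];
  for (let r = 0; r < rows; r++) {
    out.push(flat.slice(r * cols, (r + 1) * cols));
  }
  return out;
}

// Infer the shape of a regular nested array by following the first element at each level.
function inferShape(nested) {
  const shape = [];
  let level = nested;
  while (Array.isArray(level)) {
    shape.push(level.length);
    level = level[0];
  }
  return shape;
}

const a = reshape2d([1, 2, 3, 10, 20, 30], [2, 3]);
console.log(JSON.stringify(a));             // [[1,2,3],[10,20,30]]
console.log(JSON.stringify(inferShape(a))); // [2,3]
console.log(JSON.stringify(inferShape(7))); // [] (a plain number has no dimensions)
```

The empty shape for a plain number corresponds to the zero-dimensional case, which is what tf.scalar produces.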

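In practice, however, we usually construct tensors with tf.scalar, tf.tensor1d, tf.tensor2d, tf.tensor3d, and tf.tensor4d. tf.scalar creates a zero-dimensional tensor, that is, a single number; tf.tensor1d creates a one-dimensional array; tf.tensor2d creates a two-dimensional array; and so on. For example: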

const c = tf.tensor2d([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]);
c.print();
// Output: [[1 , 2 , 3 ],
//          [10, 20, 30]]

You can also use tf.zeros to create a tensor filled with 0s, or tf.ones to create a tensor filled with 1s. For example:

// 3x5 Tensor with all values set to 0
const zeros = tf.zeros([3, 5]);
// Output: [[0, 0, 0, 0, 0],
//          [0, 0, 0, 0, 0],
//          [0, 0, 0, 0, 0]]

Variable

A Variable is initialized from a Tensor. You can give it a new value with assign. For example:

const initialValues = tf.zeros([5]);
const biases = tf.variable(initialValues); // initialize biases
biases.print(); // output: [0, 0, 0, 0, 0]

const updatedValues = tf.tensor1d([0, 1, 0, 1, 0]);
biases.assign(updatedValues); // update values of biases
biases.print(); // output: [0, 1, 0, 1, 0]
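The relationship between the two can be pictured with a small plain-JavaScript analogy. This is only a mental model, not TensorFlow.js code: makeTensor and the Variable class below are hypothetical stand-ins, and the same-length check is an assumption made for this sketch.

```javascript
// Hypothetical stand-in for an immutable tensor: a frozen copy of the values.
function makeTensor(values) {
  return Object.freeze([...values]);
}

// Hypothetical stand-in for a variable: a mutable holder that points at an immutable tensor.
class Variable {
  constructor(initialTensor) {
    this.value = initialTensor;
  }

  // Point the variable at a new tensor; the old tensor itself is never modified.
  assign(newTensor) {
    if (newTensor.length !== this.value.length) {
      throw new Error('assign expects a tensor of the same shape');
    }
    this.value = newTensor;
  }
}

const zeros = makeTensor([0, 0, 0, 0, 0]);
const biases = new Variable(zeros);
biases.assign(makeTensor([0, 1, 0, 1, 0]));

console.log(JSON.stringify(biases.value)); // [0,1,0,1,0]
console.log(JSON.stringify(zeros));        // [0,0,0,0,0] (the original tensor is unchanged)
```

The point of the analogy is that updating a variable swaps which values it holds, while every tensor stays exactly as it was created.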

Operations

TensorFlow.js provides a wide range of APIs for computing with tensors, which we call Operations. Below are examples of squaring a tensor and adding two tensors:

const d = tf.tensor2d([[1.0, 2.0], [3.0, 4.0]]);
const d_squared = d.square();
d_squared.print();
// Output: [[1, 4 ],
//          [9, 16]]

const e = tf.tensor2d([[1.0, 2.0], [3.0, 4.0]]);
const f = tf.tensor2d([[5.0, 6.0], [7.0, 8.0]]);

const e_plus_f = e.add(f);
e_plus_f.print();
// Output: [[6 , 8 ],
//          [10, 12]]

TensorFlow.js also lets you chain operations. For example:

const sq_sum = e.add(f).square();
sq_sum.print();
// Output: [[36 , 64 ],
//          [100, 144]]

// All operations are also exposed as functions in the main namespace,
// so you could also do the following (a new name avoids redeclaring sq_sum):
const sq_sum_fn = tf.square(tf.add(e, f));
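Note that square and add work element by element; square is not a matrix product. The plain-JavaScript sketch below reproduces the chained result above on ordinary nested arrays. The helpers map2d and zip2d are hypothetical names used only here.

```javascript
// Apply a function to every element of a 2-D array.
const map2d = (m, fn) => m.map((row) => row.map((v) => fn(v)));

// Combine two 2-D arrays of the same shape element by element.
const zip2d = (m, n, fn) => m.map((row, i) => row.map((v, j) => fn(v, n[i][j])));

const add = (m, n) => zip2d(m, n, (x, y) => x + y);
const square = (m) => map2d(m, (x) => x * x);

const e = [[1, 2], [3, 4]];
const f = [[5, 6], [7, 8]];

console.log(JSON.stringify(add(e, f)));         // [[6,8],[10,12]]
console.log(JSON.stringify(square(add(e, f)))); // [[36,64],[100,144]]
```

Each helper returns a new array and leaves its inputs untouched, which mirrors how operations on immutable tensors always produce new tensors.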

Model

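So far we have covered tensors, variables, and some basic operations. Next we introduce the concept of a Model.

A model is simply a function: given a particular input, it returns a particular output.

So keep in mind that a model is nothing more than a function.

Let's look at an example of defining a model. The code below builds the function y = a x ^ 2 + b x + c: given an x, we get back a y.

If tf.tidy() in the code is unfamiliar, ignore it for now. It is only there to free memory, and we cover it in the next section.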

// Define function
function predict(input) {
  // y = a * x ^ 2 + b * x + c
  // More on tf.tidy in the next section
  return tf.tidy(() => {
    const x = tf.scalar(input);

    const ax2 = a.mul(x.square());
    const bx = b.mul(x);
    const y = ax2.add(bx).add(c);

    return y;
  });
}

// Define constants: y = 2x^2 + 4x + 8
const a = tf.scalar(2);
const b = tf.scalar(4);
const c = tf.scalar(8);

// Predict output for input of 2
const result = predict(2);
result.print() // Output: 24
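The same idea can be written without tensors at all, which makes the split between a model's parameters (a, b, c) and its input (x) easy to see. The factory function makeQuadraticModel is a hypothetical name used only in this sketch.

```javascript
// Build a model y = a * x^2 + b * x + c from its three parameters.
function makeQuadraticModel(a, b, c) {
  // The returned function is the model: it maps an input x to an output y.
  return (x) => a * x * x + b * x + c;
}

// Same constants as above: y = 2x^2 + 4x + 8.
const predict = makeQuadraticModel(2, 4, 8);

console.log(predict(2)); // 24
console.log(predict(0)); // 8
```

Training a model means searching for parameter values that make this function fit the data; the shape of the function itself stays fixed.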

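In practice, though, we usually build models with the higher-level Layers API. Its entry points are tf.sequential and tf.model; tf.sequential, which stacks layers one after another, is the most commonly used. For example: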

const model = tf.sequential();
model.add(
  tf.layers.simpleRNN({
    units: 20,
    recurrentInitializer: 'GlorotNormal',
    inputShape: [80, 4]
  })
);

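// LEARNING_RATE, data, and labels are placeholders that you need to define yourself.
const optimizer = tf.train.sgd(LEARNING_RATE);
model.compile({optimizer, loss: 'categoricalCrossentropy'});
// fit takes the inputs and the targets as separate arguments and returns a Promise.
model.fit(data, labels);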

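There is probably a lot in the code above that you do not understand yet. What is tf.layers? What is tf.train.sgd? For now, skip the details and focus on the overall picture; tf.train.sgd and the rest will be covered in later articles. If you can't wait, look them up in the official API documentation.

Memory management

Because TensorFlow.js uses the GPU to accelerate computation, tensor memory should be released explicitly once it is no longer needed. TensorFlow.js provides the dispose function for this. For example: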

const x = tf.tensor2d([[0.0, 2.0], [4.0, 6.0]]);
const x_squared = x.square();

x.dispose();
x_squared.dispose();

In real code, however, we deal with many tensors and operations, and tf.tidy is more convenient because it releases memory in bulk. For example:

// tf.tidy takes a function to tidy up after
const average = tf.tidy(() => {
  // tf.tidy will clean up all the GPU memory used by tensors inside
  // this function, other than the tensor that is returned.
  //
  // Even in a short sequence of operations like the one below, a number
  // of intermediate tensors get created. So it is a good practice to
  // put your math ops in a tidy!
  const y = tf.tensor1d([1.0, 2.0, 3.0, 4.0]);
  const z = tf.ones([4]);

  return y.sub(z).square().mean();
});

average.print() // Output: 3.5

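There are two things to keep in mind when using tf.tidy:

The function passed to tf.tidy must be synchronous.
tf.tidy does not clean up variables; you can only release those manually with dispose.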
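The bookkeeping behind this kind of scoped cleanup can be sketched in plain JavaScript. This is a simplified, hypothetical model for illustration, not the actual TensorFlow.js implementation: makeResource stands in for any operation that allocates a tensor, and liveCount stands in for the memory still in use.

```javascript
// Stack of open scopes; each scope lists the resources created while it was active.
const scopes = [];
let liveCount = 0;

// Hypothetical stand-in for creating a tensor that owns memory.
function makeResource(name) {
  const resource = { name, disposed: false };
  liveCount += 1;
  if (scopes.length > 0) {
    scopes[scopes.length - 1].push(resource);
  }
  return resource;
}

function dispose(resource) {
  if (!resource.disposed) {
    resource.disposed = true;
    liveCount -= 1;
  }
}

// Run fn, then release everything it created except the value it returned.
function tidy(fn) {
  scopes.push([]);
  let result;
  try {
    result = fn(); // must be synchronous: cleanup happens as soon as fn returns
  } finally {
    const created = scopes.pop();
    for (const resource of created) {
      if (resource !== result) {
        dispose(resource);
      }
    }
  }
  return result;
}

const kept = tidy(() => {
  makeResource('y');
  makeResource('z');
  makeResource('y - z');
  return makeResource('mean');
});

console.log(kept.name, kept.disposed, liveCount); // mean false 1
```

The sketch also hints at why the callback has to be synchronous: cleanup runs the moment the function returns, so anything created later inside an asynchronous callback would not be tracked by the scope.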

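Summary

That covers the basic concepts of TensorFlow.js. It is only a tool for exploring machine learning, though, and putting it into practice takes a good deal more study. When I have time I will keep learning alongside you and share what I find.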
