TensorFlow Research and Practice Notes
II. Deep Learning Research
III. Installing TensorFlow
Environment: Ubuntu 15.10, 64-bit
1. Download the source
sudo apt-get install git
git clone --recurse-submodules https://github.com/tensorflow/tensorflow
The --recurse-submodules flag is required: it fetches the protobuf library that TensorFlow depends on.
Cloning into 'tensorflow'...
remote: Counting objects: 40348, done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 40348 (delta 0), reused 0 (delta 0), pack-reused 40341
Receiving objects: 100% (40348/40348), 35.45 MiB | 404.00 KiB/s, done.
Resolving deltas: 100% (29338/29338), done.
Checking connectivity... done.
Submodule 'google/protobuf' (https://github.com/google/protobuf.git) registered for path 'google/protobuf'
Cloning into 'google/protobuf'...
remote: Counting objects: 32801, done.
remote: Compressing objects: 100% (34/34), done.
remote: Total 32801 (delta 12), reused 0 (delta 0), pack-reused 32767
Receiving objects: 100% (32801/32801), 31.27 MiB | 1.27 MiB/s, done.
Resolving deltas: 100% (22019/22019), done.
Checking connectivity... done.
Submodule path 'google/protobuf': checked out 'fb714b3606bd663b823f6960a73d052f97283b74'
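2. Install Bazel
OpenJDK is the open-source, GPL-licensed implementation of the Java platform. It has been more than six years since Sun officially released it, and from that moment the Java community has been working to adapt to the new open-source code base. [1]
OpenJDK developed rapidly in 2013 and was named to the 2013 SD Times 100 by the IT magazine SD Times, ranking ninth in the "great influence" category.
Google recently open-sourced Bazel, the build tool it uses internally.
Bazel is a Make-like tool that Google tailored to the way it develops software internally; Google now uses it to build most of its internal software. Its notable features include:
Multi-language support: Bazel currently supports Java, Objective-C, and C++ out of the box, and can be extended to any other programming language.
High-level build description language: projects are described in a language called BUILD, a concise text format that treats a project as a set of interrelated libraries, binaries, and tests. By contrast, a tool like Make requires you to describe how the compiler is invoked for each file.
Multi-platform support: the same tool and the same BUILD files can build software for different architectures and even different platforms. At Google, Bazel is used both for server applications running in data centers and for mobile apps on phones.
Reproducibility: in a BUILD file, every library, test, and binary must declare its dependencies explicitly. When a source file changes, Bazel uses these dependencies to decide which parts need rebuilding and which tasks can run in parallel. As a result every build is incremental, and the same build always produces the same result.
Scalability: Bazel can handle large projects. At Google it is common for a server program to have 100,000 lines of code, and rebuilding such a project when nothing has changed takes only about 200 milliseconds.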
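The reproducibility point above (explicit dependencies decide what gets rebuilt) can be illustrated with a toy model. This is only a sketch of the idea, not Bazel's actual algorithm: the target labels, file names, and the `targets_to_rebuild` helper are all hypothetical.

```python
from collections import deque


def targets_to_rebuild(deps, changed):
    """Return every target that depends, directly or indirectly, on a changed input.

    deps maps a target to the list of inputs (source files or other targets)
    it declares. This is a toy model, not how Bazel is implemented.
    """
    # Invert the declared dependencies: input -> targets that use it directly.
    dependents = {}
    for target, inputs in deps.items():
        for item in inputs:
            dependents.setdefault(item, set()).add(target)

    # Breadth-first walk from the changed inputs through the inverted graph.
    dirty = set()
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for target in dependents.get(node, ()):
            if target not in dirty:
                dirty.add(target)
                queue.append(target)
    return dirty


# Hypothetical project layout, used only for illustration.
DEPS = {
    "//lib:util": ["util.cc", "util.h"],
    "//lib:net": ["net.cc", "//lib:util"],
    "//app:server": ["server.cc", "//lib:net"],
    "//app:tool": ["tool.cc"],
}
```

Calling `targets_to_rebuild(DEPS, ["util.h"])` marks the three targets on the path from `util.h` to `//app:server` and leaves `//app:tool` alone, which is the sense in which a build driven by declared dependencies is incremental.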
Install Bazel's dependencies
sudo apt-get install openjdk-8-jdk openjdk-8-source
oot.pem
Adding debian:E-Tugra_Certification_Authority.pem
Adding debian:Staat_der_Nederlanden_EV_Root_CA.pem
Adding debian:GlobalSign_ECC_Root_CA_-_R4.pem
Adding debian:Certinomis_-_Autorité_Racine.pem
Adding debian:ssl-cert-snakeoil.pem
Adding debian:COMODO_Certification_Authority.pem
done.
Processing triggers for libc-bin (2.21-0ubuntu4) ...
Processing triggers for ca-certificates (20150426ubuntu1) ...
Updating certificates in /etc/ssl/certs... 0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
done.
learning@learning-virtual-machine:~$
sudo apt-get install pkg-config zip g++ zlib1g-dev unzip
Processing triggers for mime-support (3.54ubuntu1.1) ...
Setting up libstdc++-4.8-dev:amd64 (4.8.4-2ubuntu1~14.04.1) ...
Setting up g++-4.8 (4.8.4-2ubuntu1~14.04.1) ...
Setting up g++ (4:4.8.2-1ubuntu6) ...
update-alternatives: using /usr/bin/g++ to provide /usr/bin/c++ (c++) in auto mode
Setting up unzip (6.0-9ubuntu1.5) ...
Setting up zlib1g-dev:amd64 (1:1.2.8.dfsg-1ubuntu1) ...
@ubuntu:~$
Download link: https://github.com/bazelbuild/bazel/releases/download/0.2.2b/bazel-0.2.2b-installer-linux-x86_64.sh
@ubuntu:~$ chmod +x bazel-0.2.2b-installer-linux-x86_64.sh
@ubuntu:~$ ./bazel-0.2.2b-installer-linux-x86_64.sh --user
Bazel is now installed!
Make sure you have "/home/learning/bin" in your path. You can also activate bash completion by adding the following line to your ~/.bashrc:
  source /home/learning/.bazel/bin/bazel-complete.bash
See http://bazel.io/docs/getting-started.html to start a new project!
learning@learning-virtual-machine:~$ source /home/learning/.bazel/bin/bazel-complete.bash
learning@learning-virtual-machine:~$
export PATH="$PATH:$HOME/bin"
sudo apt-get install python-numpy swig python-dev
blapack.so.3 (liblapack.so.3) in auto mode
Setting up libpython-dev:amd64 (2.7.5-5ubuntu3) ...
Setting up python2.7-dev (2.7.6-8ubuntu0.2) ...
Setting up python-dev (2.7.5-5ubuntu3) ...
Setting up python-numpy (1:1.8.2-0ubuntu0.1) ...
Setting up swig2.0 (2.0.11-1ubuntu2) ...
Setting up swig (2.0.11-1ubuntu2) ...
Processing triggers for libc-bin (2.19-0ubuntu6.5) ...
3. Build and install TensorFlow
mkdir /tmp/tensorflow_pkg
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl
learning@learning-virtual-machine:~$ pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl
Requirement '/tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl' looks like a filename, but the file does not exist
Unpacking /tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl
Cleaning up...
Exception:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 122, in main
    status = self.run(options, args)
  File "/usr/lib/python2.7/dist-packages/pip/commands/install.py", line 304, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 1198, in prepare_files
    do_download,
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 1365, in unpack_url
    unpack_file_url(link, location, download_dir)
  File "/usr/lib/python2.7/dist-packages/pip/download.py", line 640, in unpack_file_url
    unpack_file(from_path, location, content_type, link)
  File "/usr/lib/python2.7/dist-packages/pip/util.py", line 640, in unpack_file
    unzip_file(filename, location, flatten=not filename.endswith(('.pybundle', '.whl')))
  File "/usr/lib/python2.7/dist-packages/pip/util.py", line 508, in unzip_file
    zipfp = open(filename, 'rb')
IOError: [Errno 2] No such file or directory: '/tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl'
Storing debug log for failure in /home/learning/.pip/pip.log
learning@learning-virtual-machine:~$
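The install above fails because the wheel file named on the command line does not exist in /tmp/tensorflow_pkg. One way to avoid guessing the version in the file name is to list whatever wheels the output directory actually contains before calling pip. The following is a minimal sketch; the `find_wheels` helper is hypothetical and only assumes that wheel files are named `<package>-<version>-....whl`.

```python
import glob
import os


def find_wheels(directory, package="tensorflow"):
    """Return the sorted paths of wheel files for `package` found in `directory`."""
    pattern = os.path.join(directory, package + "-*.whl")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    wheels = find_wheels("/tmp/tensorflow_pkg")
    if wheels:
        # Print a command that uses the file name that is really on disk.
        print("pip install " + wheels[-1])
    else:
        print("No wheel found; build the pip package first.")
```

An empty result means the pip package has not been built yet, which matches the traceback above.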
Build and install with pip
bazel build -c opt tensorflow/tools/pip_package:build_pip_package
learning@learning-virtual-machine:~/tensorflow$ bazel build -c opt tensorflow/tools/pip_package:build_pip_package
Sending SIGTERM to previous Bazel server (pid=17411)... done.
.......................................
INFO: Waiting for response from Bazel server (pid 18433)...
INFO: Downloading from https://bitbucket.org/eigen/eigen/get/50812b426b7c.tar.\
gz: 0B
Problem encountered:
ERROR: /home/learning/tensorflow/tensorflow/core/kernels/BUILD:640:1: C++ compilation of rule '//tensorflow/core/kernels:padding_fifo_queue' failed: gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -Wl,-z,-relro,-z,now -B/usr/bin -B/usr/bin -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 ... (remaining 70 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 4.
gcc: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
[1,604 / 2,192] Still waiting for 199 jobs to complete:
  Running (standalone):
    Compiling tensorflow/core/kernels/queue_base.cc, 5653 s
    Compiling tensorflow/core/kernels/split_lib_cpu.cc, 15 s
Fix: the virtual machine ran out of memory, which is why cc1plus was killed. After raising the VM's memory to 4 GB, the build succeeded.
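Since the build died from lack of memory, it can help to check the machine's RAM before starting a multi-hour compile. The sketch below reads `MemTotal` from `/proc/meminfo` (Linux only). The helper names and the 10% margin are assumptions: the margin is there because the kernel reports slightly less than the configured size, and the 4 GB figure comes from this note rather than from any official requirement.

```python
def parse_mem_total_kib(meminfo_text):
    """Extract the MemTotal value (in KiB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def has_enough_memory(meminfo_text, required_gib=4.0, margin=0.9):
    """True if MemTotal is at least `margin` times the requested size."""
    required_kib = required_gib * 1024 * 1024 * margin
    return parse_mem_total_kib(meminfo_text) >= required_kib


def read_meminfo(path="/proc/meminfo"):
    """Read the kernel's memory report (Linux only)."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    if has_enough_memory(read_meminfo()):
        print("Memory looks sufficient for the build.")
    else:
        print("Less than about 4 GB of RAM; the compiler may be killed.")
```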
INFO: From Compiling tensorflow/contrib/tensor_forest/core/ops/update_fertile_slots_op.cc:
tensorflow/contrib/tensor_forest/core/ops/update_fertile_slots_op.cc: In member function ‘virtual void tensorflow::UpdateFertileSlots::Compute(tensorflow::OpKernelContext*)’:
tensorflow/contrib/tensor_forest/core/ops/update_fertile_slots_op.cc:176:14: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (; i < values->size(); ++i) {
             ^
tensorflow/contrib/tensor_forest/core/ops/update_fertile_slots_op.cc: In member function ‘void tensorflow::UpdateFertileSlots::SetNewNonFertileLeaves(tensorflow::UpdateFertileSlots::HeapValuesType*, int, tensorflow::OpKernelContext*)’:
tensorflow/contrib/tensor_forest/core/ops/update_fertile_slots_op.cc:340:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (int32 i = start; i < values->size(); ++i) {
                            ^
Target //tensorflow/tools/pip_package:build_pip_package up-to-date:
  bazel-bin/tensorflow/tools/pip_package/build_pip_package
INFO: Elapsed time: 9696.811s, Critical Path: 7936.35s
learning@learning-virtual-machine:~/tensorflow$ mkdir /tmp/tensorflow_pkg
learning@learning-virtual-machine:~/tensorflow$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
Fri May  6 11:22:55 CST 2016 : === Using tmpdir: /tmp/tmp.n9viqhep4u
/tmp/tmp.n9viqhep4u ~/tensorflow
Fri May  6 11:23:01 CST 2016 : === Building wheel
Fri May  6 11:24:09 CST 2016 : === Output wheel file is in: /tmp/tensorflow_pkg
learning@learning-virtual-machine:~/tensorflow$
pip install /tmp/tensorflow_pkg/tensorflow-0.8.0-py2-none-any.whl
learning@learning-virtual-machine:/tmp/tensorflow_pkg$ pip install /tmp/tensorflow_pkg/tensorflow-0.8.0-py2-none-any.whl
Unpacking ./tensorflow-0.8.0-py2-none-any.whl
Downloading/unpacking six>=1.10.0 (from tensorflow==0.8.0)
  Cannot fetch index base URL https://pypi.python.org/simple/
  Downloading six-1.10.0-py2.py3-none-any.whl
Downloading/unpacking protobuf==3.0.0b2 (from tensorflow==0.8.0)
  Downloading protobuf-3.0.0b2-py2.py3-none-any.whl (326kB): 326kB downloaded
Downloading/unpacking wheel (from tensorflow==0.8.0)
  Downloading wheel-0.29.0-py2.py3-none-any.whl (66kB): 66kB downloaded
Downloading/unpacking numpy>=1.8.2 (from tensorflow==0.8.0)
m/mtrand/randomkit.o build/temp.linux-x86_64-2.7/numpy/random/mtrand/initarray.o build/temp.linux-x86_64-2.7/numpy/random/mtrand/distributions.o -Lbuild/temp.linux-x86_64-2.7 -o build/lib.linux-x86_64-2.7/numpy/random/mtrand.so
Creating build/scripts.linux-x86_64-2.7/f2py
adding ‘build/scripts.linux-x86_64-2.7/f2py’ to scripts
changing mode of build/scripts.linux-x86_64-2.7/f2py from 664 to 775
warning: no previously-included files matching '*.pyo' found anywhere in distribution
warning: no previously-included files matching '*.pyd' found anywhere in distribution
changing mode of /home/learning/.local/bin/f2py to 775
Successfully installed tensorflow six protobuf wheel numpy setuptools
Cleaning up…
The pip package has been created and installed; the build and installation are complete.
1. Problem:
The 'build' command is only supported from within a workspace.
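Solution: run the command from inside the TensorFlow source directory (~/tensorflow), which is the Bazel workspace:
learning@learning-virtual-machine:~/tensorflow$ bazel build -c opt tensorflow/tools/pip_package:build_pip_package .........................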
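The error in problem 1 means the command was run outside a directory tree that contains a WORKSPACE file (the TensorFlow checkout has one at its root, as the path /home/learning/tensorflow/WORKSPACE in problem 2 shows). A quick way to see whether the current directory is inside a workspace is to walk up the parents looking for that file. This is a minimal sketch with a hypothetical helper name; it does not claim to reproduce Bazel's own lookup rules.

```python
import os


def find_workspace_root(start):
    """Walk up from `start` and return the first directory holding a WORKSPACE file.

    Returns None if the filesystem root is reached without finding one.
    """
    current = os.path.abspath(start)
    while True:
        if os.path.isfile(os.path.join(current, "WORKSPACE")):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent


if __name__ == "__main__":
    root = find_workspace_root(os.getcwd())
    if root is None:
        print("Not inside a workspace; cd into the source checkout first.")
    else:
        print("Workspace root: " + root)
```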
2. Problem:
INFO: Waiting for response from Bazel server (pid 15464)...
ERROR: /home/learning/tensorflow/WORKSPACE:16:6: First argument of load() is a path, not a label. It should start with a single slash if it is an absolute path..
ERROR: /home/learning/tensorflow/WORKSPACE:20:6: First argument of load() is a path, not a label. It should start with a single slash if it is an absolute path..
ERROR: WORKSPACE file could not be parsed.
ERROR: no such package 'external': Package 'external' contains errors.
INFO: Elapsed time: 9.814s
Solution: the installed Bazel version was too old; switching to 0.2.2 fixed it.
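Because the fix for problem 2 was simply a newer Bazel, a small version comparison can catch the mismatch before the build starts. The sketch below only compares version strings; the helper names are hypothetical, the 0.2.2 threshold is taken from this note, and the "0.1.5" value in the usage line is an invented example, not the version that actually failed here.

```python
import re


def parse_version(text):
    """Turn a string such as '0.2.2b' into a tuple of integers, here (0, 2, 2).

    Any suffix after the third number (like the 'b') is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError("unrecognized version: %r" % text)
    return tuple(int(part) for part in match.groups())


def version_at_least(installed, minimum="0.2.2"):
    """True if `installed` is the same as or newer than `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    for candidate in ("0.2.2b", "0.1.5"):  # "0.1.5" is a made-up older version
        print(candidate, version_at_least(candidate))
```

The version string itself would have to come from the installed tool; how it is obtained is left out of the sketch.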
Source code analysis:
example_trainer.cc
/* Copyright 2015 Google Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/

#include <cstdio>
#include <functional>
#include <string>
#include <vector>

#include "tensorflow/cc/ops/standard_ops.h"
#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/graph/default_device.h"
#include "tensorflow/core/graph/graph_def_builder.h"
#include "tensorflow/core/lib/core/threadpool.h"
#include "tensorflow/core/lib/strings/stringprintf.h"
#include "tensorflow/core/platform/init_main.h"
#include "tensorflow/core/platform/logging.h"
#include "tensorflow/core/platform/types.h"
#include "tensorflow/core/public/session.h"

using tensorflow::string;
using tensorflow::int32;

namespace tensorflow {
namespace example {

struct Options {
  int num_concurrent_sessions = 10;  // The number of concurrent sessions
  int num_concurrent_steps = 10;     // The number of concurrent steps
  int num_iterations = 100;          // Each step repeats this many times
  bool use_gpu = false;              // Whether to use gpu in the training
};

// A = [3 2; -1 0]; x = rand(2, 1);
// We want to compute the largest eigenvalue for A.
// repeat x = y / y.norm(); y = A * x; end
GraphDef CreateGraphDef() {
  // TODO(jeff,opensource): This should really be a more interesting
  // computation.  Maybe turn this into an mnist model instead?
  GraphDefBuilder b;
  using namespace ::tensorflow::ops;  // NOLINT(build/namespaces)
  // Store rows [3, 2] and [-1, 0] in row major format.
  Node* a = Const({3.f, 2.f, -1.f, 0.f}, {2, 2}, b.opts());

  // x is from the feed.
  Node* x = Const({0.f}, {2, 1}, b.opts().WithName("x"));

  // y = A * x
  Node* y = MatMul(a, x, b.opts().WithName("y"));

  // y2 = y.^2
  Node* y2 = Square(y, b.opts());

  // y2_sum = sum(y2)
  Node* y2_sum = Sum(y2, Const(0, b.opts()), b.opts());

  // y_norm = sqrt(y2_sum)
  Node* y_norm = Sqrt(y2_sum, b.opts());

  // y_normalized = y ./ y_norm
  Div(y, y_norm, b.opts().WithName("y_normalized"));

  GraphDef def;
  TF_CHECK_OK(b.ToGraphDef(&def));
  return def;
}

string DebugString(const Tensor& x, const Tensor& y) {
  CHECK_EQ(x.NumElements(), 2);
  CHECK_EQ(y.NumElements(), 2);
  auto x_flat = x.flat<float>();
  auto y_flat = y.flat<float>();
  const float lambda = y_flat(0) / x_flat(0);
  return strings::Printf("lambda = %8.6f x = [%8.6f %8.6f] y = [%8.6f %8.6f]",
                         lambda, x_flat(0), x_flat(1), y_flat(0), y_flat(1));
}

void ConcurrentSteps(const Options* opts, int session_index) {
  // Creates a session.
  SessionOptions options;
  std::unique_ptr<Session> session(NewSession(options));
  GraphDef def = CreateGraphDef();
  if (options.target.empty()) {
    graph::SetDefaultDevice(opts->use_gpu ? "/gpu:0" : "/cpu:0", &def);
  }

  TF_CHECK_OK(session->Create(def));

  // Spawn M threads for M concurrent steps.
  const int M = opts->num_concurrent_steps;
  thread::ThreadPool step_threads(Env::Default(), "trainer", M);

  for (int step = 0; step < M; ++step) {
    step_threads.Schedule([&session, opts, session_index, step]() {
      // Randomly initialize the input.
      Tensor x(DT_FLOAT, TensorShape({2, 1}));
      x.flat<float>().setRandom();

      // Iterations.
      std::vector<Tensor> outputs;
      for (int iter = 0; iter < opts->num_iterations; ++iter) {
        outputs.clear();
        TF_CHECK_OK(
            session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs));
        CHECK_EQ(size_t{2}, outputs.size());

        const Tensor& y = outputs[0];
        const Tensor& y_norm = outputs[1];
        // Print out lambda, x, and y.
        std::printf("%06d/%06d %s\n", session_index, step,
                    DebugString(x, y).c_str());
        // Copies y_normalized to x.
        x = y_norm;
      }
    });
  }

  TF_CHECK_OK(session->Close());
}

void ConcurrentSessions(const Options& opts) {
  // Spawn N threads for N concurrent sessions.
  const int N = opts.num_concurrent_sessions;
  thread::ThreadPool session_threads(Env::Default(), "trainer", N);
  for (int i = 0; i < N; ++i) {
    session_threads.Schedule(std::bind(&ConcurrentSteps, &opts, i));
  }
}

}  // end namespace example
}  // end namespace tensorflow

namespace {

bool ParseInt32Flag(tensorflow::StringPiece arg, tensorflow::StringPiece flag,
                    int32* dst) {
  if (arg.Consume(flag) && arg.Consume("=")) {
    char extra;
    return (sscanf(arg.data(), "%d%c", dst, &extra) == 1);
  }

  return false;
}

bool ParseBoolFlag(tensorflow::StringPiece arg, tensorflow::StringPiece flag,
                   bool* dst) {
  if (arg.Consume(flag)) {
    if (arg.empty()) {
      *dst = true;
      return true;
    }

    if (arg == "=true") {
      *dst = true;
      return true;
    } else if (arg == "=false") {
      *dst = false;
      return true;
    }
  }

  return false;
}

}  // namespace

int main(int argc, char* argv[]) {
  tensorflow::example::Options opts;
  std::vector<char*> unknown_flags;
  for (int i = 1; i < argc; ++i) {
    if (string(argv[i]) == "--") {
      while (i < argc) {
        unknown_flags.push_back(argv[i]);
        ++i;
      }
      break;
    }

    if (ParseInt32Flag(argv[i], "--num_concurrent_sessions",
                       &opts.num_concurrent_sessions) ||
        ParseInt32Flag(argv[i], "--num_concurrent_steps",
                       &opts.num_concurrent_steps) ||
        ParseInt32Flag(argv[i], "--num_iterations", &opts.num_iterations) ||
        ParseBoolFlag(argv[i], "--use_gpu", &opts.use_gpu)) {
      continue;
    }

    fprintf(stderr, "Unknown flag: %s\n", argv[i]);
    return -1;
  }

  // Passthrough any unknown flags.
  int dst = 1;  // Skip argv[0]
  for (char* f : unknown_flags) {
    argv[dst++] = f;
  }
  argv[dst++] = nullptr;
  argc = unknown_flags.size() + 1;
  tensorflow::port::InitMain(argv[0], &argc, &argv);
  tensorflow::example::ConcurrentSessions(opts);
}
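The graph built in CreateGraphDef is a power iteration: each step computes y = A * x, normalizes y, and feeds the result back in as the next x, while lambda = y[0] / x[0] approaches the largest eigenvalue of A. The same arithmetic can be followed without TensorFlow in a few lines of plain Python. This is a sketch of the math only, not of the TensorFlow API; the function names are hypothetical, and a fixed starting vector replaces the random one used in the C++ code so that the run is repeatable.

```python
import math

# Same matrix as in example_trainer.cc: rows [3, 2] and [-1, 0].
A = [[3.0, 2.0], [-1.0, 0.0]]


def matvec(a, x):
    """Multiply matrix `a` (list of rows) by vector `x`."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def step(a, x):
    """One pass through the graph: returns (y, y_normalized)."""
    y = matvec(a, x)
    y_norm = math.sqrt(sum(v * v for v in y))
    return y, [v / y_norm for v in y]


def power_iteration(a, x, num_iterations=100):
    """Repeat the step, feeding y_normalized back in as x each time."""
    lam = None
    for _ in range(num_iterations):
        y, y_normalized = step(a, x)
        lam = y[0] / x[0]  # same estimate that DebugString prints
        x = y_normalized
    return lam, x


if __name__ == "__main__":
    # Fixed start vector (an assumption; the C++ code uses random values).
    lam, x = power_iteration(A, [1.0, 0.5])
    print("lambda = %8.6f x = [%8.6f %8.6f]" % (lam, x[0], x[1]))
```

The characteristic polynomial of A is lambda^2 - 3*lambda + 2, so the eigenvalues are 1 and 2, and the estimate settles on 2 with x proportional to [2, -1].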