Compiling, packaging, and running your own MapReduce program from the command line (Hadoop 2.4.1)

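Most MapReduce WordCount tutorials online barely mention how to compile WordCount.java. Those that do mostly describe the approach for older versions such as 0.20, namely javac -classpath /usr/local/Hadoop/hadoop-1.0.1/hadoop-core-1.0.1.jar WordCount.java. The newer 2.x releases no longer ship a hadoop-core*.jar file, so compiling and packaging your own MapReduce program works differently than in older versions.

This article uses the WordCount example on Hadoop 2.4.1 to show how to compile your own MapReduce program in 2.x releases.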

Dependency jars in Hadoop 2.x

In Hadoop 2.x the classes are no longer bundled into a single hadoop-core*.jar but are split across several jars. The WordCount example needs the following three:

  • $HADOOP_HOME/share/hadoop/common/hadoop-common-2.4.1.jar
  • $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.4.1.jar
  • $HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar

Compiling and packaging a Hadoop MapReduce program

Add the jars above to the classpath:

export CLASSPATH="$HADOOP_HOME/share/hadoop/common/hadoop-common-2.4.1.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.4.1.jar:$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar:$CLASSPATH"
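
Before compiling, it can help to confirm that the three jars really exist where the export line expects them, since a wrong HADOOP_HOME only shows up later as compile errors. The following is a minimal sketch, not part of Hadoop: the class name HadoopClasspathSketch, the helper requiredJars, and the fallback directory /usr/local/hadoop (taken from the run commands later in this article) are assumptions for illustration.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Illustrative helper, not part of Hadoop: lists the three jars named above
// and reports whether each one is present on disk.
class HadoopClasspathSketch {

    // Builds the three jar paths relative to a Hadoop install directory.
    static List<String> requiredJars(String hadoopHome, String version) {
        String share = hadoopHome + "/share/hadoop";
        return Arrays.asList(
            share + "/common/hadoop-common-" + version + ".jar",
            share + "/mapreduce/hadoop-mapreduce-client-core-" + version + ".jar",
            share + "/common/lib/commons-cli-1.2.jar");
    }

    public static void main(String[] args) {
        String home = System.getenv("HADOOP_HOME");
        if (home == null) {
            home = "/usr/local/hadoop"; // assumption: install path used later in this article
        }
        List<String> jars = requiredJars(home, "2.4.1");
        for (String jar : jars) {
            String status = new File(jar).isFile() ? "found" : "missing";
            System.out.println(status + "  " + jar);
        }
        // The same string that the export line prepends to CLASSPATH.
        System.out.println(String.join(File.pathSeparator, jars));
    }
}
```

If any jar is reported as missing, the version number in the file names or the value of HADOOP_HOME is the first thing to check.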

 

Now WordCount.java can be compiled (this is the WordCount.java from the 2.4.1 source tree; the full source is at the end of this article):

javac WordCount.java

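The compiler prints warnings, which can be ignored. After compiling, several .class files have been generated.

[Figure: compiling your own MapReduce program with javac]

Next, package the .class files into a jar so that Hadoop can run them: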

jar -cvf WordCount.jar ./WordCount*.class

With the jar built, try running it. First create a few input files:

mkdir input
echo "echo of the rainbow" > ./input/file0
echo "the waiting game" > ./input/file1

[Figure: creating the input for WordCount]

Run the job:

/usr/local/hadoop/bin/hadoop jar WordCount.jar WordCount input output

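You may, however, run into the following error: Exception in thread "main" java.lang.NoClassDefFoundError: WordCount

[Figure: error reporting that the WordCount class cannot be found]

The program declares a package, so the command must give the fully qualified class name, including org.apache.hadoop.examples: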

/usr/local/hadoop/bin/hadoop jar WordCount.jar org.apache.hadoop.examples.WordCount input output
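
The short name fails because the JVM identifies a class by its fully qualified name, and a class loader looks the class up at an entry path derived from that name. A minimal sketch of the mapping follows; the class ClassEntrySketch and the helper entryPath are made up for illustration and are not a Hadoop or JDK API.

```java
// Illustrative helper: shows where a class loader expects to find a class
// inside a jar or a directory on the classpath.
class ClassEntrySketch {

    // Converts a fully qualified class name into the expected entry path.
    static String entryPath(String fullyQualifiedName) {
        return fullyQualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // Name used in the failing command.
        System.out.println(entryPath("WordCount"));
        // Name that matches the package declaration in WordCount.java.
        System.out.println(entryPath("org.apache.hadoop.examples.WordCount"));
    }
}
```

Note that the jar command above stores the .class files at the jar root rather than under org/apache/hadoop/examples/. If the class still loads, it may be coming from a copy already on Hadoop's classpath rather than from your jar. One way to make the jar layout match the package, assuming a standard JDK, is to compile with javac -d . WordCount.java and package the resulting org directory instead.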

A successful run produces the following result:

[Figure: WordCount run result]
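
As a rough cross-check of what the job should compute for the two sample files, the following plain-Java sketch mimics the map step (tokenize each line) and the reduce step (sum per word) in memory. It has no Hadoop dependency, and the class name LocalWordCountSketch is an assumption for illustration. It is not how the job itself runs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Illustrative only: counts words in memory, without Hadoop.
class LocalWordCountSketch {

    // Splits each line on whitespace, as StringTokenizer does by default,
    // and sums the occurrences of each token.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>(); // keeps keys sorted
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same text as input/file0 and input/file1 above.
        List<String> lines = Arrays.asList("echo of the rainbow", "the waiting game");
        for (Map.Entry<String, Integer> e : count(lines).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

For these inputs the sketch reports six distinct words, with "the" counted twice and every other word once. The counts produced by the real job should agree.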

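Going further: compiling and running MapReduce programs with Eclipse

Compiling and running MapReduce programs from the command line is somewhat tedious, since every change requires recompiling and repackaging by hand. Eclipse makes this more convenient.

WordCount.java source

The file is located in hadoop-2.4.1-src\hadoop-mapreduce-project\hadoop-mapreduce-examples\src\main\java\org\apache\hadoop\examples: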

/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
