Eclipse下第一个Hadoop程序出现错误ClassCastException
java.lang.ClassCastException: interface javax.xml.soap.Text
at java.lang.Class.asSubclass(Unknown Source)
at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:599)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:791)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
解决方法:
经排查发现这是 API 版本的问题,我使用的是如下旧版(org.apache.hadoop.mapred)接口的代码:
这些代码运行在 hadoop-0.20.2 版本上才出现这样的问题,在 hadoop-0.20.2 上请使用新的接口方法来实现就不会有这样的问题。另外,堆栈中出现的 javax.xml.soap.Text 表明 Text 类被错误导入,应使用 org.apache.hadoop.io.Text。
/**
 * Entry point for the old-API (org.apache.hadoop.mapred) MaxTemperature job.
 *
 * @param args args[0] = input path, args[1] = output path
 * @throws IOException if the job cannot be submitted
 */
public static void main(String[] args) throws IOException {
    if (args.length != 2) {
        // FIX: println (usage ends with a newline) and a space before <input path>
        System.err.println("Usage: MaxTemperature <input path> <output path>");
        System.exit(-1);
    }
    JobConf conf = new JobConf(MaxTemperature.class);
    conf.setJobName("Max temperature");
    FileInputFormat.addInputPath(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    conf.setMapperClass(MaxTemperatureMapper.class);
    conf.setReducerClass(MaxTemperatureReducer.class);
    // Key/value classes must be the Hadoop writables from org.apache.hadoop.io;
    // importing javax.xml.soap.Text here is what caused the ClassCastException.
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    JobClient.runJob(conf);
}
/**
 * Old-API map: extracts (year, air temperature) from a fixed-width NCDC record.
 * Emits nothing when the temperature is the MISSING sentinel or the quality
 * code is not one of 0/1/4/5/9.
 *
 * @param key      byte offset of the line in the input file (logged only)
 * @param value    one fixed-width NCDC record
 * @param output   collector receiving (year, temperature) pairs
 * @param reporter unused progress reporter
 * @throws IOException if the collector fails
 */
public void map(LongWritable key, Text value,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
    String line = value.toString();
    System.out.println("key: " + key);
    String year = line.substring(15, 19);
    int airTemperature;
    if (line.charAt(45) == '+') {
        // skip the explicit '+' sign before parsing
        airTemperature = Integer.parseInt(line.substring(46, 50));
    } else {
        airTemperature = Integer.parseInt(line.substring(45, 50));
    }
    String quality = line.substring(50, 51);
    System.out.println("quality: " + quality);
    // BUG FIX: the character class was unterminated ("[01459"), which throws
    // PatternSyntaxException at runtime; the correct pattern is "[01459]".
    if (airTemperature != MISSING && quality.matches("[01459]")) {
        output.collect(new Text(year), new IntWritable(airTemperature));
    }
}
/**
 * Old-API reduce: folds all temperatures observed for a year into the
 * single largest value and emits (year, max temperature).
 */
@Override
public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
    // Running maximum; starts below any representable reading.
    int max = Integer.MIN_VALUE;
    while (values.hasNext()) {
        int current = values.next().get();
        if (current > max) {
            max = current;
        }
    }
    output.collect(key, new IntWritable(max));
}
将上面的代码改成如下新版本 API(org.apache.hadoop.mapreduce)的写法,即可避免该问题:
public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
if(args.length != 2){
System.err.print("Usage: MaxTemperature<input path> <output path>");
System.exit(-1);
}
Job job = new Job();
job.setJarByClass(NewMaxTemperature.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.setMapperClass(NewMaxTemperatureMapper.class);
job.setReducerClass(NewMaxTemperatureReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
System.exit(job.waitForCompletion(true)?0:1);
/**
 * New-API mapper: parses a fixed-width NCDC record and emits
 * (year, air temperature) when the reading is present and its quality
 * code is one of 0/1/4/5/9.
 */
public class NewMaxTemperatureMapper extends
        Mapper<LongWritable, Text, Text, IntWritable> {

    // Sentinel value in NCDC records meaning "temperature not recorded".
    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String record = value.toString();
        System.out.println("key: " + key);
        String year = record.substring(15, 19);
        // A leading '+' must be skipped before parsing the reading.
        int temperature = (record.charAt(45) == '+')
                ? Integer.parseInt(record.substring(46, 50))
                : Integer.parseInt(record.substring(45, 50));
        String quality = record.substring(50, 51);
        System.out.println("quality: " + quality);
        if (temperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(temperature));
        }
    }
}
/**
 * New-API reducer: emits the single largest temperature observed for
 * each year key.
 */
public class NewMaxTemperatureReducer extends
        Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Running maximum; starts below any representable reading.
        int best = Integer.MIN_VALUE;
        for (IntWritable reading : values) {
            int current = reading.get();
            if (current > best) {
                best = current;
            }
        }
        context.write(key, new IntWritable(best));
    }
}
解决方案2:使用书中指定的 Hadoop 版本,即 hadoop 0.20.0 以前的版本。