Writing a Custom Hadoop Writable
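Hadoop already ships with many Writable implementations, and they cover most everyday needs. In some special scenarios, however, we still have to implement Writable ourselves. This post explains how to implement your own Writable, and describes some problems encountered when a custom Writable is used as the key in map/reduce.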
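First, implement the org.apache.hadoop.io.Writable interface. It has two methods, write and readFields: write serializes the object's fields to a DataOutput, and readFields restores them from a DataInput. For example: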
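// Fields of the custom Writable class.
// MultipleObject and SerializeUtil are the author's own helper classes.
private MultipleObject multipleObject;
private int length;    // number of valid serialized bytes
private byte[] bytes;  // serialized form of multipleObject

@Override
public void readFields(DataInput dataInput) throws IOException {
    // Read the length prefix, then exactly that many bytes.
    length = dataInput.readInt();
    bytes = new byte[length];
    dataInput.readFully(bytes);
    if (multipleObject == null) {
        multipleObject = new MultipleObject();
    }
    multipleObject = SerializeUtil.deserialize(bytes, length,
            multipleObject.getClass());
}

@Override
public void write(DataOutput dataOutput) throws IOException {
    if (multipleObject == null) {
        throw new IOException("Inner multiple object is null");
    }
    DataOutputBuffer out = SerializeUtil.serialize(multipleObject);
    if (out == null) {
        // Writing nothing would leave readFields without a length prefix.
        throw new IOException("Failed to serialize inner multiple object");
    }
    // getData() returns the whole backing array, which may be larger than
    // the content; only the first getLength() bytes are valid.
    bytes = out.getData();
    length = out.getLength();
    dataOutput.writeInt(length);
    dataOutput.write(bytes, 0, length);
}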
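
The write/readFields pair above uses a length-prefixed layout: an int holding the byte count, followed by exactly that many bytes. The sketch below shows only that framing, using nothing but java.io so it runs without Hadoop on the classpath. The class name LengthPrefixedRecord, its String payload, and the roundTrip helper are hypothetical and are not part of the original code. In a real job the class would declare that it implements org.apache.hadoop.io.Writable, whose two methods have the same signatures.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a custom Writable (assumption: in a Hadoop job
// this class would add "implements org.apache.hadoop.io.Writable").
class LengthPrefixedRecord {
    private byte[] payload = new byte[0];

    void set(String text) {
        payload = text.getBytes(StandardCharsets.UTF_8);
    }

    String get() {
        return new String(payload, StandardCharsets.UTF_8);
    }

    // Write the byte count first, then exactly that many bytes.
    public void write(DataOutput out) throws IOException {
        out.writeInt(payload.length);
        out.write(payload, 0, payload.length);
    }

    // Read the byte count, then block until all of those bytes are read.
    public void readFields(DataInput in) throws IOException {
        int length = in.readInt();
        byte[] buffer = new byte[length];
        in.readFully(buffer);
        payload = buffer;
    }

    // Serialize a record to memory and read it back into a fresh instance.
    static String roundTrip(String text) {
        try {
            LengthPrefixedRecord source = new LengthPrefixedRecord();
            source.set(text);
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            source.write(new DataOutputStream(sink));

            LengthPrefixedRecord copy = new LengthPrefixedRecord();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(sink.toByteArray())));
            return copy.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling LengthPrefixedRecord.roundTrip("hadoop") should return the original string. A round trip like this is a cheap way to check that write and readFields agree on the byte layout before the class is used in a job.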