Hadoop部署错误解决实例

Hadoop的单机部署很简单也不容易出错,但是对生产环境的价值和意义不大,但是可以快速用于开发。

部署Hadoop的错误原因不少,并且很奇怪。

比如,用户名不同,造成客户端和服务器通讯产生认证失败的错误,客户端,服务器各节点的用户名应当是一致的,并且个节点应该建立ssh的无认证登陆。

相关阅读:

一、出现下面错误:

13/07/09 13:57:07 INFO ipc.Client: Retrying connect to server: master/192.168.2.200:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

java.net.ConnectException: Call to master/192.168.2.200:9000 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1136)
        at org.apache.hadoop.ipc.Client.call(Client.java:1112)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy7.renewLease(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
        at com.sun.proxy.$Proxy7.renewLease(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:379)
        at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:378)
        at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:400)
        at org.apache.hadoop.hdfs.LeaseRenewer.access$600(LeaseRenewer.java:69)
        at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:273)
        at java.lang.Thread.run(Thread.java:722)
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:719)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:453)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:579)
        at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:202)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1243)
        at org.apache.hadoop.ipc.Client.call(Client.java:1087)
        ... 14 more

是客户端无法连接服务器造成的,可能是服务器没有启动或者启动了防火墙。

二、出现下面错误:

13/07/09 13:57:36 ERROR hdfs.DFSClient: Failed to close file /tmp/web304069331.log
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/web304069331.log could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

        at org.apache.hadoop.ipc.Client.call(Client.java:1107)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
        at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)

原因很多,namenode节点和datanode节点不能正常通讯造成的。根据查询启动日志看到datanode节点没有办法解析机器名造成的,所以修改/etc/hostname和/etc/hosts文件。

相关推荐