Solution: Hive Thrift Client Hangs Without Returning
Env:
Cloudera Manager 4.6.1 with CDH4.3
Hadoop 2.0.0-CDH4.3
Hive 0.10.0-CDH4.3
CentOS 6.4 X86_64
Hive started successfully:
[root@n8 hive]# netstat -anlp | grep 10000
tcp        0      0 0.0.0.0:10000          0.0.0.0:*              LISTEN      21739/java
What I want to do: connect to the Hive server through the Hive Thrift service with the Groovy script below:
import org.apache.hadoop.hive.service.*;
import org.apache.thrift.protocol.*;
import org.apache.thrift.transport.*;

transport = new TSocket("n8.example.com", 10000);
protocol = new TBinaryProtocol(transport);
client = new HiveClient(protocol);
transport.open();
client.execute("show tables");
client.getClusterStatus();
client.getSchema();
client.fetchOne();
client.fetchAll();
The dependencies have already been added to the Groovy classpath:
# load required libraries
load !{groovy.home}/lib/*.jar

# load user specific libraries
load !{user.home}/.groovy/lib/*.jar

# tools.jar for ant tasks
load ${tools.jar}

load ${user.home}/dev/hadoop-2.0.0-cdh4.3.0/share/hadoop/common/*.jar
load ${user.home}/dev/hadoop-2.0.0-cdh4.3.0/share/hadoop/common/lib/*.jar
load ${user.home}/dev/hadoop-2.0.0-cdh4.3.0/share/hadoop/hdfs/*.jar
load ${user.home}/dev/hadoop-2.0.0-cdh4.3.0/share/hadoop/hdfs/lib/*.jar
load ${user.home}/dev/hadoop-2.0.0-cdh4.3.0/share/hadoop/mapreduce/*.jar
load ${user.home}/dev/hadoop-2.0.0-cdh4.3.0/share/hadoop/mapreduce/lib/*.jar
load ${user.home}/dev/hadoop-2.0.0-cdh4.3.0/share/hadoop/tools/lib/*.jar
load ${user.home}/dev/hive-0.10.0-cdh4.3.0/lib/*.jar
But when I executed the Groovy script, the process hung without returning anything, and on the Hive server the connection ended up in the CLOSE_WAIT state:
[root@n8 hive]# netstat -anlp | grep 10000
tcp        0      0 0.0.0.0:10000          0.0.0.0:*              LISTEN      21739/java
tcp        1      0 192.168.1.208:10000    192.168.1.100:50761    CLOSE_WAIT  21739/java
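The CLOSE_WAIT entry in the netstat output above is the visible symptom of the hang. As a rough diagnostic sketch, a small helper can count such connections on the Thrift port. The function name count_close_wait is hypothetical, and the code assumes the Linux netstat column layout shown above (local address in the fourth column, state in the sixth).

```shell
#!/bin/sh
# Count TCP connections on a given local port that are in CLOSE_WAIT.
# Reads `netstat -an` style output from stdin (Linux column layout assumed).
count_close_wait() {
  port="$1"
  awk -v p=":$port" '
    # $4 is the local address, $6 is the connection state
    $1 ~ /^tcp/ && $4 ~ (p "$") && $6 == "CLOSE_WAIT" { n++ }
    END { print n + 0 }
  '
}

# Usage (on the Hive server host):
#   netstat -an | count_close_wait 10000
```

A non-zero count right after a client run suggests the server accepted the connection but never completed the exchange with the client.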
Solution:
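Stop HiveServer2, then run the command below on the host where Hive is installed to start a Hive Thrift server. Note that the hiveserver service is the original HiveServer (HiveServer1), which is the server that HiveClient from org.apache.hadoop.hive.service is written for; HiveServer2 exposes a different Thrift interface.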
hive --service hiveserver 10000 &
Run the Groovy script above again and it will complete successfully.
Alternatively, keep the existing HiveServer2 instance running and start the second server on a different port (the script must then connect to that port instead of 10000):
hive --service hiveserver 10001 &
Reason:
By default, Hive creates the metastore in whichever directory you start the Thrift server from. So if you keep starting the server from different locations, each one gets its own metastore, which has no reference to the tables you created from somewhere else.
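One way to avoid ending up with several metastores is to always launch the server from the same directory. The sketch below assumes the default embedded Derby metastore, which is created as a metastore_db directory under the current working directory. HIVE_RUN_DIR, its default value, and the function name start_hiveserver are illustrative assumptions, not Hive settings.

```shell
#!/bin/sh
# Sketch: always start the Hive Thrift server from one fixed directory so the
# embedded Derby metastore is created and reused in a single place.
# HIVE_RUN_DIR is a hypothetical variable chosen for this example.
HIVE_RUN_DIR="${HIVE_RUN_DIR:-$HOME/hive-run}"

start_hiveserver() {
  port="${1:-10000}"
  mkdir -p "$HIVE_RUN_DIR" || return 1
  # The subshell keeps the caller's working directory unchanged.
  (
    cd "$HIVE_RUN_DIR" || exit 1
    hive --service hiveserver "$port" &
  )
}

# Usage:
#   start_hiveserver 10001
```

If the metastore is configured to use an external database instead, the working directory should not matter and this wrapper is unnecessary.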