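Troubleshooting: an MR job fails when scheduled by crontab but succeeds when the shell script is run manually
A scheduled task was configured, but the MR job did not run.
Step 1: Run the shell script manually. If it fails, check the usual settings, such as source /etc/profile and absolute paths. That is not the focus here: the manual run succeeded.
Step 2: Check the shell script's file format and add test output to confirm that crontab scheduling itself works. The test script is hymtest.sh: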
#!/bin/bash
DATE=$(date +%Y%m%d:%H:%M:%S)
echo $DATE + "every minute test" >> /bigdata/shell/hymoutput.txt
echo "Importing daily index gain/loss ranking data {stored in HBase: jmdata:topIndex}" >> /bigdata/shell/hymoutput.txt
hadoop jar /bigdata/cdh/jmdata-jdata-mrs-index.jar org.jumao.jdata.mrs.index.TopIndexMR
echo "end topIndexMR" >> /bigdata/shell/hymoutput.txt
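One file-format problem worth ruling out in this step is Windows (CRLF) line endings, which can stop a script from running under cron. The sketch below is only an illustration: has_crlf is a hypothetical helper name that is not part of the original scripts, and the path in the usage comment is the one used in this article.

```shell
#!/bin/bash
# has_crlf (hypothetical helper): succeed if the given file contains at
# least one carriage return, i.e. it probably has CRLF line endings.
has_crlf() {
    # Command substitution strips trailing newlines only, so the CR survives.
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Example usage:
# if has_crlf /bigdata/shell/hymtest.sh; then
#     echo "CRLF line endings found, convert the file before scheduling it"
# fi
```

If the check succeeds, converting the file to Unix line endings (for example with dos2unix, if it is installed) is the usual remedy.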
Step 3:
Scheduling itself works, so check the output of the scheduled task.
cat /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
# For details see man 4 crontabs
# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed
15 2,15 * * * root sh /bigdata/shell/2-35indexmrs.sh
20 2,15 * * * root sh /bigdata/shell/2-40impexpmrs.sh
20 0,8 * * * root sh /bigdata/shell/0,8-20futuresmrs1.sh
25 0,8 * * * root sh /bigdata/shell/0,8-25futuresmrs2.sh
45 7,10,15 * * * root sh /bigdata/shell/1,7,10-45pricemrs.sh
40 */2 * * * root sh /bigdata/shell/2homemrs1.sh
42 */2 * * * root sh /bigdata/shell/2homemrs2.sh
48 */2 * * * root sh /bigdata/shell/2homemrs3.sh
10 */2 * * * root sh /bigdata/shell/2topmrs.sh
50 */1 * * * root sh /bigdata/shell/dailytaskmrs.sh
11 1,7,9,10,12,15,17,19 * * * root sh /bigdata/shell/englishhome.sh
5 8-16,18,20 * * * root sh /bigdata/shell/englishtocom.sh
16 1,7,12,15 * * * root sh /bigdata/shell/englishprice.sh
30 1,7,12,15 * * * root sh /bigdata/shell/englishcategory.sh
2 6 * * * root sh /bigdata/shell/englishtaskmrs.sh
50 2,15 * * * root sh /bigdata/shell/jmbitaskmr.sh
0031**/bin/sh/bigdata/shell/logclear.sh
*/1 * * * * root /etc/profile;/bin/sh /bigdata/shell/hymtest.sh >> /bigdata/shell/hymout.txt 2>&1
The output shows the following exception:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil
    at org.jumao.jdata.mrs.index.TopIndexMR.main(TopIndexMR.java:99)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
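Because the exception appears only under cron, it can help to reproduce the failure without waiting for the next scheduled run. The sketch below approximates cron's sparse environment with env -i. The helper name run_like_cron is hypothetical, and the variables passed in are copied from the /etc/crontab header shown above; a real cron daemon may set a few more (such as LOGNAME), so treat this as an approximation only.

```shell
#!/bin/bash
# run_like_cron (hypothetical helper): run a command with an almost empty
# environment, similar to what a job started from /etc/crontab receives.
run_like_cron() {
    # env -i starts from an empty environment; only the listed variables are set.
    env -i SHELL=/bin/bash PATH=/sbin:/bin:/usr/sbin:/usr/bin HOME=/ "$@"
}

# Example usage:
# run_like_cron /bin/sh /bigdata/shell/hymtest.sh
```

Variables exported in the interactive login shell, such as HADOOP_CLASSPATH, are not visible to a command started this way.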
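Step 4:
Compare the environment variables (env) of the manual shell with those present when crontab runs the script, by writing them to a file.
Add the following to hymtest.sh: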
#!/bin/bash
source /etc/profile
env >> /bigdata/shell/hymout.txt
DATE=$(date +%Y%m%d:%H:%M:%S)
echo $DATE + "every minute test" >> /bigdata/shell/hymoutput.txt
echo "Importing daily index gain/loss ranking data {stored in HBase: jmdata:topIndex}" >> /bigdata/shell/hymoutput.txt
/opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop jar /bigdata/cdh/jmdata-jdata-mrs-index.jar org.jumao.jdata.mrs.index.TopIndexMR
echo "end topIndexMR" >> /bigdata/shell/hymoutput.txt
Output of env in the manual shell:
HOSTNAME=nn1
TERM=xterm
SHELL=/bin/bash
HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
HISTSIZE=1000
SSH_CLIENT=172.18.203.112 49374 22
QTDIR=/usr/lib64/qt-3.3
QTINC=/usr/lib64/qt-3.3/include
SSH_TTY=/dev/pts/0
USER=root
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
HBASE_HOME=/opt/cloudera/parcels/CDH/lib/hbase
HADOOP_COMMON_LIB_NATIVE_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native
MAIL=/var/spool/mail/root
PATH=/usr/java/jdk1.8.0_131/bin:/opt/cloudera/parcels/CDH/lib/hadoop/bin:/opt/cloudera/parcels/CDH/lib/hbase/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
PWD=/bigdata/shell
JAVA_HOME=/usr/java/jdk1.8.0_131
HADOOP_CLASSPATH=:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive/lib/*:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive-hcatalog/share/hcatalog/*
EDITOR=vi
LANG=en_US.UTF-8
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
LOGNAME=root
QTLIB=/usr/lib64/qt-3.3/lib
CVS_RSH=ssh
CLASSPATH=.:/usr/java/jdk1.8.0_131/lib/dt.jar:/usr/java/jdk1.8.0_131/lib/tools.jar
SSH_CONNECTION=172.18.203.112 49374 172.18.203.111 22
LESSOPEN=||/usr/bin/lesspipe.sh %s
G_BROKEN_FILENAMES=1
HIVE_CONF_DIR=/etc/hive/conf
_=/bin/env
OLDPWD=/root
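The comparison shows the key difference: the following environment variable is not set when the script runs under crontab, so the required classes cannot be found.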
HADOOP_CLASSPATH=:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive/lib/*:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive-hcatalog/share/hcatalog/*
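Comparing two env dumps by eye is error-prone, so the comparison can be scripted. The sketch below is a minimal illustration, assuming each dump has been saved to its own file with one NAME=value pair per line; vars_missing_in is a hypothetical helper name, and the file names in the usage comment are placeholders.

```shell
#!/bin/bash
# vars_missing_in (hypothetical helper): print the names of variables that
# appear in the first env dump but not in the second.
vars_missing_in() {
    first_names=$(mktemp)
    second_names=$(mktemp)
    # Keep only the NAME part of each NAME=value line, sorted and de-duplicated.
    cut -d= -f1 "$1" | LC_ALL=C sort -u > "$first_names"
    cut -d= -f1 "$2" | LC_ALL=C sort -u > "$second_names"
    # comm -23 prints the lines that occur only in the first sorted file.
    LC_ALL=C comm -23 "$first_names" "$second_names"
    rm -f "$first_names" "$second_names"
}

# Example usage (placeholder file names):
# vars_missing_in manual_env.txt cron_env.txt
```

Variables that exist in both dumps but hold different values (PATH, for example) are not reported by this sketch; running diff on the two sorted files would show those as well.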
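Step 5: Apply the fix, after which the job runs successfully. Export the environment variable before running the MR job so that the packages it depends on are specified. Alternatively, the dependency jars can be copied directly into the Hadoop lib directory on every node of the cluster, which is how the MR jobs were originally deployed.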
#!/bin/bash
source /etc/profile
export HADOOP_CLASSPATH=:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive/lib/*:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive-hcatalog/share/hcatalog/*
env >> /bigdata/shell/hymout.txt
DATE=$(date +%Y%m%d:%H:%M:%S)
echo $DATE + "every minute test" >> /bigdata/shell/hymoutput.txt
echo "Importing daily index gain/loss ranking data {stored in HBase: jmdata:topIndex}" >> /bigdata/shell/hymoutput.txt
/opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop jar /bigdata/cdh/jmdata-jdata-mrs-index.jar org.jumao.jdata.mrs.index.TopIndexMR
echo "end topIndexMR" >> /bigdata/shell/hymoutput.txt
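To make this kind of failure easier to spot in the future, a script could check the variables it depends on before submitting the job, instead of failing later with NoClassDefFoundError. The sketch below is an assumption about how such a guard might look, not part of the original fix; require_env is a hypothetical helper name.

```shell
#!/bin/bash
# require_env (hypothetical helper): for every variable name given, print a
# message to stderr if it is unset or empty. Returns non-zero if any is missing.
require_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage before the hadoop jar line:
# require_env JAVA_HOME HADOOP_CLASSPATH || exit 1
```

With the cron line shown earlier, the message would land in hymout.txt, because stderr is redirected there by 2>&1.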