Troubleshooting: an MR job fails under crontab but succeeds when the shell script is run manually

A scheduled crontab entry was set up, but the MR job never ran.

Step 1: run the shell script by hand. If that fails, check the usual suspects (e.g. source /etc/profile, absolute paths). That was not the issue here: the manual run succeeded.
Step 2: check the script's file format and add test output to confirm that crontab is actually invoking the job. Test script hymtest.sh:

#!/bin/bash
DATE=$(date +%Y%m%d:%H:%M:%S)
echo "$DATE every minute test" >> /bigdata/shell/hymoutput.txt
echo 'importing daily index rise/fall ranking data (stored to HBase: "jmdata:topIndex")' >> /bigdata/shell/hymoutput.txt
hadoop jar /bigdata/cdh/jmdata-jdata-mrs-index.jar org.jumao.jdata.mrs.index.TopIndexMR
echo "end topIndex MR" >> /bigdata/shell/hymoutput.txt
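When a script fails silently under cron, it also helps to record each step's exit status in the log. A minimal sketch of such a wrapper (the function name and the example log path are illustrative, not from the original script):

```shell
#!/bin/bash
# Sketch: run a job step and append its exit status to the log, so a
# failure under cron still leaves a visible trace.
run_logged() {
  local log=$1; shift
  "$@" >>"$log" 2>&1
  local rc=$?
  echo "exit status of '$*': $rc" >>"$log"
  return $rc
}
# e.g. run_logged /bigdata/shell/hymoutput.txt \
#        hadoop jar /bigdata/cdh/jmdata-jdata-mrs-index.jar org.jumao.jdata.mrs.index.TopIndexMR
```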

Step 3: cron scheduling itself is working (the test output appears), so inspect the crontab configuration and the job's output:
cat /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# For details see man 4 crontabs

# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed
15 2,15 * * * root sh /bigdata/shell/2-35indexmrs.sh
20 2,15 * * * root sh /bigdata/shell/2-40impexpmrs.sh
20 0,8 * * * root sh /bigdata/shell/0,8-20futuresmrs1.sh
25 0,8 * * * root sh /bigdata/shell/0,8-25futuresmrs2.sh
45 7,10,15 * * * root sh /bigdata/shell/1,7,10-45pricemrs.sh
40 */2 * * * root sh /bigdata/shell/2homemrs1.sh
42 */2 * * * root sh /bigdata/shell/2homemrs2.sh
48 */2 * * * root sh /bigdata/shell/2homemrs3.sh
10 */2 * * * root sh /bigdata/shell/2topmrs.sh
50 */1 * * * root sh /bigdata/shell/dailytaskmrs.sh
11 1,7,9,10,12,15,17,19 * * * root sh /bigdata/shell/englishhome.sh
5 8-16,18,20 * * * root sh /bigdata/shell/englishtocom.sh
16 1,7,12,15 * * * root sh /bigdata/shell/englishprice.sh
30 1,7,12,15 * * * root sh /bigdata/shell/englishcategory.sh
2 6 * * * root sh /bigdata/shell/englishtaskmrs.sh
50 2,15 * * * root sh /bigdata/shell/jmbitaskmr.sh
00 3 1 * *  /bin/sh  /bigdata/shell/logclear.sh
*/1 * * * * root  /etc/profile;/bin/sh /bigdata/shell/hymtest.sh >>/bigdata/shell/hymout.txt 2>&1



The output file shows the following exception:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil
        at org.jumao.jdata.mrs.index.TopIndexMR.main(TopIndexMR.java:99)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 7 more
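A ClassNotFoundException like this means the jar providing the class is not on the runtime classpath. Since zip entry names are stored uncompressed inside a jar, a plain grep over the jar files is enough to locate the provider. A sketch (the function name is illustrative; the CDH lib path is an assumption based on the environment described later):

```shell
#!/bin/bash
# Sketch: find which jar under a lib directory contains a given class.
# Jar entry names appear as literal bytes in the zip directory, so grep -l
# on the jar files themselves is sufficient.
find_class_jar() {
  local libdir=$1 class=$2
  grep -l "$class" "$libdir"/*.jar 2>/dev/null
}
# e.g. find_class_jar /opt/cloudera/parcels/CDH/lib/hbase/lib \
#        org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil
```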



Step 4: compare the environment variables of a manual shell session (env) with those of the crontab run, dumping both to a file. Add the following to hymtest.sh:

#!/bin/bash
source /etc/profile
env >> /bigdata/shell/hymout.txt
DATE=$(date +%Y%m%d:%H:%M:%S)
echo "$DATE every minute test" >> /bigdata/shell/hymoutput.txt
echo 'importing daily index rise/fall ranking data (stored to HBase: "jmdata:topIndex")' >> /bigdata/shell/hymoutput.txt
/opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop jar /bigdata/cdh/jmdata-jdata-mrs-index.jar org.jumao.jdata.mrs.index.TopIndexMR
echo "end topIndex MR" >> /bigdata/shell/hymoutput.txt

Manual env output:
HOSTNAME=nn1
TERM=xterm
SHELL=/bin/bash
HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
HISTSIZE=1000
SSH_CLIENT=172.18.203.112 49374 22
QTDIR=/usr/lib64/qt-3.3
QTINC=/usr/lib64/qt-3.3/include
SSH_TTY=/dev/pts/0
USER=root
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
HBASE_HOME=/opt/cloudera/parcels/CDH/lib/hbase
HADOOP_COMMON_LIB_NATIVE_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native
MAIL=/var/spool/mail/root
PATH=/usr/java/jdk1.8.0_131/bin:/opt/cloudera/parcels/CDH/lib/hadoop/bin:/opt/cloudera/parcels/CDH/lib/hbase/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
PWD=/bigdata/shell
JAVA_HOME=/usr/java/jdk1.8.0_131
HADOOP_CLASSPATH=:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive/lib/*:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive-hcatalog/share/hcatalog/*
EDITOR=vi
LANG=en_US.UTF-8
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
LOGNAME=root
QTLIB=/usr/lib64/qt-3.3/lib
CVS_RSH=ssh
CLASSPATH=.:/usr/java/jdk1.8.0_131/lib/dt.jar:/usr/java/jdk1.8.0_131/lib/tools.jar
SSH_CONNECTION=172.18.203.112 49374 172.18.203.111 22
LESSOPEN=||/usr/bin/lesspipe.sh %s
G_BROKEN_FILENAMES=1
HIVE_CONF_DIR=/etc/hive/conf
_=/bin/env
OLDPWD=/root


The comparison shows that the crontab environment is missing the following key variable, which is why the class could not be found:
HADOOP_CLASSPATH=:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive/lib/*:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive-hcatalog/share/hcatalog/*
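The comparison in step 4 can be automated: capture sorted env dumps from both contexts and list the lines present only in the manual shell. A sketch with illustrative file contents (in practice the dumps would come from the interactive shell and from the cron job respectively):

```shell
#!/bin/bash
# Sketch: diff a manual-shell env dump against a cron env dump to list
# variables the cron environment is missing. Sample values are illustrative.
manual=$(mktemp) cronenv=$(mktemp)
printf 'HADOOP_CLASSPATH=/opt/jars/*\nPATH=/usr/bin\n' > "$manual"
printf 'PATH=/usr/bin\n' > "$cronenv"
# comm -23 prints lines unique to the first (sorted) input
missing=$(comm -23 <(sort "$manual") <(sort "$cronenv"))
echo "$missing"
```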


Step 5: fix and re-run successfully. Export the environment variable before launching the MR job so its dependencies are on the classpath. (Alternatively, the dependency jars could be copied into the Hadoop lib directory on every node of the cluster, which is how the MR jobs were originally deployed.)
#!/bin/bash
source /etc/profile
export HADOOP_CLASSPATH=:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive/lib/*:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hive-hcatalog/share/hcatalog/*

env >> /bigdata/shell/hymout.txt
DATE=$(date +%Y%m%d:%H:%M:%S)
echo "$DATE every minute test" >> /bigdata/shell/hymoutput.txt
echo 'importing daily index rise/fall ranking data (stored to HBase: "jmdata:topIndex")' >> /bigdata/shell/hymoutput.txt
/opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop jar /bigdata/cdh/jmdata-jdata-mrs-index.jar org.jumao.jdata.mrs.index.TopIndexMR
echo "end topIndex MR" >> /bigdata/shell/hymoutput.txt
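As a defensive addition, the cron script could fail fast with a logged message when a variable it depends on is absent, instead of surfacing the problem as a deep classpath error. A minimal sketch (the helper function and message are illustrative, not part of the original script):

```shell
#!/bin/bash
# Sketch: abort early if a required environment variable is unset or empty.
# Uses bash indirect expansion ${!name} to look the variable up by name.
require_var() {
  local name=$1
  if [ -z "${!name:-}" ]; then
    echo "$name is not set; aborting" >&2
    return 1
  fi
}
# e.g. require_var HADOOP_CLASSPATH || exit 1
```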
