Bug hunting: resource deadlock when running Oozie + Pig jobs through Hue

qindongliang1922

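Over the past two days I set out to install Hue on our existing Apache Hadoop 2.7.1 cluster, so that business users could run data analysis jobs through Hue's visual interface. I ran into quite a few problems along the way, but most of them were eventually solved one by one, and I picked up some experience in the process. Tinkering like this is also a way of learning and of tempering yourself. I believe the best people got there by working through countless setbacks and by refusing to give up in the face of challenges; if you quit whenever things get hard, you will never become one of them. Enough preamble, on to the main topic: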

Framework versions:
CentOS 6.5
Apache Hadoop 2.7.1
Apache HBase 0.98.12
Apache Hive 1.2.1
Apache Pig 0.15.0
Apache Oozie 4.2.0
Apache Spark 1.6.0
Cloudera Hue 3.8.1

(1) Installing Hue

1. Download it from the official site: http://gethue.com/category/release/
2. Unpack it into a directory, for example hue
3. Install the dependencies: yum install -y asciidoc cyrus-sasl-devel cyrus-sasl-gssapi gcc gcc-c++ krb5-devel libtidy libxml2-devel libxslt-devel make mysql mysql-devel openldap-devel python-devel sqlite-devel openssl-devel gmp-devel
4. Enter the hue directory and run make apps to build it
5. Configure desktop/conf/hue.ini
6. Start it with build/env/bin/supervisor. To kill it: ps -ef | grep hue- | gawk '{print $2}' | xargs kill -9
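
The kill one-liner in step 6 also matches the `grep` process itself, and `kill -9` gives Hue no chance to shut down cleanly. A minimal sketch of a gentler variant follows. The helper name `hue_pids` is hypothetical, and it assumes `ps -ef` output in which the PID is the second column and the Hue install path contains `hue-`, as the original command does.

```shell
#!/bin/sh
# hue_pids: read `ps -ef` style output on stdin and print the PIDs of
# processes whose command line mentions "hue-" (hypothetical helper).
# Lines that belong to grep itself are skipped so the pipeline never
# targets its own processes.
hue_pids() {
    grep 'hue-' | grep -v 'grep' | awk '{print $2}'
}

# Usage (commented out so the snippet is safe to paste):
#   ps -ef | hue_pids | xargs -r kill        # polite SIGTERM first
#   sleep 5
#   ps -ef | hue_pids | xargs -r kill -9     # force only what is still alive
```

`xargs -r` is a GNU extension that skips the command when there is no input; it is available on CentOS.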

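If you only use Hue to work with Hive, there is no need to install Oozie and the whole process is simple: go to the Hive directory and start the metastore and hiveserver2 services:
bin/hive --service metastore
bin/hiveserver2
A Hive query run through Hue looks like this (screenshot not preserved):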

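It looks nice and is very convenient for debugging SQL. If you want to use the workflow or Pig features, however, you need to install Oozie.
Building Oozie has a few pitfalls worth noting. The latest Oozie release at the time of writing is 4.2.0, but in my experience its dependencies are limited as follows:
Hive only builds with 0.13.1
HBase builds with versions up to 0.94.2
Spark, Hadoop and Pig all work with their current latest versions
In addition, the Codehaus repository referenced in the pom file is no longer available. Unless you remove it, the build fails: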

<repository>
    <id>Codehaus repository</id>
    <url>http://repository.codehaus.org/</url>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
</repository>
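
Removing that block by hand works fine. If you would rather script it, here is a rough sketch, assuming each `<repository>` and `</repository>` tag sits on its own line as in the snippet above. The function name `strip_codehaus` is made up for illustration.

```shell
#!/bin/sh
# strip_codehaus: print a pom.xml with every <repository> block that points
# at repository.codehaus.org removed (hypothetical helper name).
strip_codehaus() {
    awk '
        /<repository>/ { buf = ""; inrepo = 1 }      # start buffering a block
        inrepo {
            buf = buf $0 "\n"
            if ($0 ~ /<\/repository>/) {             # block is complete
                if (buf !~ /repository\.codehaus\.org/) printf "%s", buf
                inrepo = 0
            }
            next
        }
        { print }                                    # everything else passes through
    ' "$1"
}

# Usage (keeps a backup of the original file):
#   cp pom.xml pom.xml.orig
#   strip_codehaus pom.xml.orig > pom.xml
```

Buffering the whole block before deciding means other repositories in the pom are left untouched.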

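(2) Installing Oozie
1. Download Oozie
wget http://archive.apache.org/dist/oozie/4.2.0/oozie-4.2.0.tar.gz

2. Unpack it, then edit the pom file in its root directory to set the versions of Pig (the classifier is h2, meaning Hadoop 2.x), Hadoop, HBase, Hive, Spark and so on. Note that the build may fail with the latest HBase and Hive; in my tests HBase 0.94.2 and Hive 0.13.1 compiled successfully.

3. After the changes, run the build:
bin/mkdistro.sh -P hadoop-2 -DskipTests
or
mvn clean package assembly:single -P hadoop-2 -DskipTests
4. Once the build succeeds, copy oozie-4.2.0/distro/target/oozie-4.2.0-distro.tar.gz to the installation directory.
For details, see this earlier article of mine:
http://qindongliang.iteye.com/blog/2212503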

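(3) Testing a Pig script in Hue

Write a simple Pig script (screenshot not preserved).

Click Run. Oozie starts two jobs: a launcher and the Pig script itself. The launcher job hangs at 95% progress and never changes, while the actual Pig job stays in its initialization phase and never executes. The logs show no errors; they just keep printing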
Heart beat
Heart beat
Heart beat
......

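After some research I found the cause. On a small cluster without enough resources, the ResourceManager cannot allocate resources for several MR jobs at once, so they wait indefinitely. The launcher holds its containers while it waits for the Pig job to finish, and the Pig job cannot get the containers it needs to start, so the whole workflow hangs in a deadlock. In short, multiple MR jobs compete for resources until none of them can run. How can this be fixed?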
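
To make the deadlock concrete, here is a simplified back-of-the-envelope check. It is only a sketch: it looks at memory alone, ignores vcores and scheduler rounding, and the numbers are illustrative rather than taken from my cluster. The function name `child_can_start` is hypothetical.

```shell
#!/bin/sh
# child_can_start TOTAL LAUNCHER_AM LAUNCHER_MAP CHILD_AM CHILD_TASK
# All values are memory in MB. Prints "ok" if, while the launcher job holds
# its ApplicationMaster and its single map container, the cluster still has
# room for the child job's ApplicationMaster plus at least one task;
# prints "stuck" otherwise.
child_can_start() {
    total=$1
    held=$(( $2 + $3 ))          # memory the launcher keeps until the child ends
    needed=$(( $4 + $5 ))        # minimum the child job needs to make progress
    if [ $(( total - held )) -ge "$needed" ]; then
        echo ok
    else
        echo stuck
    fi
}

# Illustrative numbers only: 1536 MB per ApplicationMaster, 1024 MB per task.
child_can_start 4096 1536 1024 1536 1024    # prints "stuck"
child_can_start 8192 1536 1024 1536 1024    # prints "ok"
```

In the first case the remaining 1536 MB is just enough for the child job's ApplicationMaster but leaves nothing for a task, which matches the symptom above: the Pig job is accepted, then sits in initialization while the launcher keeps printing Heart beat.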

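Option 1:
Switch the Hadoop cluster from its default Capacity Scheduler to the Fair Scheduler, and allow the queue to run at most one MR job at a time; any additional job blocks and waits.
Option 2:
Switch the Hadoop cluster from its default Capacity Scheduler to the Fair Scheduler, create several queues, and submit jobs to different queues so that they do not compete for resources.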

<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>

<property>
  <name>yarn.scheduler.fair.allocation.file</name>
  <value>file:/%HADOOP_HOME%/etc/hadoop/fair-scheduler.xml</value>
</property>

fair-scheduler.xml configuration:

<?xml version="1.0"?>
<allocations>
  <queue name="test">
    <minResources>1000 mb, 1 vcores</minResources>
    <maxResources>5000 mb, 1 vcores</maxResources>
    <maxRunningApps>1</maxRunningApps>
    <aclSubmitApps>webmaster</aclSubmitApps>
    <weight>2.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
    <queue name="test-sub">
        <aclSubmitApps>webmaster</aclSubmitApps>
        <minResources>500 mb, 1 vcores</minResources>
    </queue>
  </queue>
  <user name="root">
    <maxRunningApps>1</maxRunningApps>
  </user>
  <user name="webmaster">
    <maxRunningApps>1</maxRunningApps>
  </user>
  <!--
  <user name="gpadmin">
    <maxRunningApps>5</maxRunningApps>
  </user>-->
  <userMaxAppsDefault>1</userMaxAppsDefault>
  <fairSharePreemptionTimeout>30</fairSharePreemptionTimeout>
</allocations>

For more on Hadoop resource scheduling, see the link below:
https://support.pivotal.io/hc/en-us/articles/201999117-How-to-Configure-YARN-Capacity-Scheduler-on-a-PHD-Cluster

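After making the changes, distribute the files to every Hadoop node and also copy them into oozie/conf/hadoop-conf/. Restart the Hadoop cluster and the Oozie service, then run the script again; this time it runs without problems.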
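
The distribution step can be scripted roughly as follows. This is a sketch under assumptions: `HADOOP_HOME`, `OOZIE_HOME` and the host names are placeholders rather than values from my setup, the function name `sync_conf` is hypothetical, and it assumes Hadoop lives at the same path on every node. Passing `echo` as the first argument gives a dry run that only prints the commands.

```shell
#!/bin/sh
# sync_conf: print (dry run) or execute the commands that copy the two
# scheduler files to each Hadoop node and into Oozie's hadoop-conf directory.
# HADOOP_HOME, OOZIE_HOME and the host names are placeholders.
sync_conf() {
    run=$1; shift                      # "echo" for a dry run, "" to execute
    conf="$HADOOP_HOME/etc/hadoop"
    for f in yarn-site.xml fair-scheduler.xml; do
        for host in "$@"; do
            $run scp "$conf/$f" "$host:$conf/$f"
        done
        $run cp "$conf/$f" "$OOZIE_HOME/conf/hadoop-conf/$f"
    done
}

# Usage (host names are examples):
#   sync_conf echo h2 h3     # show what would be copied
#   sync_conf ""   h2 h3     # actually copy
# Then restart YARN and Oozie, for example:
#   $HADOOP_HOME/sbin/stop-yarn.sh && $HADOOP_HOME/sbin/start-yarn.sh
#   $OOZIE_HOME/bin/oozied.sh stop && $OOZIE_HOME/bin/oozied.sh start
```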

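If you also want to set up Solr or HBase, just configure them in hue.ini. Note that for HBase you must start the HBase Thrift server first:
bin/hbase-daemon.sh start thrift
Then set this in hue.ini: hbase_clusters=(Cluster|h1:9090). It must be in exactly this format, otherwise Hue will not recognize it.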
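
Since the format is easy to get wrong, a quick sanity check before restarting Hue can help. The sketch below only tests whether a single entry has the `(Name|host:port)` shape quoted above; the helper name `valid_hbase_cluster` is hypothetical and this is not how Hue itself parses the value.

```shell
#!/bin/sh
# valid_hbase_cluster: succeed if the argument looks like (Name|host:port).
# Hypothetical helper; it checks a single cluster entry only.
valid_hbase_cluster() {
    printf '%s\n' "$1" | grep -Eq '^\([^|()]+\|[^:|()]+:[0-9]+\)$'
}

# Usage:
#   valid_hbase_cluster '(Cluster|h1:9090)' && echo "format looks right"
```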