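I recently looked into integrating Hive with HBase, using what were the latest releases of each at the time: Hive 0.13 and HBase 0.96.2. The integration itself is fairly simple. The main points to watch are as follows:
1. Add the following to Hive's configuration file hive-site.xml: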
<property>
<name>hive.aux.jars.path</name>
<value>file:///home/grid/hive/lib/hive-hbase-handler-0.13.0.jar,file:///home/grid/hive/lib/hbase-client-0.96.2-hadoop2.jar,file:///home/grid/hive/lib/hbase-common-0.96.2-hadoop2.jar,file:///home/grid/hive/lib/hbase-common-0.96.2-hadoop2-tests.jar,file:///home/grid/hive/lib/hbase-protocol-0.96.2-hadoop2.jar,file:///home/grid/hive/lib/hbase-server-0.96.2-hadoop2.jar,file:///home/grid/hive/lib/htrace-core-2.04.jar,file:///home/grid/hive/lib/zookeeper-3.4.6.jar,file:///home/grid/hive/lib/protobuf-java-2.5.0.jar,file:///home/grid/hive/lib/guava-11.0.2.jar</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>server1,server2</value>
</property>
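The hive.aux.jars.path value is just a comma-separated list of file:// URIs, which is easy to mistype by hand. As a minimal sketch, the value could be generated from a list of jar names like this. The names LIB_DIR and build_aux_jars_path are hypothetical and not part of Hive; the default directory is taken from the configuration above.

```shell
#!/bin/sh
# Directory holding the jars; defaults to the path used in hive-site.xml above.
LIB_DIR="${LIB_DIR:-/home/grid/hive/lib}"

# Hypothetical helper: join the given jar names into one comma-separated
# list of file:// URIs, suitable for the hive.aux.jars.path value.
build_aux_jars_path() {
    result=""
    for jar in "$@"; do
        # Prepend a comma only when result is already non-empty.
        result="${result:+$result,}file://$LIB_DIR/$jar"
    done
    printf '%s\n' "$result"
}

build_aux_jars_path \
    hive-hbase-handler-0.13.0.jar \
    hbase-client-0.96.2-hadoop2.jar \
    hbase-common-0.96.2-hadoop2.jar
```

Pass the remaining jar names the same way and paste the printed line into the value element.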
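2. I won't repeat the HBase installation steps here; see http://blog.csdn.net/codestinity/article/details/6947464
In the end, queries from both sides returned data, but inserting data into HBase from Hive failed with the following error: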
java.io.FileNotFoundException: File does not exist: hdfs://*.*.*.*:9000/home/grid/hbase/lib/hbase-hadoop-compat-0.96.2-hadoop2.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1110)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:264)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:300)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:387)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://*.*.*.*:9000/home/grid/hbase/lib/hbase-hadoop-compat-0.96.2-hadoop2.jar)'
Execution failed with exit status: 1
Obtaining error information
Task failed!
Task ID:
Stage-0
Logs:
/tmp/root/hive.log
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
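The error shows the job looking for the jar at an hdfs:// path, so the jars the job uses need to be uploaded to HDFS. Upload whichever jar is reported missing, using the put command.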
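A minimal sketch of that upload, assuming the jar should end up in HDFS at the same absolute path as on the local disk, which is what the path in the error message suggests. The helper names run and upload_jar and the DRY_RUN switch are hypothetical; by default the script only prints the hdfs commands so they can be checked first.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Hypothetical helper: copy a local jar to the same absolute path in HDFS.
upload_jar() {
    jar="$1"
    dir=$(dirname "$jar")
    run hdfs dfs -mkdir -p "$dir"      # create the parent directory in HDFS
    run hdfs dfs -put "$jar" "$dir/"   # upload the local jar into it
}

# The jar named in the FileNotFoundException above.
upload_jar /home/grid/hbase/lib/hbase-hadoop-compat-0.96.2-hadoop2.jar
```

Set DRY_RUN=0 to actually run the commands, and repeat upload_jar for each further jar that a later run reports as missing.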
With that, the Hive and HBase integration worked.
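For reference, here is a sketch of the kind of statements involved in writing to HBase from Hive. The table names hbase_demo, demo and source_table, the column family cf1, and the file path are all hypothetical, since the original post does not show its tables. The script only writes a HiveQL file and does not run it.

```shell
#!/bin/sh
# Where to write the HiveQL script (hypothetical location).
HQL_FILE="${HQL_FILE:-/tmp/hive_hbase_demo.hql}"

cat > "$HQL_FILE" <<'EOF'
-- Hive table backed by an HBase table through the HBase storage handler.
CREATE TABLE hbase_demo(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "demo");

-- Inserting submits a MapReduce job, which is the step that needs the jars.
INSERT OVERWRITE TABLE hbase_demo SELECT id, name FROM source_table;
EOF

echo "Wrote $HQL_FILE; run it with: hive -f $HQL_FILE"
```

A plain SELECT on such a table may not launch a job at all, which would be consistent with queries working while the insert failed.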