I created an HCatalog table using the Hue metastore manager and submitted a Sqoop job with HCatalog through Hue. The command is shown below:
import --connect jdbc:mysql://192.168.122.1:3306/sample --username zhj
--password 123456 --table sample_user --split-by user_id -m 2
--hcatalog-table sample_raw.sample_user
Errors:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain],
main() threw exception, org/apache/hcatalog/mapreduce/HCatOutputFormat
java.lang.NoClassDefFoundError: org/apache/hcatalog/mapreduce/HCatOutputFormat
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain],
main() threw exception, org/apache/hadoop/hive/conf/HiveConf$ConfVars
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf$ConfVars
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain],
main() threw exception, java.lang.NoClassDefFoundError: javax/jdo/JDOException
com.google.common.util.concurrent.ExecutionError:
java.lang.NoClassDefFoundError: javax/jdo/JDOException
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2232)
....
Caused by: java.lang.ClassNotFoundException: javax.jdo.JDOException
Similar issues:
https://issues.apache.org/jira/browse/HCATALOG-380
https://issues.apache.org/jira/browse/PIG-2666
java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1168)
A:
a. Copy all jars in hive-0.12.0-bin/hcatalog/share/hcatalog to oozie-4.0.1/share/lib/sqoop.
This solves the first error, but the second error appears once the first one disappears.
I tried several ways to fix the second one:
1) Change hadoop-env.sh, adding HADOOP_CLASSPATH entries pointing to the HCatalog jars copied to the cluster nodes from the local dir oozie-4.0.1/share/lib/hcatalog. This fails.
2) Copy all jars in share/lib/hcatalog to share/lib/sqoop and upload the updated sharelib to HDFS. This solves the second error but leads to the third one.
3) Copy all jars in share/lib/hive/ to share/lib/sqoop/ and upload the sharelib again.
// At this point the first three errors are solved, but a fourth one appears.
4) The fourth error is a version problem:
"datanucleus-api-jdo-3.0.0-release.jar" does NOT contain
org.datanucleus.jdo.JDOPersistenceManagerFactory;
it contains "org.datanucleus.api.jdo.JDOPersistenceManagerFactory".
I find that in oozie-4.0.1
./share/lib/sqoop/datanucleus-rdbms-2.0.3.jar
./share/lib/sqoop/datanucleus-connectionpool-2.0.3.jar
./share/lib/sqoop/datanucleus-core-2.0.3.jar
./share/lib/sqoop/datanucleus-enhancer-2.0.3.jar
./share/lib/hive/datanucleus-rdbms-2.0.3.jar
./share/lib/hive/datanucleus-connectionpool-2.0.3.jar
./share/lib/hive/datanucleus-core-2.0.3.jar
./share/lib/hive/datanucleus-enhancer-2.0.3.jar
and in hive-0.12.0-bin
./lib/datanucleus-core-3.2.2.jar
./lib/datanucleus-rdbms-3.2.1.jar
./lib/datanucleus-api-jdo-3.2.1.jar
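To see which package name a given DataNucleus jar actually ships (the crux of error four), list its entries and grep. A minimal runnable sketch, using a stand-in jar built on the fly; against a real jar you would just run the `unzip -l ... | grep ...` line shown in the comment:

```shell
# Build a stand-in jar containing the class path the 3.x jars really ship.
# (A jar is just a zip archive; python3 -m zipfile works even where
#  `jar` or `zip` is not installed.)
mkdir -p demo/org/datanucleus/api/jdo
touch demo/org/datanucleus/api/jdo/JDOPersistenceManagerFactory.class
python3 -m zipfile -c datanucleus-demo.jar demo/
# List the jar's entries and look for the factory class; on a real jar:
#   unzip -l datanucleus-api-jdo-3.2.1.jar | grep JDOPersistenceManagerFactory
python3 -m zipfile -l datanucleus-demo.jar | grep JDOPersistenceManagerFactory
```

If the grep prints `org/datanucleus/api/jdo/...`, the jar is the new-style 3.x API; an entry under `org/datanucleus/jdo/...` would indicate the old 2.x layout that the error message expects.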
cp datanucleus-core-3.2.2.jar datanucleus-rdbms-3.2.1.jar datanucleus-api-jdo-3.2.1.jar to share/lib/sqoop
and upgrade the sharelib.
Note: when Oozie was compiled, the Hive version was not 0.12.0, which leads to these errors.
http://stackoverflow.com/questions/11494267/class-org-datanucleus-jdo-jdopersistencemanagerfactory-was-not-found
******************************************************************************
0. retain the original jars in share/lib/sqoop
1. copy all jars in hive-0.12.0-bin/hcatalog/share/hcatalog to oozie-4.0.1/share/lib/sqoop
2. copy all jars in hive-0.12.0-bin/lib/ to oozie-4.0.1/share/lib/sqoop
3. copy sqoop-1.4.4.bin__hadoop-2.0.4-alpha/sqoop-1.4.4.jar to oozie-4.0.1/share/lib/sqoop
4. update sharelib
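The four steps above can be sketched as a script. The paths match the versions named in this post but are assumptions about your layout, and the HDFS upload is left as a comment because the target path depends on your Oozie setup. To keep the sketch runnable anywhere, it builds a stand-in directory tree with empty jar files:

```shell
# Simulated layout; on a real cluster, point these at the actual installs.
ROOT=$(mktemp -d)
HIVE_HOME="$ROOT/hive-0.12.0-bin"
SQOOP_HOME="$ROOT/sqoop-1.4.4.bin__hadoop-2.0.4-alpha"
SHARELIB="$ROOT/oozie-4.0.1/share/lib"
mkdir -p "$HIVE_HOME/hcatalog/share/hcatalog" "$HIVE_HOME/lib" \
         "$SQOOP_HOME" "$SHARELIB/sqoop"
# Stand-ins for the real jars:
touch "$SHARELIB/sqoop/sqoop-originals.jar"             # step 0: keep originals
touch "$HIVE_HOME/hcatalog/share/hcatalog/hcatalog-core-0.12.0.jar"
touch "$HIVE_HOME/lib/datanucleus-core-3.2.2.jar"
touch "$SQOOP_HOME/sqoop-1.4.4.jar"

cp "$HIVE_HOME"/hcatalog/share/hcatalog/*.jar "$SHARELIB/sqoop/"  # step 1
cp "$HIVE_HOME"/lib/*.jar "$SHARELIB/sqoop/"                      # step 2
cp "$SQOOP_HOME/sqoop-1.4.4.jar" "$SHARELIB/sqoop/"               # step 3
ls "$SHARELIB/sqoop"
# step 4: push the updated sharelib to HDFS, e.g. (user/path are assumptions):
#   hdfs dfs -put -f oozie-4.0.1/share /user/oozie/
```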
All the errors listed above disappear, but a new one comes:
java.io.IOException: NoSuchObjectException(message:inok_datamine.inok_user table not found)
Good news: I set this property in hive-site.xml:
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://192.168.122.1:9083</value>
</property>
I uploaded it to HDFS as hive/hive-site.xml and also added it to the Sqoop job in Hue.
Start the metastore with:
hive --service metastore //default port is 9083
hive --service metastore -p <port_num>
But one last error may still come:
32064 [main] INFO org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities - HCatalog full table schema fields = [user_id, user_name, first_letter, live_city]
33238 [main] INFO org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities - HCatalog table partitioning key fields = []
33241 [main] ERROR org.apache.sqoop.Sqoop - Got exception running Sqoop: java.lang.NullPointerException
Intercepting System.exit(1)
Details:
java.lang.NullPointerException
at org.apache.hcatalog.data.schema.HCatSchema.get(HCatSchema.java:99)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:344)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:658)
at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:232)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:600)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:413)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:203)
at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:172)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:45)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Intercepting System.exit(1)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
import --connect jdbc:mysql://192.168.122.1:3306/sample --username zhj
--password 123456 --table sample_user --split-by user_id -m 2
--hive-database sample_raw --hive-table sample_user --hive-import
**********************************************************************************
The property shown below (it belongs to Oozie's HCatAccessorService configuration) confuses me:
<!-- HCatAccessorService -->
<property>
<name>oozie.service.HCatAccessorService.jmsconnections</name>
<value>
default=java.naming.factory.initial#org.apache.activemq.jndi.ActiveMQInitialContextFactory;java.naming.provider.url#tcp://localhost:61616;connectionFactoryNames#ConnectionFactory
</value>
<description>
Specify the map of endpoints to JMS configuration properties. In general, endpoint
identifies the HCatalog server URL. "default" is used if no endpoint is mentioned
in the query. If some JMS property is not defined, the system will use the property
defined jndi.properties. jndi.properties files is retrieved from the application classpath.
Mapping rules can also be provided for mapping Hcatalog servers to corresponding JMS providers.
hcat://${1}.${2}.server.com:8020=java.naming.factory.initial#Dummy.Factory;java.naming.provider.url#tcp://broker.${2}:61616
</description>
</property>
see: http://stackoverflow.com/questions/11494267/class-org-datanucleus-jdo-jdopersistencemanagerfactory-was-not-found
A similar problem appears with Pig using HCatalog. See: http://ylzhj02.iteye.com/admin/blogs/2043781
NOTE:With the support for HCatalog added to Sqoop, any HCatalog job depends on a set of jar files being available both on the Sqoop client host and where the Map/Reduce tasks run. To run HCatalog jobs, the environment variable HADOOP_CLASSPATH
must be set up as shown below before launching the Sqoop HCatalog jobs.
HADOOP_CLASSPATH=$(hcat -classpath)
export HADOOP_CLASSPATH
The necessary HCatalog dependencies will be copied to the distributed cache automatically by the Sqoop job.
I added the above two lines to ~/.bashrc and hive-0.12.0-bin/conf/hive-env.sh, but it does not work.
-----------------
NoSuchObjectException(message:default.'inok_datamine.inok_user' table not found)
My Sqoop script command looks like:
--hcatalog-table 'inok_datamine.inok_user'
The script above is missing --hcatalog-database, so the whole quoted string is looked up as a table name in the default database (note the default. prefix in the exception message). The correct script is:
--hcatalog-database inok_datamine --hcatalog-table inok_user
References
Official documents:
http://gethue.com/hadoop-tutorial-how-to-access-hive-in-pig-with/
https://cwiki.apache.org/confluence/display/Hive/HCatalog
http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
http://www.micmiu.com/bigdata/sqoop/sqoop-setup-and-demo/