References:
http://www.dataguru.cn/forum.php?mod=viewthread&tid=39857
http://blog.sina.com.cn/s/blog_701a48e7010189rc.html
http://www.chenjunlu.com/2012/12/trying-with-oracle-loader-for-hadoop/
http://f.dataguru.cn/thread-39092-1-1.html
Environment:
OS: rhel-server-5.4-x86_64-dvd
JDK: jdk-6u31-linux-i586-rpm.bin
Hadoop: hadoop-0.20.2.tar.gz, hadoop-0.20.2-CDH3B4.tar
oraloader: oraloader-2.0.0
Oracle Database: Version 11.2.0.1.0
JDBC driver: ojdbc6.jar
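Unpack oraloader
[Figure: Oracle big data OLH connection]
The archive contains two files.
According to README.TXT, the CDH build of Hadoop is required; the version available here is hadoop-0.20.2-CDH3B4.tar.
Unpack oraloader:
oraloader-2.0.0-1
Set the environment variables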
[root@hadoop64 home]# vi /etc/profile
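The lines added to /etc/profile are not shown in this post. The following is only a sketch of what they might look like, built from the install paths used elsewhere in the post. OLH_HOME is the variable the run command relies on; HADOOP_HOME and the PATH entry are assumptions.

```shell
# Sketch of lines appended to /etc/profile (assumed; the original edit is not shown).
export HADOOP_HOME=/hadoop/hadoop-0.20.2     # assumption: Hadoop install directory
export OLH_HOME=/hadoop/oraloader-2.0.0-1    # directory the oraloader archive was unpacked into
export PATH="$PATH:$HADOOP_HOME/bin"         # assumption: puts the hadoop command on PATH
```

After saving, run `source /etc/profile` and check the result with `echo $OLH_HOME`.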
Install the Oracle JDBC driver
A copy of ojdbc6.jar is included in /hadoop/oraloader-2.0.0-1/jlib.
[cloud@hadoop64 jlib]$ cp ojdbc6.jar /hadoop/hadoop-0.20.2/lib
Start the Hadoop cluster
Create the target table
create table tab_lx(id number,name varchar2(50),address varchar2(100));
Put a sample file into Hadoop
[cloud@hadoop64 bin]$ cat data.dat
1,zzg,湖南
2,zjy,北京
3,ldh,美国
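The upload command itself is not shown in the post. As a sketch, assuming the HDFS input directory is the olh_exercise1_in path that the configuration file points at, the file could be recreated and uploaded like this. The guard around the hadoop calls is only there so that the snippet also runs on a machine without Hadoop.

```shell
# Recreate the three-line sample file shown above.
printf '1,zzg,湖南\n2,zjy,北京\n3,ldh,美国\n' > data.dat

# Upload it to HDFS. The relative path resolves under the current user's HDFS home directory.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -mkdir olh_exercise1_in
  hadoop fs -put data.dat olh_exercise1_in/
  hadoop fs -cat olh_exercise1_in/data.dat    # confirm the upload
else
  echo "hadoop not on PATH; skipping upload"
fi
```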
Copy the internationalization (orai18n) files
cp /hadoop/oraloader-2.0.0-1/jlib/orai18n* /hadoop/hadoop-0.20.2/lib
Without these files the job fails with an error.
Configure Oracle Loader for Hadoop
[cloud@hadoop64 bin]$ vi MyConf.xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>mapred.input.dir</name>
    <value>olh_exercise1_in/data.dat</value>
  </property>
  <property>
    <name>mapreduce.inputformat.class</name>
    <value>oracle.hadoop.loader.lib.input.DelimitedTextInputFormat</value>
  </property>
  <property>
    <name>mapreduce.outputformat.class</name>
    <value>oracle.hadoop.loader.lib.output.JDBCOutputFormat</value>
  </property>
  <property>
    <name>mapred.output.dir</name>
    <value>olh_exercise1_out</value>
  </property>
  <property>
    <name>oracle.hadoop.loader.connection.url</name>
    <value>jdbc:oracle:thin:@192.168.80.70:1521/hadoop64</value>
  </property>
  <property>
    <name>oracle.hadoop.loader.connection.user</name>
    <value>SCOTT</value>
  </property>
  <property>
    <name>oracle.hadoop.loader.connection.password</name>
    <value>oracle</value>
  </property>
  <property>
    <name>oracle.hadoop.loader.loaderMapFile</name>
    <value>file:////hadoop/hadoop-0.20.2/bin/loaderMap_exercise1.xml</value>
  </property>
</configuration>
[cloud@hadoop64 bin]$ vi loaderMap_exercise1.xml
<?xml version="1.0" encoding="UTF-8" ?>
<LOADER_MAP>
  <SCHEMA>SCOTT</SCHEMA>
  <TABLE>tab_lx</TABLE>
  <COLUMN field="F0">id</COLUMN>
  <COLUMN field="F1">name</COLUMN>
  <COLUMN field="F2">address</COLUMN>
</LOADER_MAP>
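The loader map assigns the fields F0, F1 and F2 to the three table columns, so every input line needs exactly three fields. As an optional sanity check that is not part of the original walkthrough, the sketch below counts the comma-separated fields per line. It assumes a comma is the field separator, which matches the sample data, and it works on a temporary copy of the sample rather than on the file in HDFS.

```shell
# Number of COLUMN entries in loaderMap_exercise1.xml (F0, F1, F2).
expected_fields=3

# Temporary copy of the sample data, so the check is self-contained.
sample=$(mktemp)
printf '1,zzg,湖南\n2,zjy,北京\n3,ldh,美国\n' > "$sample"

# Count the lines whose field count differs from the loader map.
bad_lines=$(awk -F',' -v n="$expected_fields" 'NF != n { c++ } END { print c + 0 }' "$sample")
echo "lines with unexpected field count: $bad_lines"

rm -f "$sample"
```

To check the real input, point awk at the local data.dat instead of the temporary file.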
Run
[cloud@hadoop64 bin]$ hadoop jar ${OLH_HOME}/jlib/oraloader.jar oracle.hadoop.loader.OraLoader -conf MyConf.xml
Oracle Loader for Hadoop Release 2.0.0 - Production
Copyright (c) 2011, 2012, Oracle and/or its affiliates. All rights reserved.
13/02/01 09:34:25 INFO loader.OraLoader: Oracle Loader for Hadoop Release 2.0.0 - Production
Copyright (c) 2011, 2012, Oracle and/or its affiliates. All rights reserved.
13/02/01 09:34:25 INFO loader.OraLoader: Built-Against: not available
13/02/01 09:34:28 INFO loader.OraLoader: oracle.hadoop.loader.loadByPartition is disabled because table: TAB_LX is not partitioned
13/02/01 09:34:28 INFO loader.OraLoader: oracle.hadoop.loader.enableSorting disabled, no sorting key provided
13/02/01 09:34:28 INFO output.DBOutputFormat: Setting reduce tasks speculative execution to false for : oracle.hadoop.loader.lib.output.JDBCOutputFormat
13/02/01 09:34:28 WARN loader.OraLoader: Sampler error: the number of reduce tasks must be greater than one; the configured value is 1 . Job will continue without sampled information.
13/02/01 09:34:28 INFO loader.OraLoader: Sampling time=0D:0h:0m:0s:14ms (14 ms)
13/02/01 09:34:28 INFO loader.OraLoader: Submitting OraLoader job OraLoader
13/02/01 09:34:29 INFO input.FileInputFormat: Total input paths to process : 1
13/02/01 09:34:30 INFO loader.OraLoader: map 0% reduce 0%
13/02/01 09:34:38 INFO loader.OraLoader: map 100% reduce 0%
13/02/01 09:34:50 INFO loader.OraLoader: map 100% reduce 100%
13/02/01 09:34:52 INFO loader.OraLoader: Job complete: OraLoader (null)
13/02/01 09:34:52 INFO loader.OraLoader: Counters: 17
FileSystemCounters
FILE_BYTES_READ=114
FILE_BYTES_WRITTEN=260
HDFS_BYTES_READ=39
HDFS_BYTES_WRITTEN=1858
Job Counters
Data-local map tasks=1
Launched map tasks=1
Launched reduce tasks=1
Map-Reduce Framework
Combine input records=0
Combine output records=0
Map input records=3
Map output bytes=102
Map output records=3
Reduce input groups=1
Reduce input records=3
Reduce output records=3
Reduce shuffle bytes=0
Spilled Records=6
Verify in Oracle
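The verification query is not shown in the post. One way to check the load, sketched here on the assumption that SQL*Plus is installed on the host, is to query the target table with the account and address from MyConf.xml. If the load succeeded, the query should return the three rows from data.dat, in line with the "Reduce output records=3" counter in the job output.

```shell
# Connection details taken from MyConf.xml, written in Easy Connect form.
conn='scott/oracle@//192.168.80.70:1521/hadoop64'
query='select id, name, address from tab_lx order by id;'

# Run the query through SQL*Plus if available; otherwise just print it.
if command -v sqlplus >/dev/null 2>&1; then
  printf '%s\n' "$query" | sqlplus -S "$conn"
else
  echo "sqlplus not found; run this from any Oracle client: $query"
fi
```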