We are using Sqoop to export data from Hive to SQL Server. The new data is always appended to the existing data in SQL Server.
Is it possible to truncate the SQL Server table via Sqoop before starting the export?
Yes. Sqoop's eval tool runs an arbitrary SQL statement against the target database, so the table can be truncated right before the export:

sqoop eval \
    --connect 'jdbc:sqlserver://1.1.1.1;database=SomeDatabase;username=someUser;password=somePassword' \
    --query "TRUNCATE TABLE some_table"
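If the export is driven from a shell script rather than Oozie, the two steps can be chained so the export only runs when the truncate succeeds. A minimal sketch, assuming the same connection string; the table name and HDFS path are placeholders:

# Truncate first; '&&' runs the export only if the truncate exits 0.
sqoop eval \
    --connect 'jdbc:sqlserver://1.1.1.1;database=SomeDatabase;username=someUser;password=somePassword' \
    --query "TRUNCATE TABLE some_table" \
&& sqoop export \
    --connect 'jdbc:sqlserver://1.1.1.1;database=SomeDatabase;username=someUser;password=somePassword' \
    --table some_table \
    --export-dir /user/hive/warehouse/some_table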
In Hue, I created a Sqoop action like this:
<workflow-app name="inokmobile-delete-before-export" xmlns="uri:oozie:workflow:0.4">
    <start to="inokmobile-delete-before-export"/>
    <action name="inokmobile-delete-before-export">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <arg>eval</arg>
            <arg>--connect</arg>
            <arg>jdbc:mysql://192.168.10.200:3306/new_inoknok</arg>
            <arg>--username</arg>
            <arg>dba</arg>
            <arg>--password</arg>
            <arg>123456</arg>
            <arg>--query</arg>
            <arg>TRUNCATE TABLE ${tablename}</arg>
        </sqoop>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
The table is truncated correctly, but the job is then killed with the error:
IllegalArgumentException: JobId string : is not properly formed
This apparently happens because sqoop eval runs the statement purely over JDBC and launches no MapReduce job, so the Oozie sqoop action has no Hadoop job ID to parse. So I think the right way is to write a small Java program to execute this functionality:
package com.inoknok.mysql.tool;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class TruncateMysqlTable {

    private static final String MYSQL_DRIVER = "com.mysql.jdbc.Driver";
    private static final String SQL_PREFIX = "TRUNCATE TABLE ";

    public static void main(String[] args) {
        if (args.length < 4) {
            System.out.println("Please input parameters: url, user, password, tables to truncate");
            System.exit(-1);
        }

        String url = args[0];
        String user = args[1];
        String password = args[2];

        // Every argument after the first three is a table name.
        List<String> tables = new ArrayList<String>();
        for (int i = 3; i < args.length; i++) {
            tables.add(args[i]);
        }

        try {
            Class.forName(MYSQL_DRIVER);
            // try-with-resources closes the statement and connection even if
            // one of the TRUNCATE statements throws.
            try (Connection conn = DriverManager.getConnection(url, user, password);
                 Statement statement = conn.createStatement()) {
                for (String table : tables) {
                    statement.execute(SQL_PREFIX + table);
                }
            }
        } catch (Exception e) {
            System.out.println("Failed to truncate table!");
            e.printStackTrace();
            // Exit non-zero so a calling workflow sees the failure.
            System.exit(-1);
        }
    }
}
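A sketch of how the class might be invoked from the command line; the jar names are placeholders, and the MySQL connector must be on the classpath:

java -cp truncate-tool.jar:mysql-connector-java-5.1.38.jar \
    com.inoknok.mysql.tool.TruncateMysqlTable \
    "jdbc:mysql://192.168.10.200:3306/new_inoknok" dba 123456 some_table

To wire it back into the workflow, the sqoop action could be replaced with an Oozie java action, which does not try to parse child MapReduce job IDs and so should avoid the error above. A sketch, assuming the MySQL driver jar is placed in the workflow's lib/ directory:

<action name="inokmobile-delete-before-export">
    <java>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <main-class>com.inoknok.mysql.tool.TruncateMysqlTable</main-class>
        <arg>jdbc:mysql://192.168.10.200:3306/new_inoknok</arg>
        <arg>dba</arg>
        <arg>123456</arg>
        <arg>${tablename}</arg>
    </java>
    <ok to="end"/>
    <error to="kill"/>
</action>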
References
http://stackoverflow.com/questions/15808257/sqoop-truncate-sql-server-table-before-exporting-data-from-hadoop