MySQL Applier for Hadoop
The Hadoop Applier replicates data by connecting to the MySQL master, reading binary log events as soon as they are committed, and writing them into a file in HDFS. "Events" describe database changes such as table creation or changes to table data.
The Hadoop Applier uses the API provided by libhdfs, a C library for manipulating files in HDFS. The library comes precompiled with Hadoop distributions.
It connects to the MySQL master to read the binary log and then:
- Fetches the row insert events occurring on the master
- Decodes these events, extracts the data inserted into each field of the row, and uses content handlers to convert it to the required format
- Appends it to a text file in HDFS
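The three steps above can be sketched roughly as follows. This is only an illustration in Python, not the Applier's actual C++ code: the event dictionaries, the to_text handler, the apply_inserts function and the "NULL" text for missing values are all hypothetical, and a local temporary file stands in for the HDFS file that the real tool writes through libhdfs.

```python
import os
import tempfile
from datetime import date

# Hypothetical, already-parsed events. Real binary log events are binary
# records; plain dicts are used here only to show the control flow.
EVENTS = [
    {"type": "create_table", "db": "shop", "table": "orders"},
    {"type": "insert", "db": "shop", "table": "orders",
     "row": [1, "widget", date(2013, 4, 22)]},
    {"type": "update", "db": "shop", "table": "orders",
     "row": [1, "gadget", None]},
    {"type": "insert", "db": "shop", "table": "orders",
     "row": [2, None, date(2013, 4, 23)]},
]


def to_text(value):
    """Content handler (assumed): turn one decoded field into text."""
    if value is None:
        return "NULL"  # assumed representation of a missing value
    if isinstance(value, date):
        return value.isoformat()
    return str(value)


def apply_inserts(events, out_path, delimiter=","):
    """Append the row of every insert event to a text file.

    Returns the number of rows written. Non-insert events are skipped,
    mirroring the "row insert events only" behaviour described above.
    """
    written = 0
    with open(out_path, "a", encoding="utf-8") as out:
        for event in events:
            if event["type"] != "insert":
                continue
            fields = (to_text(v) for v in event["row"])
            out.write(delimiter.join(fields) + "\n")
            written += 1
    return written


def demo():
    """Run the sketch against a temporary file and return its lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "datafile1.txt")
        apply_inserts(EVENTS, path)
        with open(path, encoding="utf-8") as f:
            return f.read().splitlines()
```

Calling demo() writes only the two insert rows; the table creation and update events are ignored.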
Databases are mapped as separate directories, with their tables mapped as sub-directories, under the Hive data warehouse directory. Data inserted into each table is written into text files (named datafile1.txt) in Hive / HDFS. The data can be comma separated, or use any other delimiter, which is configurable through command line arguments.
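A minimal sketch of this layout, assuming the warehouse root is /user/hive/warehouse (Hive's usual default, but set by your own Hive configuration) and that the database and table names are used as directory names unchanged. The helper names table_data_path and format_row are made up for this example and are not part of the Applier.

```python
import posixpath

# Assumed warehouse root; the real value comes from the Hive configuration.
WAREHOUSE_DIR = "/user/hive/warehouse"


def table_data_path(database, table, warehouse_dir=WAREHOUSE_DIR):
    """Map database -> directory, table -> sub-directory, data -> datafile1.txt."""
    return posixpath.join(warehouse_dir, database, table, "datafile1.txt")


def format_row(fields, delimiter=","):
    """Join the fields of one row with the configured delimiter."""
    return delimiter.join(str(field) for field in fields)
```

For example, table_data_path("shop", "orders") gives /user/hive/warehouse/shop/orders/datafile1.txt, and format_row([1, "widget"], "\t") gives a tab-separated line. Whatever delimiter is chosen has to match the one declared for the corresponding Hive table, otherwise Hive will not split the fields correctly.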
The Hadoop Applier can be downloaded from http://labs.mysql.com/
References
http://dev.mysql.com/tech-resources/articles/mysql-hadoop-applier.html
http://www.tuicool.com/articles/NfArA3i
A similar project is https://github.com/noplay/python-mysql-replication