1. download source code
#git clone https://git-wip-us.apache.org/repos/asf/flume.git
2. compile
#export MAVEN_OPTS="-Xms512m -Xmx1024m -XX:PermSize=256m -XX:MaxPermSize=512m"
#mvn clean install -DskipTests
An error occurs during the build; see:
https://issues.apache.org/jira/browse/FLUME-2184
Solution:
#mvn clean install -Dhadoop.profile=2 -DskipTests
#mvn clean install -Dhadoop.profile=2 -DskipTests -Dmaven.test.skip=true
The second command additionally skips compiling the test classes, which is useful when the test classes themselves have compile errors.
If you want to change the Hadoop version, edit this property in pom.xml:
<hadoop2.version>2.6.0</hadoop2.version>
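One way to flip the property without opening an editor is a sed one-liner. The sketch below runs against a scratch snippet so it is safe to try anywhere; the same expression applies to the real pom.xml (the property name hadoop2.version comes from the pom shown above, and 2.4.0 is just a placeholder for whatever version is currently there):

```shell
# Demo on a scratch file; point sed at the real pom.xml to do it for real.
printf '<hadoop2.version>2.4.0</hadoop2.version>\n' > pom-snippet.xml
sed 's#<hadoop2.version>[^<]*</hadoop2.version>#<hadoop2.version>2.6.0</hadoop2.version>#' \
    pom-snippet.xml > pom-snippet.new.xml
cat pom-snippet.new.xml   # -> <hadoop2.version>2.6.0</hadoop2.version>
```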
3. run
#cd flume-ng-dist/target/apache-flume-1.6.0-SNAPSHOT-bin
#cp conf/flume-conf.properties.template conf/flume.conf
#cp conf/flume-env.sh.template conf/flume-env.sh
Copy and paste this into conf/flume.conf:

# Define a memory channel called ch1 on agent1
agent1.channels.ch1.type = memory

# Define an Avro source called avro-source1 on agent1 and tell it
# to bind to 0.0.0.0:41414. Connect it to channel ch1.
agent1.sources.avro-source1.channels = ch1
agent1.sources.avro-source1.type = avro
agent1.sources.avro-source1.bind = 0.0.0.0
agent1.sources.avro-source1.port = 41414

# Define a logger sink that simply logs all events it receives
# and connect it to the other end of the same channel.
agent1.sinks.log-sink1.channel = ch1
agent1.sinks.log-sink1.type = logger

# Finally, now that we've defined all of our components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1
agent1.sources = avro-source1
agent1.sinks = log-sink1
#bin/flume-ng agent --conf ./conf/ -f conf/flume.conf -Dflume.root.logger=DEBUG,console -n agent1
#bin/flume-ng avro-client --conf conf -H localhost -p 41414 -F /etc/passwd -Dflume.root.logger=DEBUG,console
Error: the avro-client does not work (it fails to read the file and send data to the avro source). When I shut down the Flume agent in the other console, the avro-client prints this error:
2014-12-31 11:33:30,865 DEBUG [org.apache.avro.ipc.NettyTransceiver] - Remote peer dmining05/127.0.0.1:41414 closed connection.
2014-12-31 11:33:30,865 DEBUG [org.apache.avro.ipc.NettyTransceiver] - Disconnecting from dmining05/127.0.0.1:41414
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.avro.specific.SpecificData.getClassLoader()Ljava/lang/ClassLoader;
    at org.apache.avro.ipc.specific.SpecificRequestor.getClient(SpecificRequestor.java:158)
    at org.apache.avro.ipc.specific.SpecificRequestor.getClient(SpecificRequestor.java:148)
    at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:171)
    at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:121)
    at org.apache.flume.api.NettyAvroRpcClient.configure(NettyAvroRpcClient.java:638)
    at org.apache.flume.api.RpcClientFactory.getDefaultInstance(RpcClientFactory.java:170)
    at org.apache.flume.client.avro.AvroCLIClient.run(AvroCLIClient.java:198)
    at org.apache.flume.client.avro.AvroCLIClient.main(AvroCLIClient.java:72)
Solution: the cause is an Avro version mismatch. Replace avro-1.7.4.jar, avro-ipc-1.7.4.jar, and avro-mapred-1.7.4.jar in the lib directory with avro-1.7.7.jar, avro-ipc-1.7.7.jar, and avro-mapred-1.7.7.jar.
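The swap can be scripted. This is a minimal sketch run against a scratch copy of lib/ so it is self-contained; the touch lines stand in for 1.7.7 jars you would actually download (e.g. from Maven Central). Against the real install, skip the setup lines and run the rm/cp loop inside apache-flume-1.6.0-SNAPSHOT-bin:

```shell
set -e
# Setup: fake a lib/ directory and the downloaded jars so the demo runs anywhere.
mkdir -p demo/lib
touch demo/lib/avro-1.7.4.jar demo/lib/avro-ipc-1.7.4.jar demo/lib/avro-mapred-1.7.4.jar
touch avro-1.7.7.jar avro-ipc-1.7.7.jar avro-mapred-1.7.7.jar   # stand-ins for downloads

# The actual swap: drop each 1.7.4 jar and copy in its 1.7.7 replacement.
for f in avro avro-ipc avro-mapred; do
  rm "demo/lib/$f-1.7.4.jar"
  cp "$f-1.7.7.jar" demo/lib/
done
ls demo/lib
```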
-----------
Setting up Eclipse
mvn eclipse:eclipse -DdownloadSources -DdownloadJavadocs
Once this command completes successfully, add $HOME/.m2/repository to the classpath in Eclipse's preferences. You can then import all the Flume modules as interdependent projects via File > Import > General > Existing Projects into Workspace.
References
https://cwiki.apache.org/confluence/display/FLUME/Getting+Started
https://cwiki.apache.org/confluence/display/FLUME/Development+Environment