1. The architecture used for data collection from the web server cluster
2. Start a flume agent on each web server (Flume 1.3.1: http://flume.apache.org/download.html). The startup command is: ./bin/flume-ng agent --conf-file ./conf/flume.conf --name a1 -Dflume.root.logger=INFO,console
The flume.conf file is as follows:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = localhost
a1.sources.r1.port = 41414
# Describe the sink (avro, forwarding events to the consolidation server)
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = consolidationServerHost
a1.sinks.k1.port = 41414
# Use a channel which buffers events in file
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /mnt/flume/checkpoint
a1.channels.c1.dataDirs = /mnt/flume/data
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
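Before wiring up the log4j appender, the avro source can be smoke-tested with Flume's bundled avro-client; the file path below is illustrative:
./bin/flume-ng avro-client -H localhost -p 41414 -F /tmp/test.log
Each line of the file becomes one avro event. With both agents running, the events should flow through to the consolidation server. Also make sure the file channel's checkpointDir and dataDirs are writable by the user running the agent.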
3. Start a flume agent on the log consolidation server (using the same flume-ng command as in step 2) to merge the logs forwarded by the agents on the web servers. Its flume.conf file is as follows:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = avro
# Bind to all interfaces so the web server agents can reach this source
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41414
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = timestamp
#hdfs sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://hostname:9000/path/to/log/dir/%Y-%m-%d/%H
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.filePrefix = appName-
a1.sinks.k1.hdfs.rollInterval = 3600
# Use a channel which buffers events in file
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /mnt/flume/checkpoint
a1.channels.c1.dataDirs = /mnt/flume/data
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
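The timestamp interceptor on the source stamps each incoming event with a timestamp header, which the hdfs sink uses to resolve the %Y-%m-%d/%H escapes in the path. Note also that rollInterval is not the only roll trigger: by default hdfs.rollSize = 1024 (bytes) and hdfs.rollCount = 10 (events), so files would be closed long before the hour is up. To roll strictly once per hour, disable the other two triggers:
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 0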
4. Add the following to the log4j.properties file on each web server:
log4j.rootLogger=INFO,flume
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname=localhost
log4j.appender.flume.Port=41414
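During testing it can help to keep an ordinary console appender alongside the Flume appender so the output also stays visible locally; this is plain log4j configuration, nothing Flume-specific (the appender name stdout is arbitrary):
log4j.rootLogger=INFO,flume,stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %-5p [%c] %m%n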
5. Add the following jar files from the Flume distribution to the app's lib directory on each web server:
avro-1.7.2.jar
avro-ipc-1.7.2.jar
flume-ng-log4jappender-1.3.1.jar
flume-ng-sdk-1.3.1.jar
jackson-core-asl-1.9.3.jar
jackson-mapper-asl-1.9.3.jar
log4j-1.2.16.jar
netty-3.4.0.Final.jar
slf4j-api-1.6.1.jar
6. The test program is as follows:
package flume.log4j.test;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

import org.apache.log4j.Logger;

public class LogTestApp {

    public static void main(String[] args) throws IOException {
        Logger logger = Logger.getLogger(LogTestApp.class);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, Charset.forName("UTF-8")));
        String line;
        System.out.println("Initializing Flume log4j appender test.");
        System.out.println("Each line entered will be sent to Flume.");
        // send this line to Flume via the log4j appender
        logger.info("LogTestApp initialized");
        // every line read from stdin is logged, and hence shipped to Flume
        while ((line = in.readLine()) != null) {
            System.out.println("Sending to log4j: " + line);
            logger.info(line);
        }
    }
}
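One way to compile and run the test, assuming the jars from step 5 are in ./lib and log4j.properties is in the current directory (paths are illustrative):
javac -cp "lib/*" flume/log4j/test/LogTestApp.java
java -cp .:"lib/*" flume.log4j.test.LogTestApp
Each line typed on stdin should then travel through the local agent to the consolidation server and land in the current hourly directory on HDFS.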