Flume Source Code Analysis

frankfan915


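Flume is a highly reliable, distributed system for collecting large volumes of log data. It uses transactions to ensure that no data is lost.

Flume website: http://flume.apache.org/

Flume documentation: http://flume.apache.org/FlumeUserGuide.html, http://flume.apache.org/FlumeDeveloperGuide.html

Installation: download Flume from the official website and unpack the archive.

Startup: nohup bin/flume-ng agent --conf <conf_file_path> --conf-file <conf_file> --name <agent_name> -Dflume.root.logger=DEBUG,console &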

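Flume has three main components: source, channel, and sink. A source receives data, a channel is a buffer, and a sink delivers data to its destination. A source can be configured with multiple channels, and a channel selector decides which of them each event is sent to: it can send every event to all channels, or it can inspect a configured header and route the event to a particular channel when the header has a specific value. Each channel can be drained by multiple sinks, and a sink processor provides load balancing or failover across them.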
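The two channel selection policies described above can be sketched in a few lines of plain Java. This is only an illustration of the idea, not Flume's ChannelSelector API: the class SelectorSketch, its methods, the channel names c1 and c2, and the header name level are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in; this is NOT Flume's ChannelSelector API.
class SelectorSketch {

    /** Replicating policy: every event goes to every configured channel. */
    static List<String> replicate(List<String> allChannels) {
        return new ArrayList<>(allChannels);
    }

    /**
     * Multiplexing policy: look up one header value in a mapping and fall
     * back to the default channels when the header is missing or unmapped.
     */
    static List<String> multiplex(Map<String, String> headers,
                                  String headerName,
                                  Map<String, List<String>> mapping,
                                  List<String> defaultChannels) {
        String value = headers.get(headerName);
        List<String> target = (value == null) ? null : mapping.get(value);
        return (target != null) ? target : defaultChannels;
    }

    public static void main(String[] args) {
        // Assumed routing table: "error" events go to c1, "info" events to c2.
        Map<String, List<String>> mapping = new HashMap<>();
        mapping.put("error", Arrays.asList("c1"));
        mapping.put("info", Arrays.asList("c2"));

        Map<String, String> headers = new HashMap<>();
        headers.put("level", "error");

        System.out.println(replicate(Arrays.asList("c1", "c2")));
        System.out.println(multiplex(headers, "level", mapping, Arrays.asList("c2")));
    }
}
```

In this sketch an event without the header, or with an unmapped value, falls through to the default channel list, so no event is silently dropped by the selector.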

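The flume-ng command invokes the main function of Application. If the configuration file should be reloaded when it changes, the application is registered with an EventBus, and when the file changes, the application's handleConfigurationEvent method is called.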

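  // Abridged excerpt: the command-line parsing that sets agentName,
  // configurationFile and reload, and the declaration of the components
  // list, are omitted here.
  public static void main(String[] args) {
      Application application;
      if(reload) {
        // Reload enabled: the polling provider re-checks the file
        // (interval argument: 30) and posts new configurations to the bus.
        EventBus eventBus = new EventBus(agentName + "-event-bus");
        PollingPropertiesFileConfigurationProvider configurationProvider =
            new PollingPropertiesFileConfigurationProvider(agentName,
                configurationFile, eventBus, 30);
        components.add(configurationProvider);
        application = new Application(components);
        // Subscribe the application so it receives configuration events.
        eventBus.register(application);
      } else {
        // Reload disabled: read the configuration once and apply it directly.
        PropertiesFileConfigurationProvider configurationProvider =
            new PropertiesFileConfigurationProvider(agentName,
                configurationFile);
        application = new Application();
        application.handleConfigurationEvent(configurationProvider.getConfiguration());
      }
      application.start();
  }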

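The start method of Application calls supervisor.supervise() for every component, which in turn tries to call the component's start method. The component list contains the PollingPropertiesFileConfigurationProvider object, whose start method launches a thread that watches the configuration file for changes. On the first check the file is treated as changed, so the application's handleConfigurationEvent method is called right away.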

  public synchronized void start() {
    for(LifecycleAware component : components) {
      supervisor.supervise(component,
          new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
    }
  }
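The polling behaviour described above, including why the very first check counts as a change, can be illustrated with a small sketch. It assumes the watcher compares a last-modified timestamp against the last value it saw, starting from 0. The class FileChangePoller and its methods are hypothetical and are not Flume's implementation; the timestamp is supplied through a LongSupplier so the logic can be exercised without touching the file system.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical sketch of a polling change detector; not Flume's actual code.
class FileChangePoller {
    private final LongSupplier lastModified; // e.g. file::lastModified
    private final Runnable onChange;         // e.g. reload and publish the config
    private long lastChange = 0L;            // 0 = never loaded, so the first check fires

    FileChangePoller(LongSupplier lastModified, Runnable onChange) {
        this.lastModified = lastModified;
        this.onChange = onChange;
    }

    /** One polling step; returns true if a change was reported. */
    synchronized boolean checkOnce() {
        long current = lastModified.getAsLong();
        if (current > lastChange) {
            lastChange = current;
            onChange.run();
            return true;
        }
        return false;
    }

    /** Runs checkOnce immediately and then repeatedly with a fixed delay. */
    ScheduledExecutorService startPolling(long intervalSeconds) {
        ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(this::checkOnce, 0,
            intervalSeconds, TimeUnit.SECONDS);
        return executor;
    }
}
```

A caller could construct it as new FileChangePoller(file::lastModified, reloadAction) and call startPolling(30); the returned executor should be shut down when the agent stops.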

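The configuration passed to handleConfigurationEvent comes from the getConfiguration method of PropertiesFileConfigurationProvider, which creates the sources, sinks, and channels from the configuration file and calls each component's configure method. handleConfigurationEvent then calls startAllComponents, which starts the channels, the sinks, and the sources, in that order, and loads the monitor used to report Flume's metrics.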

 private void startAllComponents(MaterializedConfiguration materializedConfiguration) {
    logger.info("Starting new configuration:{}", materializedConfiguration);

    this.materializedConfiguration = materializedConfiguration;

    for (Entry<String, Channel> entry :
      materializedConfiguration.getChannels().entrySet()) {
      try{
        logger.info("Starting Channel " + entry.getKey());
        supervisor.supervise(entry.getValue(),
            new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
      } catch (Exception e){
        logger.error("Error while starting {}", entry.getValue(), e);
      }
    }

    /*
     * Wait for all channels to start.
     */
    for(Channel ch: materializedConfiguration.getChannels().values()){
      while(ch.getLifecycleState() != LifecycleState.START
          && !supervisor.isComponentInErrorState(ch)){
        try {
          logger.info("Waiting for channel: " + ch.getName() +
              " to start. Sleeping for 500 ms");
          Thread.sleep(500);
        } catch (InterruptedException e) {
          logger.error("Interrupted while waiting for channel to start.", e);
          Throwables.propagate(e);
        }
      }
    }

    for (Entry<String, SinkRunner> entry : materializedConfiguration.getSinkRunners()
        .entrySet()) {
      try{
        logger.info("Starting Sink " + entry.getKey());
        supervisor.supervise(entry.getValue(),
          new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
      } catch (Exception e) {
        logger.error("Error while starting {}", entry.getValue(), e);
      }
    }

    for (Entry<String, SourceRunner> entry : materializedConfiguration
        .getSourceRunners().entrySet()) {
      try{
        logger.info("Starting Source " + entry.getKey());
        supervisor.supervise(entry.getValue(),
          new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
      } catch (Exception e) {
        logger.error("Error while starting {}", entry.getValue(), e);
      }
    }

    this.loadMonitoring();
  }
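The startup order in startAllComponents (channels first, wait until they are up, then sinks, then sources) can be reduced to the following sketch. Presumably sinks start before sources so that consumers are ready before producers begin filling the channels; that rationale is an inference, not something the code states. StartupOrderSketch, Component, and startAsync are hypothetical names, and the sketch leaves out the supervisor, restart policies, and error states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the ordering only; not Flume's lifecycle classes.
class StartupOrderSketch {
    enum State { IDLE, START }

    /** A component whose start completes asynchronously on another thread. */
    static class Component {
        final String name;
        volatile State state = State.IDLE;

        Component(String name) { this.name = name; }

        void startAsync() {
            Thread t = new Thread(() -> state = State.START, "start-" + name);
            t.setDaemon(true);
            t.start();
        }
    }

    /** Returns the names in the order in which start was requested. */
    static List<String> startAll(List<Component> channels,
                                 List<Component> sinks,
                                 List<Component> sources) {
        List<String> order = new ArrayList<>();
        for (Component c : channels) { c.startAsync(); order.add(c.name); }

        // Block until every channel reports START before starting anything else.
        for (Component c : channels) {
            while (c.state != State.START) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(
                        "Interrupted while waiting for " + c.name, e);
                }
            }
        }

        for (Component s : sinks) { s.startAsync(); order.add(s.name); }
        for (Component s : sources) { s.startAsync(); order.add(s.name); }
        return order;
    }
}
```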