Deploying and Configuring a Distributed Flume-ng Environment (Part 1)

1. Test Environment

Operating system: CentOS 5.5

JDK version: 1.7.0_21

Flume version: 1.3.1

Hadoop version: 0.20.2


Topology: 1 agent, 2 collectors, 1 storage node


2. Installing the JDK and Flume

#Download and install JDK 1.7

http://www.oracle.com/technetwork/java/javase/downloads/index.html

tar zxvf jdk-7u21-linux-x64.gz -C  /usr/local/


#Add environment variables to /etc/profile

pathmunge /usr/local/jdk1.7.0_21/bin

export JAVA_HOME=/usr/local/jdk1.7.0_21/

export JRE_HOME=/usr/local/jdk1.7.0_21/jre

export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar


#Verify Java

java  -version


#Download and install Flume 1.3.1

Flume can be downloaded from:

http://www.apache.org/dyn/closer.cgi/flume/1.3.1/apache-flume-1.3.1-bin.tar.gz

tar zxvf apache-flume-1.3.1-bin.tar.gz -C /usr/local/


#Add environment variables to /etc/profile

export FLUME_HOME=/usr/local/apache-flume-1.3.1-bin

export FLUME_CONF_DIR=$FLUME_HOME/conf

export PATH=$PATH:$FLUME_HOME/bin
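Appending to PATH by hand makes it easy to introduce stray entries (a doubled colon is treated as the current directory by many shells). A small defensive sketch, using the FLUME_HOME path from this article:

```shell
#!/bin/sh
# Append $FLUME_HOME/bin to PATH only if it is not already present,
# without creating empty (::) entries.
FLUME_HOME=/usr/local/apache-flume-1.3.1-bin

case ":$PATH:" in
  *":$FLUME_HOME/bin:"*) ;;                    # already on PATH, do nothing
  *) PATH="${PATH:+$PATH:}$FLUME_HOME/bin" ;;  # append; no leading colon if PATH was empty
esac
export PATH

echo "$PATH"
```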


#Verify Flume

# flume-ng  version

Flume 1.3.1

Source code  repository: https://git-wip-us.apache.org/repos/asf/flume.git

Revision: 77b5d2885fecb3560a873bd89f49cbac8a010347

Compiled by  hshreedharan on Fri Dec 21 22:14:21 PST 2012

From source  with checksum 2565bdfd8b6af459dbf85c6960f189a5


3. A Simple Example

#Create the configuration file

[root@cc-staging-loginmgr2 conf]# cat example.conf

# example.conf: A single-node Flume configuration


# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1


# Describe/configure the source

a1.sources.r1.type = netcat

a1.sources.r1.bind = localhost

a1.sources.r1.port = 44444


# Describe the sink

a1.sinks.k1.type = logger


# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100


# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1


#Command-line options

-c conf: use "conf" as the configuration directory

-f conf/example.conf: use conf/example.conf as the configuration file

-n a1: run the agent named a1; this must match the agent name used in example.conf

-Dflume.root.logger=INFO,console: set the root logger to the INFO level and send log output to the console


#Start the agent

cd /usr/local/apache-flume-1.3.1-bin

flume-ng agent -c conf -f conf/example.conf -n a1 -Dflume.root.logger=INFO,console


2013-05-24 00:00:09,288  (lifecycleSupervisor-1-0) [INFO -  org.apache.flume.source.NetcatSource.start(NetcatSource.java:150)] Source starting

2013-05-24 00:00:09,303  (lifecycleSupervisor-1-0) [INFO -  org.apache.flume.source.NetcatSource.start(NetcatSource.java:164)] Created  serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:44444]


#Test from another terminal

[root@cc-staging-loginmgr2 conf]# telnet 127.0.0.1 44444

Trying 127.0.0.1...

Connected to localhost.localdomain (127.0.0.1).

Escape character is '^]'.

hello world!

OK


#Check the console output in the terminal running the agent

2013-05-24  00:00:24,306 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO -  org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: {  headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D          hello  world!. }
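The logger sink prints each event body as a hex dump. The bytes can be decoded back into text with xxd; a quick sketch using the hex string from the output above:

```shell
#!/bin/sh
# Decode the hex body printed by the logger sink back into text.
# The trailing 0D is the carriage return that telnet appends to each line.
hex="68 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D"
echo "$hex" | tr -d ' ' | xxd -r -p   # → hello world!
```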


#The test succeeded; Flume is working correctly


4. Testing Flume Sources

Test 1:

The Avro source receives events over the Avro RPC mechanism; the bundled avro-client tool can send the contents of a given file to it.

#Create the Avro configuration file

[root@cc-staging-loginmgr2 conf]# cat avro.conf

# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1


# Describe/configure the source

a1.sources.r1.type = avro

a1.sources.r1.channels = c1

a1.sources.r1.bind = 0.0.0.0

a1.sources.r1.port = 4141


# Describe the sink

a1.sinks.k1.type = logger


# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100


# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1


#Start Flume agent a1

cd /usr/local/apache-flume-1.3.1-bin/conf

flume-ng agent -c . -f avro.conf -n a1 -Dflume.root.logger=INFO,console


#Create the file to send

echo "hello world" > /usr/logs/log.10


#Send the file with avro-client

flume-ng avro-client -c . -H localhost -p 4141 -F /usr/logs/log.10


#Check the console output in the terminal running the agent

2013-05-27  01:11:45,852 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO -  org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: {  headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64                hello  world }


Test 2:

Exec source runs a given Unix command on start-up and expects that process to continuously produce data on standard out.


#Modified configuration file (the sink and channel definitions are the same as in the earlier examples)

[root@cc-staging-loginmgr2 conf]# cat exec.conf

# Describe/configure the source

a1.sources.r1.type = exec

a1.sources.r1.command = cat /usr/logs/log.10

a1.sources.r1.channels = c1



#Start Flume agent a1

cd /usr/local/apache-flume-1.3.1-bin/conf

flume-ng agent -c . -f exec.conf -n a1 -Dflume.root.logger=INFO,console


#Append content to the file

echo "exec test" >> /usr/logs/log.10


#Check the console output in the terminal running the agent

2013-05-27  01:50:12,825 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO -  org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: {  headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64                hello world }

2013-05-27  01:50:12,826 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO -  org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: {  headers:{} body: 65 78 65 63 20 74 65 73 74                      exec test }


#If you use the tail command instead, the file must already be large enough for output to appear

a1.sources.r1.command = tail -F /usr/logs/log.10


#Generate enough content in the file

for i in {1..100}; do echo "exec test$i" >> /usr/logs/log.10; echo $i; done


#The output can now be seen on the console

2013-05-27  19:17:18,157 (lifecycleSupervisor-1-1) [INFO -  org.apache.flume.source.ExecSource.start(ExecSource.java:155)] Exec source  starting with command:tail -n 5 -F /usr/logs/log.10

2013-05-27 19:19:50,334  (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO -  org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: {  headers:{} body: 65 78 65 63 20 74 65 73 74 37                   exec test7 }


Test 3:

Spooling directory source

This source lets you ingest data by dropping files into a spooling directory on disk. Unlike other asynchronous sources, this source avoids data loss even if Flume is restarted or fails.

The spooling directory source watches the configured directory for new files and reads events out of them. Two caveats:

1) A file must not be opened or edited again once it has been placed in the spool directory.

2) The spool directory must not contain subdirectories.
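Because the spooling directory source requires files to be immutable once they appear, a common pattern (not shown in the original article) is to write the file in a staging area first and then mv it into the spool directory; on the same filesystem, mv is an atomic rename, so the source never observes a half-written file. A sketch, using temporary directories in place of /usr/logs/flumeSpool:

```shell
#!/bin/sh
# Safe delivery into a spooling directory: write the file in a staging
# area, then rename it in. mv within one filesystem is an atomic
# rename(2), so the source never sees a partially written file.
STAGING_DIR=$(mktemp -d)
SPOOL_DIR=$(mktemp -d)    # stands in for /usr/logs/flumeSpool

echo "spool test1" > "$STAGING_DIR/spool1.log"
mv "$STAGING_DIR/spool1.log" "$SPOOL_DIR/spool1.log"
ls "$SPOOL_DIR"
```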


#Modified configuration file

[root@cc-staging-loginmgr2 conf]# cat spool.conf

# Describe/configure the source

a1.sources.r1.type = spooldir

a1.sources.r1.spoolDir = /usr/logs/flumeSpool

a1.sources.r1.fileHeader = true

a1.sources.r1.channels = c1


#Start Flume agent a1

cd /usr/local/apache-flume-1.3.1-bin/conf

flume-ng agent -c . -f spool.conf -n a1 -Dflume.root.logger=INFO,console


#Write a file into the spool directory

[root@cc-staging-loginmgr2 ~]# echo "spool test1" > /usr/logs/flumeSpool/spool1.log


#Check the console output in the terminal running the agent

2013-05-27  22:49:06,098 (pool-4-thread-1) [INFO -  org.apache.flume.client.avro.SpoolingFileLineReader.retireCurrentFile(SpoolingFileLineReader.java:229)]  Preparing to move file /usr/logs/flumeSpool/spool1.log to  /usr/logs/flumeSpool/spool1.log.COMPLETED

2013-05-27  22:49:06,101 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO -  org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: {  headers:{file=/usr/logs/flumeSpool/spool1.log} body: 73 70 6F 6F 6C 20 74 65  73 74 31                spool test1 }


Test 4:

Netcat source: see the simple example in Section 3.


Test 5:

Syslog TCP source


#Modified configuration file

[root@cc-staging-loginmgr2 conf]# cat syslog.conf

# Describe/configure the source

a1.sources.r1.type = syslogtcp

a1.sources.r1.port = 5140

a1.sources.r1.host = localhost

a1.sources.r1.channels = c1


#Start Flume agent a1

cd /usr/local/apache-flume-1.3.1-bin/conf

flume-ng agent -c . -f syslog.conf -n a1 -Dflume.root.logger=INFO,console


#Generate a test syslog message. The <37> priority prefix is required because the source expects syslog wire-format data; otherwise it reports "Failed to extract syslog wire entry"

echo "<37>hello via syslog" | nc localhost 5140
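The <37> prefix is the syslog priority value, computed as facility * 8 + severity (RFC 3164); 37 = 4 * 8 + 5, which matches the Facility=4 and Severity=5 headers in the console output below. A sketch of the arithmetic:

```shell
#!/bin/sh
# syslog priority = facility * 8 + severity (RFC 3164).
# <37> decodes to facility 4 (security/auth) and severity 5 (notice).
facility=4
severity=5
priority=$(( facility * 8 + severity ))
echo "$priority"      # → 37

# And the reverse decomposition:
echo $(( 37 / 8 ))    # facility → 4
echo $(( 37 % 8 ))    # severity → 5
```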


#Check the console output in the terminal running the agent

2013-05-27 23:39:10,755  (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO -  org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: {  headers:{Severity=5, Facility=4} body: 68 65 6C 6C 6F 20 76 69 61 20 73 79 73  6C 6F 67 hello via syslog }


#For UDP, modify the configuration file as follows

a1.sources.r1.type = syslogudp

a1.sources.r1.port = 5140

a1.sources.r1.host = localhost

a1.sources.r1.channels = c1


#Generate a test syslog message over UDP

echo "<37>hello via syslog" | nc -u localhost 5140


Test 6:

HTTP source with JSONHandler


#Modified configuration file

[root@cc-staging-loginmgr2 conf]# cat post.conf

# Describe/configure the source

a1.sources = r1

a1.channels = c1

a1.sources.r1.type = org.apache.flume.source.http.HTTPSource

a1.sources.r1.port = 5140

a1.sources.r1.channels = c1


#Start Flume agent a1

cd /usr/local/apache-flume-1.3.1-bin/conf

flume-ng agent -c . -f post.conf -n a1 -Dflume.root.logger=INFO,console


#Send a POST request with a JSON-formatted body

curl -X POST -d '[{"headers": {"namenode": "namenode.example.com", "datanode": "random_datanode.example.com"}, "body": "really_random_body"}]' http://localhost:5140
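The JSONHandler accepts a JSON array of events, so a single POST can carry a whole batch. A sketch that builds a two-event payload in a shell variable (the header names and bodies here are illustrative, like those in the example above):

```shell
#!/bin/sh
# Build a JSON array of two Flume events for the HTTP source's
# JSONHandler; each element has a "headers" map and a "body" string.
payload='[
  {"headers": {"host": "web01"}, "body": "first event"},
  {"headers": {"host": "web02"}, "body": "second event"}
]'
echo "$payload"

# Post the batch to the running agent (same endpoint as above):
# curl -X POST -d "$payload" http://localhost:5140
```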


#Check the console output in the terminal running the agent

2013-05-28 01:17:47,186  (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO -  org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: {  headers:{namenode=namenode.example.com, datanode=random_datanode.example.com}  body: 72 65 61 6C 6C 79 5F 72 61 6E 64 6F 6D 5F 62 6F really_random_bo }


