Flume: Real-Time Monitoring of a Log File into HDFS
1. Copy the required Hadoop client jars into flume/lib:
commons-configuration-1.6.jar
commons-io-2.4.jar
hadoop-annotations-2.7.6.jar
hadoop-auth-2.7.6.jar
hadoop-common-2.7.6.jar
hadoop-hdfs-2.7.6.jar
htrace-core-3.1.0-incubating.jar
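These jars all ship with the Hadoop distribution itself. A minimal copy sketch, assuming Hadoop 2.7.6 is installed under /usr/local/hadoop (an assumption; locate the jars with find if your layout differs) and Flume under /usr/local/flume as in step 3:

# Assumed install locations; verify with: find /usr/local/hadoop -name 'hadoop-common-*.jar'
HADOOP_LIBS=/usr/local/hadoop/share/hadoop
cp $HADOOP_LIBS/common/lib/commons-configuration-1.6.jar    /usr/local/flume/lib/
cp $HADOOP_LIBS/common/lib/commons-io-2.4.jar               /usr/local/flume/lib/
cp $HADOOP_LIBS/common/lib/hadoop-annotations-2.7.6.jar     /usr/local/flume/lib/
cp $HADOOP_LIBS/common/lib/hadoop-auth-2.7.6.jar            /usr/local/flume/lib/
cp $HADOOP_LIBS/common/hadoop-common-2.7.6.jar              /usr/local/flume/lib/
cp $HADOOP_LIBS/hdfs/hadoop-hdfs-2.7.6.jar                  /usr/local/flume/lib/
cp $HADOOP_LIBS/common/lib/htrace-core-3.1.0-incubating.jar /usr/local/flume/lib/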
2. Create the flume-hdfs.conf file (placed under the job/ directory, which is where step 3 expects it): vim job/flume-hdfs.conf
# Name the components on this agent
a2.sources = r2
a2.sinks = k2
a2.channels = c2

# Describe/configure the source
a2.sources.r2.type = exec
a2.sources.r2.command = tail -F /tmp/haitao/hive.log
a2.sources.r2.shell = /bin/bash -c

# Describe the sink
a2.sinks.k2.type = hdfs
a2.sinks.k2.hdfs.path = hdfs://hadoop002:9000/flume/%Y%m%d/%H
# Prefix for the files uploaded to HDFS
a2.sinks.k2.hdfs.filePrefix = logs-haitao-
# Roll directories based on time
a2.sinks.k2.hdfs.round = true
# How many time units before a new directory is created
a2.sinks.k2.hdfs.roundValue = 1
# The time unit used for rounding
a2.sinks.k2.hdfs.roundUnit = hour
# Use the local timestamp (needed because exec events carry no timestamp header)
a2.sinks.k2.hdfs.useLocalTimeStamp = true
# How many events to accumulate before flushing to HDFS
#a2.sinks.k2.hdfs.batchSize = 1000
# File type; DataStream writes plain text (compressed types are also supported)
a2.sinks.k2.hdfs.fileType = DataStream
# Roll to a new file every 60 seconds
a2.sinks.k2.hdfs.rollInterval = 60
# Roll when a file reaches this size (~128 MB, just under one HDFS block)
a2.sinks.k2.hdfs.rollSize = 134217700
# Never roll based on the number of events
a2.sinks.k2.hdfs.rollCount = 0
# Minimum number of block replicas
a2.sinks.k2.hdfs.minBlockReplicas = 1

# Use a channel which buffers events in memory
a2.channels.c2.type = memory
a2.channels.c2.capacity = 1000
a2.channels.c2.transactionCapacity = 100

# Bind the source and sink to the channel
a2.sources.r2.channels = c2
a2.sinks.k2.channel = c2
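Because useLocalTimeStamp = true, the %Y%m%d/%H escapes in hdfs.path are resolved from the agent's own clock; without it, the HDFS sink would look for a timestamp header on each event and fail, since the exec source adds none. As an illustration, an event captured during the 14:00 hour on 2020-06-07 would land under a path shaped like:

hdfs://hadoop002:9000/flume/20200607/14/logs-haitao-.<epoch-millis>.tmp

where the numeric part is the counter the HDFS sink appends after the filePrefix, and the .tmp suffix disappears once the file rolls (here after rollInterval = 60 seconds or rollSize ≈ 128 MB, whichever comes first).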
3. Start the agent with the monitoring configuration
First, change into the Flume installation directory: cd /usr/local/flume
bin/flume-ng agent --conf conf/ --name a2 --conf-file job/flume-hdfs.conf
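To verify the pipeline end to end, append a line to the monitored file from a second terminal and inspect the sink directory after the first roll (about 60 seconds, per rollInterval above); the hive.log path is the one configured in the source:

echo "test event $(date)" >> /tmp/haitao/hive.log
# wait for rollInterval (60 s) so the in-progress .tmp file is closed and renamed
hdfs dfs -ls -R /flume/$(date +%Y%m%d)
hdfs dfs -cat /flume/$(date +%Y%m%d)/$(date +%H)/logs-haitao-*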