Flume Design
2016-10-31 18:51:56
Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transporting large volumes of log data. It is built on a streaming data-flow architecture: data is gathered from many sources (such as log files and message queues) and delivered to a central location for processing and analysis. Flume's design is highly extensible and flexible, so it adapts easily to different sources and destinations. With Flume, users can build complex data pipelines that support real-time monitoring, troubleshooting, and analytics. In short, Flume is a practical, powerful big-data tool that helps organizations manage and exploit massive data resources.
Outline / Content
access_log/bi0
# a2 configuration demo
a2.sources = r1
a2.sinks = k1 k2
a2.channels = c1 c2

# Describe/configure the source
a2.sources.r1.type = spooldir
a2.sources.r1.spoolDir = /Users/lzz/work/test/flume_source
a2.sources.r1.fileHeader = true

# Kafka sink k1
a2.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a2.sinks.k1.channel = c1
a2.sinks.k1.kafka.topic = test
a2.sinks.k1.kafka.bootstrap.servers = localhost:9092
a2.sinks.k1.kafka.flumeBatchSize = 20
a2.sinks.k1.kafka.producer.acks = 1
a2.sinks.k1.kafka.producer.linger.ms = 1
#a2.sinks.k1.kafka.producer.compression.type = snappy

# HDFS sink k2
a2.sinks.k2.type = hdfs
a2.sinks.k2.channel = c2
a2.sinks.k2.hdfs.path = hdfs://localhost:9000/flume/events/%y-%m-%d/%H%M/%S
a2.sinks.k2.hdfs.fileType = DataStream
a2.sinks.k2.hdfs.writeFormat = Text
a2.sinks.k2.hdfs.filePrefix = events-
a2.sinks.k2.hdfs.round = true
a2.sinks.k2.hdfs.roundValue = 10
a2.sinks.k2.hdfs.roundUnit = minute
a2.sinks.k2.hdfs.useLocalTimeStamp = true

# Use channels which buffer events in memory
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100
a2.channels.c2.type = memory
a2.channels.c2.capacity = 1000
a2.channels.c2.transactionCapacity = 100

# Bind the source to both channels (fan-out to Kafka and HDFS)
a2.sources.r1.channels = c1 c2
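Because source r1 is bound to both c1 and c2, Flume's channel selector decides how events are distributed; the default is the replicating selector, which copies every event to each channel, so both the Kafka and HDFS sinks receive the full stream. Making that explicit is optional but documents the intent (a small sketch; the property name is from the standard Flume configuration model):

```properties
# Optional: the replicating selector is already the default,
# but stating it makes the fan-out to c1 and c2 explicit.
a2.sources.r1.selector.type = replicating
```

The agent can then be started with the standard Flume CLI, assuming the configuration above is saved as a2.conf: bin/flume-ng agent --conf conf --conf-file a2.conf --name a2 -Dflume.root.logger=INFO,console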
HDFS
kafka/bi4
flume/bi0
Uba_log/bi0
recommend/bi4
dw / data warehouse