isabella232 / layer-apache-flume-kafka

This project forked from juju-solutions/layer-apache-flume-kafka

Charm that connects Kafka to Flume HDFS

License: Apache License 2.0

layer-apache-flume-kafka's Introduction

Overview

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple, extensible data model that allows for online analytic applications. Learn more at flume.apache.org.

This charm provides a Flume agent designed to ingest messages published to a Kafka topic and send them to the apache-flume-hdfs agent for storage in the shared filesystem (HDFS) of a connected Hadoop cluster. This leverages the KafkaSource jar packaged with Flume. Learn more about the Flume Kafka Source.

Deploying

This charm requires Juju 2.0 or greater. If Juju is not yet set up, please follow the getting-started instructions prior to deploying this charm.

This charm is intended to be deployed via the hadoop-kafka bundle:

juju deploy hadoop-kafka

This will deploy an Apache Bigtop Hadoop cluster with Apache Flume and Apache Kafka. More information about this deployment can be found in the bundle readme.

Network-Restricted Environments

Charms can be deployed in environments with limited network access. To deploy in such an environment, configure a Juju model with appropriate proxy and/or mirror options. See Configuring Models for more information.

Configuring

By default, the Kafka topic that this charm reads messages from is unset. Set it to an existing Kafka topic as follows:

juju config flume-kafka kafka_topic='<topic_name>'

If you don't have a Kafka topic, you may create one (and configure this charm to use it) with:

juju run-action kafka/0 create-topic topic=<topic_name> \
  partitions=1 replication=1
juju show-action-output <id>  # <-- id from above command
juju config flume-kafka kafka_topic='<topic_name>'
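A typo in the topic name is easy to miss until no data arrives, so it can help to check the name before configuring the charm. The following is a minimal sketch, not part of the charm: `valid_topic` and `set_flume_topic` are hypothetical helper names, and the check only covers Kafka's naming rules (letters, digits, `.`, `_`, `-`, at most 249 characters). It does not confirm that the topic exists.

```shell
# Hypothetical helper, not part of the charm: reject names that break
# Kafka's topic naming rules before they reach the charm config.
valid_topic() {
  case "$1" in
    ''|.|..) return 1 ;;               # empty, "." and ".." are not allowed
    *[!A-Za-z0-9._-]*) return 1 ;;     # any character outside the legal set
  esac
  [ "${#1}" -le 249 ]                  # Kafka caps topic names at 249 chars
}

# Hypothetical wrapper: configure the charm only when the name passes.
set_flume_topic() {
  if valid_topic "$1"; then
    juju config flume-kafka kafka_topic="$1"
  else
    echo "invalid topic name: $1" >&2
    return 1
  fi
}
```

Usage would look like `set_flume_topic my-topic`, which runs the same `juju config` command shown above.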

Once the Flume agents start, messages will start flowing into HDFS in year-month-day directories here: /user/flume/flume-kafka/%y-%m-%d.
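To predict which directory a given day's events land in, the path pattern can be expanded locally. This is a rough sketch that assumes the `%y-%m-%d` escapes mean the same as in `date(1)` (two-digit year, month, day) and that GNU `date` is available for the `-d` option. `flume_day_dir` is a hypothetical helper name, not something the charm provides.

```shell
# Hypothetical helper: print the HDFS directory for a given day.
# Assumes GNU date (-d) and date(1)-style meaning for %y, %m and %d.
flume_day_dir() {
  day="${1:-today}"   # default to the current day when no argument is given
  printf '/user/flume/flume-kafka/%s\n' "$(date -d "$day" +%y-%m-%d)"
}
```

Calling `flume_day_dir` with no argument prints the directory for the current day on the machine where it runs, which may differ from the Flume unit's clock or time zone.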

Testing

A Kafka topic is required for this test. Topic creation is covered in the Configuring section above. Generate Kafka messages with the write-topic action:

juju run-action kafka/0 write-topic topic=<topic_name> data="This is a test"

To verify these messages are being stored into HDFS, SSH to the flume-hdfs unit, locate an event, and cat it:

juju ssh flume-hdfs/0
hdfs dfs -ls /user/flume/flume-kafka  # <-- find a date
hdfs dfs -ls /user/flume/flume-kafka/yyyy-mm-dd  # <-- find an event
hdfs dfs -cat /user/flume/flume-kafka/yyyy-mm-dd/FlumeData.[id]
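Rather than reading an event file by eye, the output of `hdfs dfs -cat` can be piped through a small counter. The sketch below assumes the events are stored as plain text, one message per line. `count_events` is a hypothetical helper name.

```shell
# Hypothetical helper: count the lines on stdin that contain the given
# fixed string (no regex interpretation, thanks to -F).
count_events() {
  grep -cF -- "$1"
}
```

On the flume-hdfs unit it could be used as `hdfs dfs -cat /user/flume/flume-kafka/yyyy-mm-dd/FlumeData.[id] | count_events "This is a test"`. A non-zero count suggests the test message reached HDFS.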

Contact Information

Resources

layer-apache-flume-kafka's People

Contributors

johnsca, ktsakalozos, kwmonroe
