nopcoder / dory

This project forked from dspeterson/dory

Producer daemon for Apache Kafka

License: Other

Shell 0.31% Python 6.52% Java 0.54% JavaScript 0.38% Perl 0.32% PHP 0.43% Ruby 0.26% C++ 89.48% C 1.76%

Dory

Dory is a producer daemon for Apache Kafka. Dory simplifies clients that send messages to Kafka, freeing them from the complexity of direct interaction with the Kafka cluster. Specifically, it handles the details of:

  • Routing messages to the proper Kafka brokers, and spreading the load evenly across multiple partitions for a given topic. Clients may optionally exercise control over partition assignment, such as ensuring that a group of related messages are all routed to the same partition, or even directly choosing a partition if the client knows the cluster topology.
  • Waiting for acknowledgements, and resending messages as necessary due to communication failures or Kafka-reported errors.
  • Buffering messages to handle transient load spikes and Kafka-related problems.
  • Tracking message discards when serious problems occur, and providing web-based discard reporting and status monitoring interfaces.
  • Batching and compressing messages in a configurable manner for improved performance.
  • Optional rate limiting of messages on a per-topic basis. This guards against buggy client code overwhelming the Kafka cluster with too many messages.
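
As a rough illustration of the first point, the sketch below shows one way a producer could spread keyless messages across partitions while keeping messages that share a key on the same partition. It is not Dory's actual routing code: the `PartitionChooser` class, its round-robin policy, and the CRC32 hash are all assumptions chosen for brevity.

```python
import itertools
import zlib


class PartitionChooser:
    """Illustrative only (hypothetical class, not part of Dory)."""

    def __init__(self, partitions):
        if not partitions:
            raise ValueError("at least one partition is required")
        self._partitions = list(partitions)
        self._round_robin = itertools.cycle(self._partitions)

    def choose(self, key=None):
        if key is None:
            # No key given: rotate through the partitions to spread load.
            return next(self._round_robin)
        # Key given: a stable hash maps equal keys to the same partition.
        index = zlib.crc32(key) % len(self._partitions)
        return self._partitions[index]
```

Calling `choose()` with no argument cycles through the partitions in order, while `choose(b"some-key")` returns the same partition every time for that key, as long as the partition list does not change.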

Dory runs on each individual host that sends messages to Kafka, receiving messages from clients through local interprocess communication and forwarding them to the Kafka cluster. Once a client has written a message, no further interaction with Dory is required. From that point onward, Dory takes full responsibility for reliable message delivery.

The preferred method for sending messages to Dory is by UNIX domain datagram socket. However, Dory can also receive messages by UNIX domain stream socket or local TCP. The option of using stream sockets allows sending messages too large to fit in a single datagram. Local TCP facilitates sending messages from clients written in programming languages that do not provide easy access to UNIX domain sockets. Dory serves as a single intake point, receiving messages from diverse clients written in a variety of programming languages.

Here are some reasons to consider using Dory:

  • Dory decouples message sources from the Kafka cluster. A client is not forced to wait for an ACK after sending a message, since Dory handles the details of waiting for ACKs from Kafka and resending messages when necessary. Likewise, a client is not burdened with holding onto messages until it has a reasonable-sized batch to send to Kafka. If a client crashes immediately after sending a message to Dory, the message is safe with Dory. By contrast, if the client assumes responsibility for interacting with Kafka, a crash will cause the loss of all batched messages, and possibly of sent messages for which an ACK is pending.

  • Dory provides uniformity of mechanism for status monitoring and data quality reporting through its web interface. Likewise, it provides a unified configuration mechanism for settings related to batching, compression, and other aspects of interaction with Kafka. This simplifies system administration, as compared to a multitude of producer mechanisms for various programming languages and applications, each with its own status monitoring, data quality reporting, and configuration mechanisms or lack thereof.

  • Dory may enable more efficient interaction with the Kafka cluster. Dory's C++ implementation is likely to be less resource-intensive than producers written in interpreted scripting languages. Since Dory is capable of serving as a single access point for all clients that send messages to Kafka, it permits more efficient batching by combining messages from multiple client programs into a single batch. Batching behavior is coordinated across all message senders, rather than having each client act independently without awareness of messages from other clients. If Dory assumes responsibility for all message transmission from a client host to a Kafka cluster with N brokers, only a single TCP connection to each broker is required, rather than having each client program maintain its own set of N connections. This also avoids scenarios in which short-lived clients frequently open and close connections to the brokers.

  • Dory simplifies adding producer support for new programming languages and runtime environments. Sending a message to Kafka becomes as simple as writing a message in a simple binary format to a UNIX domain or local TCP socket.
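
To illustrate the last point, here is a minimal sketch of a client handing one message to a daemon over a UNIX domain datagram socket. The function name `send_datagram` and any socket path passed to it are hypothetical, and the payload is assumed to be already encoded in Dory's message format, which is not reproduced here.

```python
import socket


def send_datagram(socket_path: str, payload: bytes) -> None:
    """Send one already-encoded message to a UNIX domain datagram socket.

    Hypothetical helper: socket_path must match the daemon's configured
    input socket, and payload must follow the daemon's binary format.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        # Fire and forget: the client does not read any reply.
        sock.sendto(payload, socket_path)
    finally:
        sock.close()
```

A caller would pass the path configured for Dory's input socket together with the encoded bytes. Because the datagram is written without waiting for a reply, the client can continue with its own work immediately.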

The following client support for sending messages to Dory is currently available:

Code contributions for clients in other programming languages are much appreciated. Technical details on how to send messages to Dory are provided here. Support for running Dory inside a Docker container is also available.

Dory requires at least version 0.8 of Kafka, and has been tested on versions 0.8, 0.9, and 0.10. It runs on Linux, and has been tested on CentOS versions 7 and 6.8, and Ubuntu versions 16.04 LTS, 15.04, 14.04.1 LTS, and 13.10.

Dory is the successor to Bruce, and is maintained by Dave Peterson, who created Bruce while employed at if(we). Code contributions and ideas for new features and other improvements are welcomed and much appreciated. Information for developers interested in contributing is provided here and here. If you have ideas for things you would like to see in future versions of Dory, please add them here.

Setting Up a Build Environment

To get Dory working, you need to set up a build environment. Currently, instructions are available for CentOS 7, CentOS 6.8, Ubuntu (16.04 LTS, 15.04, 14.04.1 LTS, and 13.10), and Debian 8.

Building and Installing Dory

Once your build environment is set up, the next step is to build and install Dory.

Running Dory with Basic Configuration

Simple instructions for running Dory with a basic configuration can be found here.

Sending Messages

Information on how to send messages to Dory can be found here.

Status Monitoring

Information on status monitoring can be found here.

Design Overview

Before going into more details on Dory's configuration options, it is helpful to have an understanding of Dory's design, which is described here.

Detailed Configuration

Full details of Dory's configuration options are provided here.

Troubleshooting

Information that may help with troubleshooting is provided here.

Developer Information

If you are interested in making custom modifications or contributing to Dory, information is provided here.

Getting Help

If you have questions about Dory, contact Dave Peterson ([email protected]).


README.md: Copyright 2014 if(we), Inc.

README.md is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

You should have received a copy of the license along with this work. If not, see http://creativecommons.org/licenses/by-sa/4.0/.
