isabella232 / azure-event-hub

This project forked from data-integrations/azure-event-hub

Azure Event Hub Spark Streaming source to read events from an event bus

Home Page: http://docs.cask.co/cdap

Java 87.45% Scala 12.55%

Azure Event Hub

Azure Event Hub streaming source. Emits a record with the schema specified by the user. If no schema is specified, it emits a record with a single 'message' field of type bytes.
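
As a rough illustration of the record shapes described above, the following Java sketch maps an event body either to a single 'message' field or to user-named fields. It is not the plugin's code: the class and method names are hypothetical, a plain Map stands in for a CDAP structured record, and the comma-separated parsing with string-only values is an assumption made to keep the example short.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the plugin.
final class EventToRecord {

    /**
     * With no schema, returns a record holding the raw bytes under 'message'.
     * With a schema, ASSUMES a comma-separated UTF-8 body and maps values to
     * the given field names in order, as strings.
     */
    static Map<String, Object> toRecord(byte[] body, List<String> schemaFields) {
        if (schemaFields == null || schemaFields.isEmpty()) {
            return Collections.<String, Object>singletonMap("message", body);
        }
        String[] values = new String(body, StandardCharsets.UTF_8).split(",", -1);
        Map<String, Object> record = new LinkedHashMap<>();
        for (int i = 0; i < schemaFields.size(); i++) {
            // Fields without a matching value are left null in this sketch.
            record.put(schemaFields.get(i), i < values.length ? values[i] : null);
        }
        return record;
    }
}
```

Calling `EventToRecord.toRecord(body, null)` yields the bytes-only record, while passing a list such as `["name", "age"]` yields one string field per name.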

Usage Notes

The Azure Event Hub source reads events from the provided event hub and converts them into structured records so that they can be processed by the rest of the CDAP pipeline. It uses the Shared Access Policy name and key to access that event hub on the Azure cluster.

Each event hub can have between 2 and 32 partitions. If the configured number of partitions is not an integer, pipeline deployment will fail.
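
The check on the partition count can be sketched in Java as follows. This is a minimal illustration rather than the plugin's actual validation: the class name, method name, and error messages are hypothetical, and enforcing the 2 to 32 range at configuration time is an assumption (the notes above only state that a non-integer value fails deployment).

```java
// Hypothetical helper for illustration only; not part of the plugin.
final class PartitionCountCheck {
    static final int MIN_PARTITIONS = 2;   // lower bound quoted in the usage notes
    static final int MAX_PARTITIONS = 32;  // upper bound quoted in the usage notes

    /** Parses the configured value; rejects non-integers and out-of-range counts. */
    static int parse(String configured) {
        if (configured == null) {
            throw new IllegalArgumentException("Number of partitions is required");
        }
        final int count;
        try {
            count = Integer.parseInt(configured.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "Number of partitions must be an integer, got: " + configured, e);
        }
        // ASSUMPTION: the range is enforced here; the plugin may not do this.
        if (count < MIN_PARTITIONS || count > MAX_PARTITIONS) {
            throw new IllegalArgumentException(
                "Number of partitions must be between 2 and 32, got: " + count);
        }
        return count;
    }
}
```

For example, `PartitionCountCheck.parse("4")` returns 4, while `PartitionCountCheck.parse("4.5")` throws an IllegalArgumentException.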

Because this is a Spark Streaming source, it internally uses the Azure Event Hub Spark Streaming Scala API to read events from all partitions of the event hub.

Note that this plugin requires a Java 8 runtime environment. It is supported on both Spark 1 and Spark 2.

Plugin Configuration

| Configuration | Required | Default | Description |
| --- | --- | --- | --- |
| Azure Event Hub Namespace | Y | N/A | Azure Event Hub namespace under which the event hub is present. |
| Event Hub Name | Y | N/A | Name of the Azure Event Hub under the provided namespace. |
| Shared Access Policy Name | Y | N/A | Name of the policy for the provided event hub. This can be found under the Shared Access Policies section of the event hub. |
| Shared Access Policy Key | Y | N/A | Primary key to access the provided event hub. |
| Number of partitions | Y | N/A | Number of partitions of the provided event hub. Make sure this number is correct; otherwise some messages may not be consumed. |
| Checkpoint Directory | Y | N/A | HDFS directory location where offsets for each partition will be stored. |
| Checkpoint Interval (seconds) | N | 10 | Checkpoint interval in seconds. If not specified, it defaults to 10 seconds. |
| Consumer Group | N | $default | Event hub consumer group name. Defaults to $default. |
| Per Partition Starting Offset | N | -1 | List of partitions whose starting offset needs to be changed. Defaults to -1, which means all events in the hub are read from the beginning. |
| Format | N | bytes | Optional format of the event message. Any format supported by CDAP is supported. For example, a value of 'csv' will attempt to parse the event as comma-separated values. If no format is given, the event is treated as bytes. |
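
To make the required settings and defaults listed above concrete, here is a small Java sketch that checks for the required settings and fills in the documented defaults. The property keys (for example `consumerGroup`) and the class name are hypothetical and are not the plugin's real property names; only the default values come from the configuration list above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; keys are NOT the plugin's real names.
final class EventHubSourceDefaults {
    // Settings marked as required (Y) in the configuration list.
    static final List<String> REQUIRED = Arrays.asList(
        "namespace", "eventHubName", "policyName", "policyKey",
        "partitions", "checkpointDirectory");

    /** Returns the user's settings with documented defaults filled in. */
    static Map<String, String> withDefaults(Map<String, String> userConfig) {
        for (String key : REQUIRED) {
            if (!userConfig.containsKey(key)) {
                throw new IllegalArgumentException("Missing required setting: " + key);
            }
        }
        Map<String, String> effective = new HashMap<>();
        effective.put("checkpointInterval", "10");   // seconds
        effective.put("consumerGroup", "$default");
        effective.put("startingOffset", "-1");       // read from the beginning
        effective.put("format", "bytes");
        effective.putAll(userConfig);                // user values override defaults
        return effective;
    }
}
```

A configuration that supplies only the six required settings would come back with a 10 second checkpoint interval, the `$default` consumer group, a starting offset of -1, and the `bytes` format.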

Build

To build this plugin with Scala 2.10 and Spark 1:

   mvn clean package -P scala-210 

To build this plugin with Scala 2.11 and Spark 2:

   mvn clean package -P scala-211

The build will create a .jar and .json file under the target directory. These files can be used to deploy your plugins.

Deployment

You can deploy your plugins using the CDAP CLI:

> load artifact <target/azure-event-hub-<version>.jar> config-file <target/azure-event-hub-<version>.json>

For example, if your artifact is named 'azure-event-hub-':

> load artifact target/azure-event-hub-<version>.jar config-file target/azure-event-hub-<version>.json

Mailing Lists

CDAP User Group and Development Discussions:

The cdap-user mailing list is primarily for users who use the product to develop applications or build plugins for applications. You can expect questions from users, release announcements, and any other discussions that we think will be helpful to users.

License and Trademarks

Copyright © 2017 Cask Data, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Cask is a trademark of Cask Data, Inc. All rights reserved.

Apache, Apache HBase, and HBase are trademarks of The Apache Software Foundation. Used with permission. No endorsement by The Apache Software Foundation is implied by the use of these marks.

Contributors

curiousvini
