GithubHelp home page GithubHelp logo

cyruschan360 / cogstack-nifi Goto Github PK

View Code? Open in Web Editor NEW

This project forked from cogstack/cogstack-nifi

0.0 0.0 0.0 76.57 MB

Building data processing pipelines for documents processing with NLP using Apache NiFi and related services

Home Page: https://hub.docker.com/r/cogstacksystems/cogstack-nifi/

License: Other

Shell 23.85% Python 38.76% Groovy 6.72% R 0.23% Makefile 1.59% Jupyter Notebook 25.37% Dockerfile 2.95% TSQL 0.53%

cogstack-nifi's Introduction

Introduction

This repository proposes a possible next step for the free-text data processing capabilities implemented as CogStack-Pipeline, shaping the solution more towards Platform-as-a-Service.

CogStack-NiFi contains example recipes using Apache NiFi as the key data workflow engine with a set of services for documents processing with NLP. Each component implementing key functionality, such as Text Extraction or Natural Language Processing, runs as a service where the data routing between the components and data source/sink is handled by Apache NiFi. Moreover, NLP services are expected to implement an uniform RESTful API to enable easy plugging-in into existing document processing pipelines, making it possible to use any NLP application in the stack.

Important

Please note that the project is under constant improvement, brining new features or services that might impact current deployments, please be aware as this might affect you, the user, when making upgrades, so be sure to check the release notes and the documentation beforehand.

Asking questions

Feel free to ask questions on the github issue tracker or on our discourse website which is frequently used by our development team!

Project organisation

The project is organised in the following directories:

  • nifi - custom Docker image of Apache NiFi with configuration files, drivers, example workflows and custom user resources.
  • security - scripts to generate SSL keys and certificates for Apache NiFi and related services (when needed) with other security-related requirements.
  • services - available services with their corresponding configuration files and resources.
  • deploy - an example deployment of Apache NiFi with related services.
  • scripts - helper scripts containing setup tools, sample ES ingestion, bash ingestion into DB samples etc.
  • data - any data that you wish to ingest should be placed here.

Documentation and getting started

Knowledge requirements: Docker usage (mandatory), Python, Linux/UNIX understarting.

Official documentation now available here.

As a good starting point, deployment walks through an example deployment with some workflow examples.

All issues are tracked in README, check that section before opening a bug report ticket.

Important news and updates

Please check IMPORTANT_NEWS for any major changes that might affect your deployment and security problems that have been discovered.

cogstack-nifi's People

Contributors

vladd-bit avatar lrog avatar tomolopolis avatar baixiac avatar sandertan avatar kawsarnoor avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.