kafka-ops / julie

A solution to help you build automation and GitOps in your Apache Kafka deployments. The Kafka GitOps!

License: MIT License

Java 94.07% Shell 5.37% Dockerfile 0.56%
Topics: kafka, topics, configuration, acls, ci-cd, gitops-toolkit, kafka-cluster, gitops

julie's Introduction

An operational manager for Apache Kafka (Automation, GitOps, SelfService)

Note - Governance - Hibernation state

I'm grateful for how many people the JulieOps project has helped during its existence; it is totally mind-blowing to get more than 300 stars for a humble human like me. Thanks everyone!!

Sadly, these days, between my workload and personal arrangements, the project has been lacking proper maintenance and care, which honestly makes me very sad, as I would love to see it grow and provide more and more people with these features. I'm a big believer in self-service and automation.

So, until further notice, or until something changes, you should approach the project with care, as it is currently mostly in a long winter hibernation :-) I'm sorry for this, but I can't do more as a mostly sole maintainer.

Thanks again to everyone who was, is or will be involved with the project life.


-- Pere

README

NOTE: This project was formerly known as Kafka Topology Builder; old versions of this project can still be found under that name.


JulieOps helps you automate the management of your resources within Apache Kafka, from topics, configuration and metadata to access control and schemas. More items are planned; check here for details.

The motivation

One of the typical questions when building an Apache Kafka infrastructure is how to handle topics, configurations and the required permissions to use them (Access Control Lists).

The JulieOps CLI, in close collaboration with git and Jenkins (CI/CD), is here to help you set up an organised and automated way of managing your Kafka cluster.

Where's the docs?

We recommend taking the time to read the docs. There's quite a bit of detailed information about GitOps, Apache Kafka and how this project can help you automate common operational tasks.

Automating Management with CI/CD and GitOps


You might be wondering what the usual workflow to implement this approach looks like:

Action: As a user, part of a developer team (for example), I would like to make some changes in Apache Kafka.

Change Request: As a user:

  • Go to the git repository where the topology is described
  • Create a new branch
  • Perform the changes needed
  • Make a pull request targeting the master branch

Approval process: As an ops admin, I can:

  • Review the pull request (change request) initiated by teams
  • Request changes when needed
  • Merge the requests.

Considerations:

  • Using webhooks, the git server (GitHub, GitLab or Bitbucket) will inform the CI/CD system that changes have happened and need to be applied to the cluster.
  • All direct changes (git push) to the master branch are disabled. Changes can only happen through a pull request, providing a change-management mechanism that fits into your org's procedures. A minimal pipeline sketch is shown after this list.
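
For illustration, here is a minimal CI sketch using GitHub Actions; the workflow file name, mount paths and secret names are assumptions, not something JulieOps prescribes:

# .github/workflows/kafka-gitops.yml (illustrative)
name: kafka-gitops
on:
  pull_request:            # validate proposed changes
  push:
    branches: [master]     # apply merged changes
jobs:
  julie:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Plan on pull requests, apply on master
        run: |
          FLAGS=""
          if [ "${{ github.event_name }}" = "pull_request" ]; then FLAGS="--dryRun"; fi
          docker run -v "$PWD:/work" purbon/kafka-topology-builder:latest \
            julie-ops-cli.sh $FLAGS \
            --brokers "${{ secrets.KAFKA_BROKERS }}" \
            --clientConfig /work/client.properties \
            --topology /work/topology.yaml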

Help??

Are you using the JulieOps tool, or planning to use it in your project? Maybe you have encountered a bug or a challenge, or need a certain feature? Feel free to reach out to our Gitter community.


Feature list, not only bugs ;-)

What can you achieve with this tool:

  • Support for multiple access control mechanisms:
    • Traditional ACLs
    • Role Based Access Control (RBAC) as provided by Confluent
  • Automatically set access control rules for:
    • Kafka Consumers
    • Kafka Producers
    • Kafka Connect
    • Kafka Streams applications (microservices)
    • KSQL applications
    • Schema Registry instances
    • Confluent Control Center
    • KSQL server instances
  • Manage topic naming with a topic name convention
    • Including the definition of projects, teams, datatypes and, of course, the topic name
    • Some topics can be flexibly defined by user requirements
  • Allow for the creation, deletion and update of:
    • Topics, following the topic naming convention
    • Topic configuration: variables like retention, segment size, etc.
    • ACLs or RBAC rules
    • Service Accounts (Experimental feature only available for now in Confluent Cloud)
  • Manage your cluster schemas.
    • Support for Confluent Schema Registry

Out of the box support for Confluent Cloud and other clouds that enable you to use the AdminClient API.
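
For illustration, a minimal client configuration for such a cluster might look like this (all values are placeholders):

# illustrative AdminClient configuration for a SASL_SSL endpoint such as Confluent Cloud
bootstrap.servers=<BOOTSTRAP_ENDPOINT>
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="<API_KEY>" password="<API_SECRET>";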

How can I run JulieOps directly?

This tool is available in multiple formats:

  • As a Docker image, available from docker hub
  • As an RPM package, for RedHat-like distributions
  • As a DEB package, for Debian based distros
  • Directly as a fat jar (zip/tar.gz)

The latest versions are available from the releases page.

How to execute the tool

This is how you can run the tool directly as a docker image:

docker run purbon/kafka-topology-builder:latest julie-ops-cli.sh  --help
Parsing failed cause of Missing required options: topology, brokers, clientConfig
usage: cli
    --brokers <arg>                  The Apache Kafka server(s) to connect
                                     to.
    --clientConfig <arg>             The client configuration file.
    --dryRun                         Print the execution plan without
                                     altering anything.
    --help                           Prints usage information.
    --overridingClientConfig <arg>   The overriding AdminClient
                                     configuration file.
    --plans <arg>                    File describing the predefined plans
    --quiet                          Print minimum status update
    --topology <arg>                 Topology config file.
    --validate                       Only run configured validations in
                                     your topology
    --version                        Prints useful version information.

If you install the tool as an RPM, julie-ops-cli.sh will be available in your $PATH. You can run this script with the same options shown earlier; note, however, that you will need to run it as the julie-kafka user or be in that user's group.
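
An illustrative invocation (the paths below are examples, not defaults shipped with the package):

sudo -u julie-kafka julie-ops-cli.sh \
    --brokers localhost:9092 \
    --clientConfig /etc/julie/client.properties \
    --topology /etc/julie/topology.yaml \
    --dryRun

Running with --dryRun first prints the execution plan without altering anything.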

An example topology

An example topology should look like this (in yaml format):

context: "context"
source: "source"
projects:
- name: "foo"
  consumers:
  - principal: "User:app0"
  - principal: "User:app1"
  streams:
  - principal: "User:App0"
    topics:
      read:
      - "topicA"
      - "topicB"
      write:
      - "topicC"
      - "topicD"
  connectors:
  - principal: "User:Connect1"
    topics:
      read:
      - "topicA"
      - "topicB"
  - principal: "User:Connect2"
    topics:
      write:
      - "topicC"
      - "topicD"
  topics:
  - name: "foo" # topicName: context.source.foo.foo
    config:
      replication.factor: "2"
      num.partitions: "3"
  - name: "bar" # topicName: context.source.foo.bar
    config:
      replication.factor: "2"
      num.partitions: "3"
- name: "bar"
  topics:
  - name: "bar" # topicName: context.source.bar.bar
    config:
      replication.factor: "2"
      num.partitions: "3"

More examples can be found in the example/ directory.

Also, please check the documentation in the docs for extra information and examples on managing ACLs, RBAC, principals, schemas and many others.

Troubleshooting guides

If you're having problems with JulieOps, we recommend looking at two main sources of information:

  • The project issue tracker. It is highly possible others have had your problem before.
  • Our always-work-in-progress troubleshooting guide

Interested in contributing back?

Interested in contributing back? Maybe you have an idea for a great feature, or want to fix a bug? Check our contributing doc for guidance.

Building JulieOps from scratch (source code)

The project is built using Java and Maven, so both are required if you aim to build the tool from source. The minimum supported Java version is Java 8; note that Java 8 support will soon be deprecated here, as it is only kept for very legacy environments.

It is recommended to run JulieOps with Java 11 and an OpenJDK build.

Building a release

If you are interested in building a release artifact from the source code, check our release doc for guidance.

Nightly builds as well as release builds are regularly available from the Actions in this project.

Nightly release builds are available from here as well.

julie's People

Contributors

akselh, bjaggi, cedillomarcos, christophschubert, danielmabbett, dependabot[bot], fobhep, grampurohitp, jeqo, jniebuhr, khaes-kth, kikulikov, leonardobonacci, lsolovey, ludovic-boutros, magnussmith, michaelandrepearce, michaelpearce-gain, mikaello, minyibii, mvanbrummen, nachomdo, niyiodumosu, piotrsmolinski, purbon, schnaker85, schocco, sknop, solita-juusoma, sverrehu


julie's Issues

Option to store ACL status outside of filesystem

Currently the ACL dump is stored in the local filesystem; however, as a user of the Kafka Topology Builder in a k8s environment, I might be interested in keeping the state in third-party storage, such as a key-value store.

Restructure header of descriptor file

Currently, the Topology Builder interprets every top-level field in the YAML between the context and projects as an additional naming component. IMHO this leads to two issues:

  1. We cannot add additional metadata fields to the 'header' of a topology YAML, at least not between context and projects.
  2. This goes against the YAML spec (https://yaml.org/spec/1.2/spec.html), see the quote below, which in turn means we have to rely on hand-written parsers.

From the spec: "Construction of native data structures from the serial interface should not use key order or anchor names for the preservation of application data."

I propose to change the header definition to either

  1. allow a list of entries for the config, or
  2. use a sub-object together with an explicit formatting string.

Example for 1:

context:
- contextValue
- subContextValue1
- subContextValue2

Example for 2:

context:
  main: mainValue
  sub1: sub1Value
  sub2: sub2Value
topicNameFormat:
  {main}.{sub1}.{sub2}.{project.name}.{topic.name}

Add web ui

As a user of the kafka topology builder, I would like to have a simple web UI where users and teams can manage their own setup.

Schema Registry client does not support security configs

When the Schema Registry client is built here https://github.com/purbon/kafka-topology-builder/blob/e97fc7390cd7e60bfd0e481b1916bf71815f1e67/src/main/java/com/purbon/kafka/topology/KafkaTopologyBuilder.java#L73, it uses the no-config constructor. This means we cannot pass security parameters.

Can we pass parameters from the tool config (maybe prefixed?) here and use this constructor:

https://github.com/confluentinc/schema-registry/blob/master/client/src/main/java/io/confluent/kafka/schemaregistry/client/CachedSchemaRegistryClient.java#L106
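
A minimal sketch of the idea in Java, assuming a hypothetical schema.registry. prefix convention for the tool config (the prefix, class and method names are illustrative, not part of the codebase):

import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;

import java.util.HashMap;
import java.util.Map;

public class SchemaRegistryClientFactory {

  // Forward tool configs carrying an agreed prefix to the Schema Registry client
  public static SchemaRegistryClient build(String url, Map<String, ?> toolConfig) {
    Map<String, Object> srConfigs = new HashMap<>();
    toolConfig.forEach((k, v) -> {
      if (k.startsWith("schema.registry.")) {
        // strip the prefix so the client sees its native config keys
        srConfigs.put(k.substring("schema.registry.".length()), v);
      }
    });
    // the configs-aware constructor referenced in the issue
    return new CachedSchemaRegistryClient(url, 10, srConfigs);
  }
}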

Platform RBAC limitations

As an (ops) user I would love to be able to do something like this:

---
platform:
  kafka:
    ClusterAdmin:
      - principal: "User:Hans"
    SecurityAdmin:
      - principal: "User:Fritz"
  schema_registry:
    ClusterAdmin:
      - principal: "User:Hans"
    SecurityAdmin:
      - principal: "User:Fritz"

Above being a complete yaml file.
However, I think there are currently two things "forbidding" this.

  • One cannot create a descriptor containing "only" platform descriptions

  • There is currently no detailed role-assignment for platform components.
    Using a deployment like in the example file will assign ClusterAdmins only?

platform:
  schema_registry:
    - principal: "User:SchemaRegistry"



Error with connectors block

Deploying like this will cause an error:

---
team: "planetexpress"
source: "source"
projects:
  - name: "natas"
    consumers:
      - principal: "User:Bender"
      - principal: "User:Fry"
      - principal: "User:Lila"
    producers:
      - principal: "User:Fry"
      - principal: "User:Lila"
    connectors:
      - principal: "User:Connect1"
        topics:
          read:
            - "topicA"
            - "topicB"
      - principal: "User:Connect2"
        topics:
          write:
            - "topicC"
            - "topicD"
    topics:
      - name: "foo"
        config:
          replication.factor: "1"
          num.partitions: "1"
      - dataType: "avro"
        name: "bar"
        config:
          replication.factor: "1"
          num.partitions: "1"
    rbac:
      - ResourceOwner:
        - principal: "User:Professor"
      - DeveloperManage:
        - principal: "User:Zoidberg"

However removing the connectors block will work.

kafka-topology-builder.sh --brokers localhost:9093 --clientConfig topology.properties --topology topology_docs.yaml 
log4j:WARN No appenders could be found for logger (org.apache.kafka.clients.admin.AdminClientConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.NullPointerException
        at com.purbon.kafka.topology.api.mds.MDSApiClient.getClusterIds(MDSApiClient.java:202)
        at com.purbon.kafka.topology.roles.AdminRoleRunner.forKafkaConnect(AdminRoleRunner.java:54)
        at com.purbon.kafka.topology.roles.RBACProvider.setAclsForConnect(RBACProvider.java:61)
        at com.purbon.kafka.topology.AccessControlManager.syncApplicationAcls(AccessControlManager.java:157)
        at com.purbon.kafka.topology.AccessControlManager.lambda$sync$2(AccessControlManager.java:107)
        at java.util.ArrayList.forEach(ArrayList.java:1257)
        at com.purbon.kafka.topology.AccessControlManager.lambda$sync$3(AccessControlManager.java:105)
        at java.util.ArrayList.forEach(ArrayList.java:1257)
        at com.purbon.kafka.topology.AccessControlManager.sync(AccessControlManager.java:74)
        at com.purbon.kafka.topology.KafkaTopologyBuilder.run(KafkaTopologyBuilder.java:87)
        at com.purbon.kafka.topology.BuilderCLI.processTopology(BuilderCLI.java:154)
        at com.purbon.kafka.topology.BuilderCLI.main(BuilderCLI.java:118)

RBAC provider should support acls status sync

Currently the RBAC provider does not support a roles status sync: it only maps the currently defined roles, and no cleanup is done automatically.

For this to happen, we need to implement:

Have an ansible role that wraps the functionality

As a DevOps team, it could be very valuable to have an Ansible role that runs the tool automatically.

One can directly call the shell script available when installing the RPM package, like this:

docker run purbon/kafka-topology-builder:latest kafka-topology-builder.sh  --help
Parsing failed cause of Missing required options: topology, brokers, clientConfig
usage: cli
    --allowDelete          Permits delete operations for topics and
                           configs.
    --brokers <arg>        The Apache Kafka server(s) to connect to.
    --clientConfig <arg>   The AdminClient configuration file.
    --help                 Prints usage information.
    --quite                Print minimum status update
    --topology <arg>       Topology config file.

This could later be integrated into cp-ansible as well.
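
A minimal sketch of what such a role's main task could look like (the role layout and variable names are illustrative):

# tasks/main.yml of a hypothetical julie-ops role
- name: Apply Kafka topology with JulieOps
  command: >
    julie-ops-cli.sh
    --brokers {{ kafka_brokers }}
    --clientConfig {{ julie_client_config }}
    --topology {{ julie_topology_file }}
  become: true
  become_user: julie-kafka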

Generate yaml of existing topic.

Hi,
it would be nice to have an option for the Kafka Topology Builder to generate the YAML for existing topics/ACLs in a Kafka cluster. This would be useful when migrating from imperative topic creation to a declarative approach like this project. If you forget about an existing topic, the tool could otherwise remove data.

Support Confluent Cloud CLI

As a user of the topology builder, I would like to have a more complete ccloud mode, adding support for the ccloud CLI.

NOTE: Currently Confluent Cloud can be used through the benefits of AdminClient API, users can manage topics and acls, for example.

Add a "dry run" parameter

Many CLI tools have a "dry run" option (sometimes -n) which does everything except execute the modifications. This should be a very small change, but useful for the paranoid admin.

Topologies from different teams

The topology builder won't allow topology files for different teams when using the "dir" option, i.e. specifying a directory path with the --topology flag.
Being able to do that, however, would IMHO be highly useful.

Exception in thread "main" java.io.IOException: Topologies from different teams are not allowed
        at com.purbon.kafka.topology.KafkaTopologyBuilder.buildTopology(KafkaTopologyBuilder.java:101)
        at com.purbon.kafka.topology.KafkaTopologyBuilder.run(KafkaTopologyBuilder.java:78)
        at com.purbon.kafka.topology.BuilderCLI.processTopology(BuilderCLI.java:154)
        at com.purbon.kafka.topology.BuilderCLI.main(BuilderCLI.java:118)

EDIT: When applying single topology files for different teams it won't work either; only the first file will be "executed".

Reduce number of ACLs created by moving to prefixed ACLs

Given the limitations on the total number of ACLs in Confluent Cloud (currently 1,000 for basic and standard, and 10,000 for dedicated clusters) and when using centralized ACLs (currently 1,000 per cluster), we should strive to optimize the total number of ACLs being created.

  • Currently, we create one literal ACL for each consumer/producer and each topic in a project for the necessary operations (READ/WRITE and DESCRIBE).
    This could be replaced with a single prefixed ACL for each consumer/producer and operation combination; see the sketch after this list.

  • READ entries for consumer groups should be prefixed instead of providing wildcard access by default.
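
As a hedged illustration using the stock kafka-acls CLI (the topic names reuse the naming example from the README and are assumptions):

# literal ACLs: one entry per topic the principal consumes
kafka-acls --bootstrap-server localhost:9092 --add \
  --allow-principal User:app0 --operation Read \
  --topic context.source.foo.topicA
kafka-acls --bootstrap-server localhost:9092 --add \
  --allow-principal User:app0 --operation Read \
  --topic context.source.foo.topicB

# a single prefixed ACL can cover every topic under the project prefix
kafka-acls --bootstrap-server localhost:9092 --add \
  --allow-principal User:app0 --operation Read \
  --resource-pattern-type prefixed --topic context.source.foo.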

Kafka Topology Builder 1.0-RC1

The first release candidate for 1.0 has been cut today, the process would look like:

  1. At least 3 release candidates to iron out as many issues as possible. If they are not all needed in the end, 1.0.0 might come earlier; if more are needed, there will be more.
  2. Have a voting and vetting process with the help of the community. Help testing that the functionality meets the bare minimum expectations of people using GitOps for Apache Kafka is very welcome.

More details about the process can be found in our wiki -> https://github.com/purbon/kafka-topology-builder/wiki/1.0-RC-tests.

This issue has been created to track the questions and votes related to the 1.0.0 release.

NOTE: If you find bugs, please open them as separate issues in this project.

Thanks a lot for your help,

Delete ACLs

What is the proposed way to remove ACLs using the Topology Builder?

allowDelete will delete internal topics

Hi - when running the topology-builder with allowDelete, the CLI tool seems stuck for quite a while.
Looking at the topics afterwards shows that ALL topics not described in the descriptor are gone, including e.g. control-center topics.
That also explains why the control-center crashes afterwards.

I propose that allowDelete should ONLY touch topics that are not internal,
or maybe only work in the project-space(s) the descriptor contains.

Also, allowDelete seems not to change role assignments; is that correct?

Use cluster defaults when no number of partitions and replication factor is given in topology

When no replication factor and number of partitions are specified, the cluster defaults (broker configs num.partitions and default.replication.factor) should be used.

This would be especially beneficial for Confluent Cloud use cases (only possible replication factor is 3).

Alternatively, a configuration exception should be used.

Current behavior in this case: use RF=2 and num.partitions=3 as fallback (hard-coded).
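
For illustration, with cluster defaults in play a topic entry could omit the config block entirely (a sketch, not current behavior):

projects:
  - name: "foo"
    topics:
      - name: "events" # partitions and replication factor taken from broker defaults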

Preliminary implementation: https://github.com/christophschubert/kafka-topology-builder/tree/add-cluster-defaults

Schema management fails when schemas are outside the execution context

When the topology-builder runs in a Docker context, the schema management feature fails if schemas are outside of the Docker execution context.

  • java -Dlog4j.configuration=file:/var/lib/jenkins/workspace/myProject/tmp/log4j.properties -cp /var/lib/jenkins/workspace/myProject/dev/topology/configuration.yaml /var/lib/jenkins/workspace/myProject/dev/topology/test-value.avsc -jar /usr/local/kafka-topology-builder/bin/kafka-topology-builder.jar --clientConfig /var/lib/jenkins/workspace/myProject/dev/kafka-admin.properties --topology /var/lib/jenkins/workspace/myProject/dev/topology/configuration.yaml --brokers kafka-dev.mydomain.com:9094
    Error: Could not find or load main class .var.lib.jenkins.workspace.myProject.dev.topology.test-value.avsc

Let me know if you want more information

Extend the number of top level attributes in the descriptor file

Currently the descriptor file can only select team and source; however, many teams might like to customise this structure much more deeply.

As a user of the topology tool I would like to have a variable level of top level attributes.

They should compose first to last, as of today.

Add support for handling more than 1 topology per call

Currently the Kafka Topology Builder receives a single file as a parameter; however, as files grow it would be interesting and possible to receive a directory, with the tool internally handling the wrapping of all content into a single description.

QQ: How should project structure/directory meanings be managed, and should multiple files be read in order or without order? etc.

Removal of acls is not working

I've found a bug related to the removal of ACLs after a change in the topology; e.g. removing a consumer from a topic will not remove the ACLs for that user. As I understand the code, I cannot see that removal of ACLs works in any case, actually. The problem originates in AccessControlManager:

public void clearAcls() {
  try {
    clusterState.load();
    if (allowDelete) {
      plan.add(new ClearAcls(controlProvider, clusterState));
    }
  } catch (Exception e) {
    LOGGER.error(e);
  } finally {
    if (allowDelete && !dryRun) {
      clusterState.reset();
    }
  }
}

The clusterState is given to the ClearAcls action, but in the finally block the clusterState is reset, and all bindings that ClearAcls should later remove are cleared. So when the ClearAcls action runs, there are no bindings left to remove ACLs for.

It was initially a bit confusing that there was a test in AccessControlManagerIT named testAclsCleanup that seemed to verify the ACL removal feature. But this test is actually broken, because when executing accessControlManager.apply() in the test there are actually three actions executed, in this order:

  1. ClearAcls
  2. SetAclsForConsumer
  3. ClearAcls

So when the last ClearAcls action runs, the clusterState has again been populated with the 3 bindings from SetAclsForConsumer... so they will be removed.

I already have a fork with a fix for this bug and I'm happy to contribute a PR for the fix.

Check Kafka Cluster ID

If you create a properties file containing valid settings to connect to a cluster as an admin client, but INCORRECT values for the MDS part, you won't get an error.

example:

This is my correct kafka cluster id:

 kafka-cluster-id: abcd

This is the topologybuilder.properties file:

[...]
topology.builder.mds.kafka.cluster.id=efgh # --> incorrect

--> Topics will be created without a problem; however, there is no error message whatsoever indicating that ACLs/role assignments could not be applied.

Make internal topic and group.id configurable for RBACProvider

The RBACProvider implementation currently assumes that the internal topics and group id for Connect (https://github.com/purbon/kafka-topology-builder/blob/master/src/main/java/com/purbon/kafka/topology/roles/RBACProvider.java#L82-L87) and Schema Registry (https://github.com/purbon/kafka-topology-builder/blob/master/src/main/java/com/purbon/kafka/topology/roles/RBACProvider.java#L171-L174) are the defaults.

It would be great to make this configurable.
This is particularly true for Connect, as we can have multiple Connect clusters, and this requires using a different group.id and different topics; a sketch follows below.
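
For illustration, configurable keys could follow the existing topology.builder.* naming; the keys below are hypothetical, not implemented:

# hypothetical properties (illustrative names only)
topology.builder.connect.group.id=connect-cluster-analytics
topology.builder.connect.config.storage.topic=connect-configs-analytics
topology.builder.connect.offset.storage.topic=connect-offsets-analytics
topology.builder.connect.status.storage.topic=connect-status-analytics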

Configuration from environment variables

It would be good for the tool to support reading configuration from environment variables.

This could be done by exposing command-line options for each of the sensitive configuration options, like sasl.jaas.config, or by having a convention where environment variables with a certain prefix are converted to client properties; a sketch of the latter follows.
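
A sketch of such a convention (the JULIE_ prefix and the mapping rule are assumptions, not implemented):

# hypothetical mapping: JULIE_FOO_BAR -> foo.bar in the client properties
export JULIE_SECURITY_PROTOCOL=SASL_PLAINTEXT
export JULIE_SASL_JAAS_CONFIG='org.apache.kafka.common.security.plain.PlainLoginModule required username="admin" password="admin-secret";'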

Improve documentation

The documentation could be polished by

  • amending the example.yaml with an explanation of what each block actually does
  • explaining the "allowDelete" function in more detail
  • explaining the possibility to use multiple topology files at once
  • explaining the possibility to use a directory instead of single topology files
  • explaining the platform RBAC feature and stating the limitations

Managing Existing Cluster [Users, Topics, ACLs are preset] with kafka-topology-builder

Hello @purbon, I was present at your recent talk at the Linuxstammtisch and am excited to try out kafka-topology-builder.

But I am a bit afraid! I am afraid of damaging our existing setup. We have a running Kafka cluster with some users, topics, and ACLs.

I fear that if I try a test.yml with some test data, it may overwrite all the other existing data; that is what is meant by GitOps and a 'single point of truth'.

Is there a way I can retrieve the old settings as a base.yml file and test further by adding something to it?

Please feel free to ask in case my intention is not clear.

Warm regards,
Kalinga Ray

Cannot deploy topologies - no proper log

Hi - I am using the topology-manager to deploy a basic set of topologies:

---
team: "team"
source: "source"
projects:
  - name: "bar"
    zookeepers: []
    consumers: []
    streams: []
    connectors: []
    topics:
      - dataType: "json"
        name: "events"
        config:
          replication.factor: "1"
          num.partitions: "1"
      - dataType: "avro"
        name: "events"
        config:
          replication.factor: "1"
          num.partitions: "1"

I am using the following config:

bootstrap.servers=localhost:9093
sasl.mechanism=PLAIN
security.protocol=SASL_PLAINTEXT
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
   username="admin" \
   password="admin-secret";

I can successfully use the exact same config file in a kafka-topics command:

kafka-topics --command-config topology.properties --bootstrap-server localhost:9093 --list

However the topology-manager won't work and I get the following log output:

# kafka-topology-builder.sh --broker localhost:9093 --clientConfig topology.properties --topology topology.yaml 

log4j:WARN No appenders could be found for logger (org.apache.kafka.clients.admin.AdminClientConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.NullPointerException
        at com.purbon.kafka.topology.serdes.JsonSerdesUtils.parseApplicationUser(JsonSerdesUtils.java:17)
        at com.purbon.kafka.topology.serdes.ProjectCustomDeserializer.deserialize(ProjectCustomDeserializer.java:58)
        at com.purbon.kafka.topology.serdes.ProjectCustomDeserializer.deserialize(ProjectCustomDeserializer.java:21)
        at com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:4189)
        at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2476)
        at com.fasterxml.jackson.databind.ObjectMapper.treeToValue(ObjectMapper.java:2929)
        at com.purbon.kafka.topology.serdes.JsonSerdesUtils.parseApplicationUser(JsonSerdesUtils.java:19)
        at com.purbon.kafka.topology.serdes.JsonSerdesUtils.addProject2Topology(JsonSerdesUtils.java:28)
        at com.purbon.kafka.topology.serdes.TopologyCustomDeserializer.deserialize(TopologyCustomDeserializer.java:44)
        at com.purbon.kafka.topology.serdes.TopologyCustomDeserializer.deserialize(TopologyCustomDeserializer.java:18)
        at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4218)
        at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3079)
        at com.purbon.kafka.topology.serdes.TopologySerdes.deserialise(TopologySerdes.java:30)
        at com.purbon.kafka.topology.KafkaTopologyBuilder.parseListOfTopologies(KafkaTopologyBuilder.java:126)
        at com.purbon.kafka.topology.KafkaTopologyBuilder.buildTopology(KafkaTopologyBuilder.java:95)
        at com.purbon.kafka.topology.KafkaTopologyBuilder.run(KafkaTopologyBuilder.java:77)
        at com.purbon.kafka.topology.BuilderCLI.processTopology(BuilderCLI.java:153)
        at com.purbon.kafka.topology.BuilderCLI.main(BuilderCLI.java:118)

Are there any means to get a proper log, or is there any other error in my usage?

Sign RPMs

Would it be possible to sign the RPMs that are built from the release pipeline?

Add Schema management support

The current proposal is to add schemas support for the project.

    topics:
      - name: "foo"
        config:
          replication.factor: "1"
          num.partitions: "1"
      - name: "bar"
        dataType: "avro" # TODO: what is this field for?
        schemas:
          key.schema.string: '{\"type\": \"string\"}'
          key.schema.type: "AVRO"
          # TODO key.schema.file
          value.schema.string: '{\"type\": \"string\"}'
          value.schema.type: "AVRO"
          # TODO value.schema.file
#...

So there are two alternative ways to define schemas: string or file.

After CP 5.5, schema type is an essential field, as we need to support JSON, Protobuf & AVRO out of the box.

Perhaps, add support for custom schema types later on.

ClassNotFoundException: com.purbon.topology.roles.RBACProvider

Hi I am getting the following error:

kafka-topology-builder.sh --broker localhost:9093 --clientConfig topology.properties --topology topology.yaml 
log4j:WARN No appenders could be found for logger (org.apache.kafka.clients.admin.AdminClientConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.io.IOException: java.lang.ClassNotFoundException: com.purbon.topology.roles.RBACProvider
        at com.purbon.kafka.topology.KafkaTopologyBuilder.buildAccessControlProvider(KafkaTopologyBuilder.java:208)
        at com.purbon.kafka.topology.KafkaTopologyBuilder.run(KafkaTopologyBuilder.java:78)
        at com.purbon.kafka.topology.BuilderCLI.processTopology(BuilderCLI.java:153)
        at com.purbon.kafka.topology.BuilderCLI.main(BuilderCLI.java:118)
Caused by: java.lang.ClassNotFoundException: com.purbon.topology.roles.RBACProvider
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at com.purbon.kafka.topology.KafkaTopologyBuilder.buildAccessControlProvider(KafkaTopologyBuilder.java:177)

topology.yaml

team: "team"
source: "source"
projects:
  - name: "bar"
    zookeepers: []
    consumers:
      - principal: "User:Blub"
    producers: []
    streams: []
    connectors: []
    topics:
      - dataType: "json"
        name: "events"
        config:
          replication.factor: "1"
          num.partitions: "1"
    rbac:
      - ResourceOwner:
          - principal: "User:Foo"

topology.properties

bootstrap.servers=localhost:9093
sasl.mechanism=PLAIN
security.protocol=SASL_PLAINTEXT
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
   username="admin" \
   password="admin-secret";

topology.builder.access.control.class="com.purbon.topology.roles.RBACProvider"
topology.builder.mds.server="http://localhost:8090"
topology.builder.mds.user="alice"
topology.builder.mds.password="alice-secret"
topology.builder.mds.kafka.cluster.id="UtBZ3rTSRtypmmkAL1HbHw"

When leaving the RBAC-related settings out of the topology properties and YAML file, the deployment works like a charm.
The method of deploying was copying the descriptor.yaml and topology files into a Docker container running your latest build from Docker Hub.

Should topologies with only topics be supported?

Currently it is possible to use a topology with only topics. While this might be of interest for setting up environments, it is clearly not good practice for production-like environments.

example topology:

---
team: "team"
source: "source"
projects:
  - name: "foo"
    topics:
      - name: "foo"
        config:
          replication.factor: "1"
          num.partitions: "1"
      - dataType: "avro"
        name: "bar"
        config:
          replication.factor: "1"
          num.partitions: "1"
  - name: "bar"
    topics:
      - dataType: "avro"
        name: "bar"
        config:
          replication.factor: "1"
          num.partitions: "1"

questions:

  • should that even be supported?
  • should this be supported, but raise a HUGE warning to flag the problem?

thoughts?

Allow customized topic name format

Add a topicNameFormat configuration element with tokens that pull in environment variables, and also ones that correspond to the config elements. That way, if people wanted to tweak the default conventions, it's configurable.
e.g.
topicNameFormat={env.CP_ENVIRONMENT_NAME}-{team}-{source}-{project}-{topic}

Debian package name

The Debian package (.deb) is not installable due to a non-accepted package name.

It seems that you placed the package description in the package name value; spaces are not accepted.

Adding validation rules

Hi,

As an ops person I may want to add some restrictions on the topology itself, like:

  • a maximum number of partitions
  • a maximum retention
  • enforcing the group.id naming with a regexp or similar
  • enforcing that a public topic (based on naming convention) has a schema

Even if most of this can be done via some scripting, it would be really awesome to get it built into the tool.
Maybe adding a --validator option through the CLI or something like that.
If the validation rules are OK => perform the action and return a 0 exit code.
If the validation rules are KO => print the error and stop before doing any actions.

This may help people with their CI/CD integration.

Ideally, the validation should be a public interface so it can be customized beyond the common validation rules that we see; a sketch of what such rules could look like follows.
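
For example, a rules file could look something like this (purely illustrative; neither the option nor this format exists today):

# hypothetical rules file passed via --validator (not implemented)
validations:
  max.partitions: 12
  max.retention.ms: 604800000 # 7 days
  group.id.pattern: "^[a-z]+(-[a-z]+)*$"
  public.topics.require.schema: true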

WDYT about that?

Cheers,
Jean-Louis
