clin's People

Contributors

amrabdelshafi97, dependabot[bot], dlippok, dmitry-erokhin, dstockhammer, herojan, lukasniemeier-zalando, perploug, ratabayev

clin's Issues

Set up CI/CD pipeline

One-line summary

Set up CI/CD pipeline to build, test and publish to PyPI

Description

Set up a pipeline that builds and tests all pull requests. When a release is created, the app is packaged and pushed to PyPI: https://pypi.org/project/clin

Open Questions:

  • How to do versioning?
  • How to create releases?

Types of Changes

  • Documentation / non-code

Tasks

  • Build and test on master and on pull requests
  • Decide on a versioning strategy and implement creating releases
  • Decide on a linter and add it to the pipeline. I propose black
  • Push releases to PyPI

Verbose Mode

It would be great to have a verbose mode in the tool.
When you run it now, you see:

⦿ Will update: event type event.type.1
⦿ Will update: event type event.type.2
⦿ Will update: event type event.type.3
...
✔ Up to date: event type event.type.4
...

but it is not clear what changes are going to happen for each ⦿ Will update ... entry.

It would therefore be helpful to see the diff between the current state and the new one.
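
A minimal sketch of what such a diff could look like, assuming both the current and the desired state are available as JSON-serialisable dicts. The function name render_diff and the sample event type fields are hypothetical and not taken from clin:

```python
import difflib
import json


def render_diff(current: dict, desired: dict) -> str:
    """Return a unified diff between two JSON-serialisable states."""
    # Serialise with sorted keys so that key order never shows up as a change.
    before = json.dumps(current, indent=2, sort_keys=True).splitlines()
    after = json.dumps(desired, indent=2, sort_keys=True).splitlines()
    return "\n".join(
        difflib.unified_diff(before, after, "current", "desired", lineterm="")
    )


if __name__ == "__main__":
    # Hypothetical event type states, for illustration only.
    current = {"name": "event.type.1", "audience": "component-internal"}
    desired = {"name": "event.type.1", "audience": "company-internal"}
    print(render_diff(current, desired))
```

An empty string means the two states are identical, which would correspond to the ✔ Up to date case.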

Modifying the SQL of a nakadi-sql query fails

Expected Behavior

When modifying the SQL of a Nakadi SQL query with clin apply, the request either updates successfully (if the SQL change is valid) or fails with an error explaining why the SQL change is not valid (i.e. the error coming from Nakadi).

Actual Behavior

Updating the SQL results in an error:

× Modifying output event type is forbidden: ...

It looks like clin is trying to update the whole event, and not using the /sql sub-resource of the query to update the query itself.

Steps to Reproduce the Problem

  1. Create a Nakadi SQL query
    • clin apply -t ... -e ... docs/examples/single/sql-query.yaml -X
  2. Change the WHERE portion of the query
    • e.g.
diff --git a/docs/examples/single/sql-query.yaml b/docs/examples/single/sql-query.yaml
index 5c2d0f2..fb90be2 100644
--- a/docs/examples/single/sql-query.yaml
+++ b/docs/examples/single/sql-query.yaml
@@ -4,7 +4,7 @@ spec:
   sql: |
     SELECT *
     FROM "derokhin.clin.test" AS e
-    WHERE e."important_key" = 'hello world'
+    WHERE e."important_key" = 'hello world' OR e."important_key" = 'new phone. Who dis?'
   envelope: false
   outputEventType:
     category: business # business | data | undefined
  3. Update the Nakadi SQL query
    • clin apply -t ... -e ... docs/examples/single/sql-query.yaml -X

Specifications

  • Version: 1.2.2
  • Platform: macOS 10.15
  • Subsystem: Python 3.9.1

clin doesn't read the event files in order

Expected Behavior

clin reads the Nakadi event YAML files in a defined order.

Actual Behavior

The source.glob("*.yaml") call does not return the YAML files in order. In our case, we have dependent Nakadi SQL queries that must be executed in the right order.
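
One possible fix is to sort the glob result explicitly, since the order in which pathlib yields directory entries depends on the file system. The sketch below is an assumption about how this could be done, not clin's actual code; the helper name ordered_manifests is hypothetical:

```python
from pathlib import Path
from typing import List


def ordered_manifests(source: Path) -> List[Path]:
    """Return the YAML files directly under `source`, sorted by file name."""
    # Path.glob yields entries in file system order, so sort explicitly.
    return sorted(source.glob("*.yaml"), key=lambda p: p.name)
```

With this approach, dependent queries could be ordered by giving the files a numeric prefix, e.g. 01-base.yaml before 02-derived.yaml. The sort is lexicographic, so prefixes need to be zero-padded.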

Specifications

  • Version: 1.4.3

ERROR: Remote end closed connection without response

When running clin process -d --env=staging -t $TOK ./nakadi/mops/paas.clin.yaml with lots of event types / subscriptions to process, clin sometimes fails with an exception (see below).

EDIT: Hm, it somehow fails a lot on the TEST env. I have already made 8 attempts and it fails sooner or later each time.

Expected Behavior

If possible, it would be great to avoid terminating the app and to have some retry logic inside (or to increase the number of retries if they are already present?).
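
A minimal sketch of such retry logic using only the standard library, with exponential backoff between attempts. The helper call_with_retries and its parameters are hypothetical and not part of clin:

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying with exponential backoff on transient errors."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error.
            # Wait base_delay, then twice that, and so on, before retrying.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Note that requests.exceptions.ConnectionError, the exception in the traceback below, is not a subclass of the built-in ConnectionError, so it would have to be passed explicitly via retry_on. Retrying is only safe for idempotent calls such as the GET that fails here.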

Actual Behavior

I ran clin several times in a row and it failed with the following errors:

('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Traceback (most recent call last):
  File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 1373, in getresponse
    response.begin()
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 319, in begin
    version, status, reason = self._read_status()
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 288, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/adapters.py", line 450, in send
    timeout=timeout
  File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 725, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/util/retry.py", line 403, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/packages/six.py", line 734, in reraise
    raise value.with_traceback(tb)
  File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/home/gchudnov/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 1373, in getresponse
    response.begin()
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 319, in begin
    version, status, reason = self._read_status()
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/http/client.py", line 288, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/clin/run.py", line 184, in process
    processor.apply(task.target, task.envelope)
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/clin/processor.py", line 55, in apply
    apply(env, envelope.spec)
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/clin/processor.py", line 147, in apply_subscription
    sub.event_types, sub.owning_application, sub.consumer_group
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/clin/clients/nakadi.py", line 84, in get_subscription
    resp = requests.get(url, headers=self._headers, params=params)
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/home/gchudnov/.pyenv/versions/3.7.13/lib/python3.7/site-packages/requests/adapters.py", line 501, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

Steps to Reproduce the Problem

  1. on the test environment
  2. YAML with lots of event types / subscriptions (?)
  3. clin process -d --env=staging -t $TOK ./nakadi/mops/paas.clin.yaml

Specifications

  • Version: 1.4.0
  • Platform: Ubuntu Linux 18.04

Support `compact_and_delete`

Description

Recently Nakadi introduced the compact_and_delete cleanup policy, which needs to be reflected in the clin tool.

Acceptance criteria

  • manifest can specify compact_and_delete as a value for $.cleanup.policy
  • retention time for the compact_and_delete policy is still taken from $.cleanup.retentionTimeDays
  • event types with compact_and_delete policy can be created/updated/dumped by clin
  • sample event types are updated
  • documentation is updated (if needed)
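
The first two criteria could be sketched as follows. This is an illustration under assumptions, not clin's actual model: the names CleanupPolicy, Cleanup and parse_cleanup are hypothetical, and the manifest keys policy and retentionTimeDays are taken from the examples elsewhere on this page:

```python
from dataclasses import dataclass
from enum import Enum


class CleanupPolicy(Enum):
    DELETE = "delete"
    COMPACT = "compact"
    COMPACT_AND_DELETE = "compact_and_delete"


@dataclass(frozen=True)
class Cleanup:
    policy: CleanupPolicy
    retention_time_days: int


def parse_cleanup(raw: dict) -> Cleanup:
    """Parse the `cleanup` section of a manifest into a typed value."""
    # CleanupPolicy(...) raises ValueError for unknown policy names.
    policy = CleanupPolicy(raw["policy"])
    return Cleanup(policy, int(raw["retentionTimeDays"]))
```

Modelling the policy as an enum means an unsupported value is rejected while parsing the manifest rather than by Nakadi at request time.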

Error: Required Field In OutputEventType

Expected Behavior

Already existing SQL queries are fetched without errors.

Actual Behavior

    return OutputEventType(
TypeError: __init__() missing 1 required positional argument: 'partition_compaction_key_field'
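
The error suggests that the constructor is called without a value for a field that is not always present. One way to tolerate that, sketched here under assumptions, is to give the field a default. The class below is a reduced, hypothetical stand-in for clin's OutputEventType, and output_event_type_from_payload is a hypothetical helper:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutputEventType:
    # Hypothetical, reduced field set; the real class has more fields.
    category: str
    # Optional, so payloads without this key can still be loaded.
    partition_compaction_key_field: Optional[str] = None


def output_event_type_from_payload(payload: dict) -> OutputEventType:
    """Build the model from a payload, treating the key as optional."""
    return OutputEventType(
        category=payload["category"],
        partition_compaction_key_field=payload.get("partition_compaction_key_field"),
    )
```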

Steps to Reproduce the Problem

  1. Create a SQL query
  2. Try to re-create the same query

Specifications

  • Version: 1.4.2

Repartitioning via nakadi SQL does not work. Key is in wrong place in payload.

Expected Behavior

Creating a nakadi SQL query with repartition_parameters is possible when repartitioning is set in the spec (within outputEventType).

Actual Behavior

The repartitioning dict is successfully parsed, but is output in the wrong place in the body of the POST to create the SQL query.

According to the Nakadi SQL documentation, repartition_parameters is a key within output_event_type, not a top-level key itself.

I've hacked a patch in to make the change and can confirm that the following change in JSON output works:

 {
   "id": "event-name.multiple-partitions",
   "sql": "SELECT * FROM \"event-name\" AS a",
   "envelope": false,
   "output_event_type": {
     "name": "event-name.multiple-partitions",
     "owning_application": "app",
     "category": "business",
     "audience": "component-internal",
     "cleanup_policy": "delete",
-    "retention_time": 86400000
-  },
-  "repartition_parameters": {
-    "number_of_partitions": 6,
-    "partition_strategy": "hash",
-    "partition_key_fields": [
-      "config_sku"
-    ]
+    "retention_time": 86400000,
+    "repartition_parameters": {
+      "number_of_partitions": 6,
+      "partition_strategy": "hash",
+      "partition_key_fields": [
+        "config_sku"
+      ]
+    }
   }
}
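
The corrected layout can be sketched as a small payload builder. This is an illustration of the nesting only; build_sql_query_payload is a hypothetical helper, not the patch mentioned above:

```python
from typing import Optional


def build_sql_query_payload(
    query_id: str,
    sql: str,
    envelope: bool,
    output_event_type: dict,
    repartition_parameters: Optional[dict] = None,
) -> dict:
    """Build the POST body with repartition_parameters nested correctly."""
    # Copy so the caller's dict is not mutated.
    output = dict(output_event_type)
    if repartition_parameters is not None:
        # Nested inside output_event_type, not at the top level.
        output["repartition_parameters"] = repartition_parameters
    return {
        "id": query_id,
        "sql": sql,
        "envelope": envelope,
        "output_event_type": output,
    }
```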

Steps to Reproduce the Problem

  1. Add a repartitioning key to an event like so:
diff --git a/docs/examples/single/sql-query.yaml b/docs/examples/single/sql-query.yaml
index 5c2d0f2..7f5932b 100644
--- a/docs/examples/single/sql-query.yaml
+++ b/docs/examples/single/sql-query.yaml
@@ -13,6 +13,10 @@ spec:
    cleanup:
      policy: delete
      retentionTimeDays: 2
+    repartitioning:
+      partitionCount: 6
+      strategy: hash
+      keys: ["important_key"]
  auth:
    users:
      admins:

Side note: This may be good to add to the documentation/examples. It wasn't immediately clear how to specify this.

  2. Apply the spec
clin apply -e ... -t ... docs/examples/single/sql-query.yaml -v -p -X
  3. Note the location of repartition_parameters

  4. The event will be created, but with the default parameters, not the ones you specified in repartitioning.

Specifications

  • Version: clin, version 1.3.0
  • Platform: macOS 10.15.7
  • Subsystem: -

Validate if event-schema is valid json-schema

Expected Behavior

clin creates a valid JSON Schema from the YAML specs. If the JSON Schema is not correct, it fails fast and notifies the user before trying to create/update an event type in Nakadi.

This also allows for better error messages that point the user to the actual error in the JSON Schema.
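
A rough sketch of the fail-fast idea, using only the standard library. It checks a small structural subset (type names, required, properties) and is not a full JSON Schema validator; a dedicated library such as jsonschema would be the realistic choice. The function find_schema_problems is hypothetical:

```python
from typing import List

JSON_TYPES = {"array", "boolean", "integer", "null", "number", "object", "string"}


def find_schema_problems(schema: object, path: str = "$") -> List[str]:
    """Collect structural problems in a JSON Schema-like dict (partial check)."""
    if not isinstance(schema, dict):
        return [f"{path}: expected an object, got {type(schema).__name__}"]
    problems: List[str] = []
    declared = schema.get("type")
    if declared is not None:
        names = declared if isinstance(declared, list) else [declared]
        for name in names:
            if not isinstance(name, str) or name not in JSON_TYPES:
                problems.append(f"{path}.type: unknown type {name!r}")
    required = schema.get("required", [])
    if not isinstance(required, list) or not all(isinstance(r, str) for r in required):
        problems.append(f"{path}.required: expected a list of strings")
    properties = schema.get("properties", {})
    if not isinstance(properties, dict):
        problems.append(f"{path}.properties: expected an object")
    else:
        # Recurse so the reported path points at the offending sub-schema.
        for key, sub in properties.items():
            problems.extend(find_schema_problems(sub, f"{path}.properties.{key}"))
    return problems
```

If the returned list is non-empty, the tool could print the problems and exit before sending any request to Nakadi.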

Actual Behavior

If EventTypeSchema.schema is not a valid JSON Schema, clin still posts to Nakadi and receives a 422 error.

Steps to Reproduce the Problem

  1. Define a YAML schema for an event type that requires another YAML file (e.g. Location: "@@@./definitions/location.yaml")
  2. Remove the sub-schema linking
  3. Generate JSON with clin and post to Nakadi (clin apply -X -e production -t $(token) wrong.yaml)

Specifications

  • Version: 1.0.0
  • Platform: macOS 10.15.x
  • Subsystem: Python 3.7.7

partition_compaction_key is required in sql-query

Expected Behavior

The Nakadi SQL query is created successfully.

Actual Behavior

{"detail":"Compacted output event type requires partition_compaction_key","status":400,"title":"Bad Request"}

Steps to Reproduce the Problem

  1. Create a sql-query.yaml
  2. Don't specify partition_compaction_key
  3. Run clin as usual

Specifications

  • Version: 1.4.1

`Retention time` field for log-compacted events is not updated via Clin

Expected Behavior

Clin updates the event type retention policy for all event types, even log-compacted ones (to the default value).

Actual Behavior

The retention time field is not pushed to Nakadi for log-compacted event types.

Description

When an event type is log-compacted, the cleanup field is not actually used by Nakadi. However, it is present in the payloads, leading to divergences such as the following (which are impossible to fix via Clin):

⦿ Found 1 changes: event type _____
    values_changed:
      root.cleanup.retention_time_days:
        new_value: 1
        old_value: 4

Steps to Reproduce the Problem

  1. Create a log-compacted event type in Nakadi via Clin
  2. Update the retention policy via any HTTP client
  3. Reprocess the file via Clin

Specifications

  • Version: 1.0.0
  • Platform: MacOS
  • Subsystem:

Dumping contains terminal colour codes

Expected Behavior

When dumping a schema to a file, I expect valid JSON or YAML output

e.g.

{
  "name": "my-event-type",
  "category": "data",
...

Actual Behavior

The output contains terminal colour codes, and out.json contains:

{
  �[38;2;0;128;0;01m"name"�[39;00m: �[38;2;186;33;33m"my-event-type"�[39m,
  �[38;2;0;128;0;01m"category"�[39;00m: �[38;2;186;33;33m"data"�[39m,
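
A common way to avoid this is to emit colour only when the output stream is an interactive terminal. The sketch below shows the idea with the standard library; strip_colour and write_output are hypothetical helpers and not clin's actual code:

```python
import re
import sys
from typing import TextIO

# Matches SGR colour sequences such as ESC[38;2;0;128;0;01m and ESC[39m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_colour(text: str) -> str:
    """Remove ANSI colour sequences from `text`."""
    return ANSI_SGR.sub("", text)


def write_output(text: str, stream: TextIO = sys.stdout) -> None:
    """Write `text`, keeping colours only for interactive terminals."""
    # Redirected output (files, pipes) reports isatty() == False.
    if not stream.isatty():
        text = strip_colour(text)
    stream.write(text)
```

With a check like this, redirecting the dump to out.json would produce plain JSON, while running it in a terminal would keep the highlighting.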

Steps to Reproduce the Problem

  1. clin dump -t "$token" -o json -e 'production' 'my-event-type' > out.json
  2. cat out.json

Specifications

  • Version: clin-1.0.1, Python 3.8.2
  • Platform: macOS Big Sur

Outdated security contact

Expected Behavior

The link in SECURITY.md leads to meaningful information on how to report security issues or apply for the Bug Bounty program on HackerOne.

Actual Behavior

The provided link leads to a corporate contact page that does not explain how to report bugs or apply for the bug bounty program.

Steps to Reproduce the Problem

  1. Open SECURITY.md
  2. Click the link in the description (https://corporate.zalando.com/en/services-and-contact#security-form)

Specifications

  • Version: n/a
  • Platform: n/a
  • Subsystem: n/a
