
naveego / dataflow-contracts


JSON Schemas, protobuf definitions, OpenAPI files, etc. for establishing shared contracts related to data flow.

License: Apache License 2.0

Languages: Go 22.84%, C# 77.16%

dataflow-contracts's People

Contributors

chriscerk, roehlerw, steveruble


dataflow-contracts's Issues

All Nullable properties should have type: [ "null" , {propertyType}]

When generating C# models with NJsonSchema, Newtonsoft.Json.Required.DisallowNull is added to non-required properties, which throws serialization exceptions when those properties contain a valid null value.

NJsonSchema doesn't have a setting to override null handling behavior currently: RicoSuter/NJsonSchema#838

In the meantime, going to ensure non-required properties in the dataflow contracts have type: [ "null", {propertyType} ] (adding null to the type also seems to be a common convention in JSON Schema).
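For illustration, a non-required property in one of the schemas would end up looking roughly like this (the property names here are examples, not taken from the actual contracts):

{
  "type": "object",
  "required": ["id"],
  "properties": {
    "id": { "type": "string" },
    "message": { "type": ["null", "string"] }
  }
}

The intent is that with "null" in the type, NJsonSchema no longer emits Required.DisallowNull on the generated property, so a null value round-trips without throwing.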

Invalid BatchEnd reason (enum) from vandelay.go-between.ingestion topic

Currently, BatchEnd (and any other message with a null or empty enum value) will not be processed by dataflow-server, since such a value cannot be deserialized.

Kafka message:

{
  "bid": "bgusntnfq0sh4459m4v0",
  "cid": "bgusnuvfq0sh4459m5e0",
  "d": { "typ": "BatchEnd", "count": 10, "message": "", "reason": "" },
  "id": "bgusnuvfq0sh4459m5eg",
  "jid": "bgywpb99wwfg000gy6k0",
  "rid": "",
  "sid": "",
  "tid": "vandelay"
}
2019-01-15T12:03:08.240309369Z fail: Dataflow.Server.ProcessorOrchestrator[0]
2019-01-15T12:03:08.240331778Z       Error deserializing message **supplied above** while consuming / processing message, removing from Kafka Topic vandelay.go-between.ingestion: Error converting value "" to type 'Dataflow.Server.Models.BatchEndReason'. Path 'reason'.
2019-01-15T12:03:08.240339912Z Newtonsoft.Json.JsonSerializationException: Error converting value "" to type 'Dataflow.Server.Models.BatchEndReason'. Path 'reason'. ---> System.ArgumentException: Must specify valid information for parsing in the string.

Metabase-shape-resource needs updating

background

Unable to deserialize the ResourceUpdated payload from vandelay.metabase.shapes.changes in dataflow-server because the required grouping property is missing from the matchRule.

error

Error deserializing message *message below*
      while consuming / processing message, removing from Kafka Topic vandelay.metabase.shapes.changes: Required property 'grouping' not found in JSON. Path 'matchRule'.

message

{
  "id": "bh333agt000000ei4gl0",
  "tid": "vandelay",
  "sid": "nv1:n5o.red::metabase:self/[email protected]",
  "rid": "nr1:n5o.red:vandelay:metabase:shapes/bh31r78jvh50000pzjv0",
  "rids": [],
  "cid": "bh333agt000000ei4gkg",
  "m": {},
  "d": {
    "typ": "ResourceUpdated",
    "b": {
      "typ": "MetabaseShapeResource",
      "id": "bh31r78jvh50000pzjv0",
      "version": 3,
      "name": "SQL User",
      "description": "/home/chris/Documents/sql-biannual-database-permission-overview-stonebridge.csv",
      "properties": [
        {
          "id": "bh31r78jvh50000pzjvg",
          "name": "permission",
          "description": null,
          "type": "String",
          "isUnique": false,
          "security": null,
          "isNullable": false
        },
        {
          "id": "bh31r78jvh50000pzjw0",
          "name": "user",
          "description": null,
          "type": "String",
          "isUnique": false,
          "security": null,
          "isNullable": false
        },
        {
          "id": "bh31r78jvh50000pzjwg",
          "name": "database",
          "description": null,
          "type": "String",
          "isUnique": false,
          "security": null,
          "isNullable": false
        }
      ],
      "labels": {},
      "isMdmShape": true,
      "matchRule": {
        "shapeId": "bh31r78jvh50000pzjv0",
        "version": 1,
        "mcl": 1.0,
        "type": "jaro-winkler",
        "settings": { "properties": ["bh31r78jvh50000pzjw0"] }
      },
      "mergeRule": { "version": 0, "properties": {} },
      "copiedFromSchemaId": "bh31qx7jvh50000dvbw0",
      "createdAt": "2019-01-21T19:23:09.674Z",
      "createdBy": "jeGlKZctiebaOX1plqJyFIfHCG",
      "updatedAt": "2019-01-21T19:24:06.282Z",
      "updatedBy": "jeGlKZctiebaOX1plqJyFIfHCG",
      "deletedAt": null,
      "deletedBy": null
    },
    "a": {
      "typ": "MetabaseShapeResource",
      "id": "bh31r78jvh50000pzjv0",
      "version": 4,
      "name": "SQL User",
      "description": "/home/chris/Documents/sql-biannual-database-permission-overview-stonebridge.csv",
      "properties": [
        {
          "id": "bh31r78jvh50000pzjvg",
          "name": "permission",
          "description": "",
          "type": "String",
          "isUnique": false,
          "security": null,
          "isNullable": false
        },
        {
          "id": "bh31r78jvh50000pzjw0",
          "name": "user",
          "description": null,
          "type": "String",
          "isUnique": false,
          "security": null,
          "isNullable": false
        },
        {
          "id": "bh31r78jvh50000pzjwg",
          "name": "database",
          "description": null,
          "type": "String",
          "isUnique": false,
          "security": null,
          "isNullable": false
        }
      ],
      "labels": {},
      "isMdmShape": true,
      "matchRule": {
        "shapeId": "bh31r78jvh50000pzjv0",
        "version": 1,
        "mcl": 1.0,
        "type": "jaro-winkler",
        "settings": { "properties": ["bh31r78jvh50000pzjw0"] }
      },
      "mergeRule": {
        "version": 0,
        "properties": {
          "bh31r78jvh50000pzjvg": {
            "propertyId": "bh31r78jvh50000pzjvg",
            "connections": ["bh31qmyjvh50000dvbvg"]
          },
          "bh31r78jvh50000pzjw0": {
            "propertyId": "bh31r78jvh50000pzjw0",
            "connections": ["bh31qmyjvh50000dvbvg"]
          },
          "bh31r78jvh50000pzjwg": {
            "propertyId": "bh31r78jvh50000pzjwg",
            "connections": ["bh31qmyjvh50000dvbvg"]
          }
        }
      },
      "copiedFromSchemaId": "bh31qx7jvh50000dvbw0",
      "createdAt": "2019-01-21T19:23:09.674Z",
      "createdBy": "jeGlKZctiebaOX1plqJyFIfHCG",
      "updatedAt": "2019-01-21T20:55:06.322Z",
      "updatedBy": "jeGlKZctiebaOX1plqJyFIfHCG",
      "deletedAt": null,
      "deletedBy": null
    },
    "m": {}
  }
}
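If the contract is updated rather than the producer, the matchRule portion of the MetabaseShapeResource schema would stop listing grouping as required (or allow it to be null). A rough sketch follows; the property list and types are inferred from the message above, and the grouping type in particular is a guess:

"matchRule": {
  "type": "object",
  "required": ["shapeId", "version", "type"],
  "properties": {
    "shapeId": { "type": "string" },
    "version": { "type": "integer" },
    "mcl": { "type": "number" },
    "type": { "type": "string" },
    "settings": { "type": "object" },
    "grouping": { "type": ["null", "string"] }
  }
}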

dq-rules

required by naveegoinc/stories#867

Non-nullable properties on Datapoint

The Enriched property (enr) is not currently nullable on the Datapoint, which is causing deserialization issues in the dataflow server.

Either this issue is valid, or https://github.com/naveegoinc/stories/issues/147 needs to be revisited so that the matching service is changed to set enr.

fail: Dataflow.Server.ProcessorOrchestrator[0]
      Error deserializing message *message below* while consuming / processing message, removing from Kafka Topic vandelay.matching.groups: Required property 'enr' expects a non-null value. Path ''.
Newtonsoft.Json.JsonSerializationException: Required property 'enr' expects a non-null value. Path ''.
{
  "rid": "bgnme2f4lrug318004ig",
  "id": "bgnme4f4lrug318009og",
  "cid": "bgnme4f4lrug318009p0",
  "tid": "vandelay",
  "d": {
    "matchData": "16492",
    "id": "bgnme2f4lrug318004ig",
    "typ": "MatchGroup",
    "dataPoints": [
      {
        "rid": "PqhZZy26L08DnSZiXVVuIArPZut6H56DUXftumo8lBA",
        "jid": "bggjs8d9wwfg000yzrt0",
        "m": null,
        "rids": null,
        "id": "bggiqekeasq6hkl1aot0",
        "cid": "bggiqekeasq6hkl1aosg",
        "trc": null,
        "tid": "vandelay",
        "bid": "bggipdceasq6hkl0rekg",
        "sid": "",
        "d": {
          "s": "bg81p4g9wwfg000bd37g",
          "a": "ups",
          "enr": null,
          "typ": "DataPoint",
          "c": "bggjrd99wwfg0005m0xg",
          "d": {
            "bg81p4g9wwfg000bd380": "16492",
            "bg81p4g9wwfg000bd3e0": "2008-05-31T00:00:00Z",
            "bg81p4g9wwfg000bd390": "false",
            "bg81p4g9wwfg000bd39g": null,
            "bg81p4g9wwfg000bd3bg": null,
            "bg81p4g9wwfg000bd3ag": "E",
            "bg81p4g9wwfg000bd38g": "IN",
            "bg81p4g9wwfg000bd3c0": "0",
            "bg81p4g9wwfg000bd3dg": "JlkeqfcjPUuxHJv7sYwY6A==",
            "bg81p4g9wwfg000bd3b0": "Mitchell",
            "bg81p4g9wwfg000bd3d0": "<IndividualSurvey xmlns=\"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey\"><TotalPurchaseYTD>2452.34</TotalPurchaseYTD><DateFirstPurchase>2004-05-31Z</DateFirstPurchase><BirthDate>1972-12-02Z</BirthDate><MaritalStatus>M</MaritalStatus><YearlyIncome>50001-75000</YearlyIncome><Gender>M</Gender><TotalChildren>1</TotalChildren><NumberChildrenAtHome>0</NumberChildrenAtHome><Education>Graduate Degree</Education><Occupation>Professional</Occupation><HomeOwnerFlag>1</HomeOwnerFlag><NumberCarsOwned>0</NumberCarsOwned><CommuteDistance>0-1 Miles</CommuteDistance></IndividualSurvey>",
            "bg81p4g9wwfg000bd3a0": "Jordan",
            "bg81p4g9wwfg000bd3cg": null
          }
        }
      }
    ],
    "ruleId": "bg81p4g9wwfg000bd37g"
  }
}
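If the resolution is on the dataflow-server model rather than the matching service, the generated DataPoint type would need to accept a null enr. A hedged sketch using Newtonsoft.Json attributes; the class name and property type below are placeholders, not the real generated code:

using Newtonsoft.Json;

// Placeholder model for illustration only; not the generated dataflow-server DataPoint class.
public class DataPointBody
{
    // Required.AllowNull keeps "enr" required as a key but allows a JSON null,
    // avoiding "Required property 'enr' expects a non-null value".
    [JsonProperty("enr", Required = Required.AllowNull)]
    public object Enriched { get; set; }
}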

Add bulk fields for batch start and end

Parent Issue (as of when this issue was created) naveegoinc/stories#1696

Story

As a Naveego Platform user
I would like to bulk load my transactional replication data
So that I can achieve better replication performance for transactional data replications

Acceptance Criteria

GIVEN a transactional GR
WHEN a connection is being created or selected for a replication job
THEN only plugins flagged for bulk replication capability are available to the user
AND the bulk option is automatically enabled for that plugin

GIVEN an active bulk replication job
WHEN data is queued up for replication
THEN it is replicated using 'bulk' options for better performance

Additional Context

Existing plugins that need this option added:

  • Oracle Autonomous Data Warehouse

Future plugins that will need this option:

  • Snowflake
  • Redshift
  • Azure SQL Data WH

Ideas from grooming / backlog checkup

  • Ingest -> Kafka Topic -> job that reads the Kafka topic, switching from regular mode to bulk load (see the hypothetical payload sketch below)
  • Need ability to load plugins from private repositories or manually publish them
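Purely as a grooming talking point, the batch lifecycle messages could carry a flag telling the downstream job to take the bulk path. In the abridged sketch below, the bulk field (and the BatchStart shape) is hypothetical and not part of any current contract; the BatchEnd fields are copied from the example earlier on this page:

{ "typ": "BatchStart", "bulk": true }
{ "typ": "BatchEnd", "count": 10, "bulk": true }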

  • requires naveegoinc/go-between#307
  • requires naveegoinc/nucleus#7
  • requires naveegoinc/nucleus#9
  • requires naveegoinc/nucleus#10
  • requires naveegoinc/nucleus#13
  • requires naveegoinc/go-between#314

required by naveegoinc/stories#1696
