naveego / dataflow-contracts
JSON Schemas, protobuf definitions, OpenAPI files, etc. for establishing shared contracts related to data flow.
License: Apache License 2.0
values-type-fix
required by naveegoinc/stories#878
Conditional merge rules
required by naveegoinc/stories#181
Add the changedData on the CompositeRecord Payload
The mergeRuleId property on the CompositeRecord Payload of a Dataflow Event from MDM was renamed to shapeId.
We should update the dataflow contracts so that all services communicate with the same language.
Original Error from dataflow-api
JsonSerializationException: Required property 'mergeRuleId' not found in JSON. Path ''.
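One way to ride out a rename like this while producers and consumers are updated at different times is a tolerant reader that accepts either key. The sketch below is an illustration only: `read_shape_id` is a hypothetical helper, not something dataflow-api is known to contain, and the sample payloads are made up.

```python
def read_shape_id(payload: dict) -> str:
    """Return the shape id from a CompositeRecord-style payload.

    Prefers the new 'shapeId' key and falls back to the legacy
    'mergeRuleId' key (hypothetical transition behavior).
    """
    if "shapeId" in payload:
        return payload["shapeId"]
    if "mergeRuleId" in payload:
        return payload["mergeRuleId"]
    raise KeyError("payload has neither 'shapeId' nor 'mergeRuleId'")


# Made-up payloads, one per naming generation.
new_style = {"typ": "CompositeRecord", "shapeId": "abc"}
old_style = {"typ": "CompositeRecord", "mergeRuleId": "abc"}
print(read_shape_id(new_style), read_shape_id(old_style))
```

Once every service emits shapeId, the fallback branch can be dropped so the contract has a single name again.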
We need to add version to the MergeRule. See naveegoinc/metabase#40
When generating C# models with NJsonSchema, Newtonsoft.Json.Required.DisallowNull is added to non-required properties, which throws serialization exceptions when those properties contain a valid null value.
NJsonSchema doesn't currently have a setting to override null handling behavior: RicoSuter/NJsonSchema#838
In the meantime, we are going to ensure that non-required properties in the dataflow contracts have type: [ "null", {propertyType} ] (it also seems to be common to add null to the type in JSON Schema).
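The workaround can be applied mechanically. The following is a minimal sketch, assuming each contract is a JSON Schema object with top-level "properties" and "required" keys; `allow_null_on_optional` and the sample schema are hypothetical and not part of this repository.

```python
import copy


def allow_null_on_optional(schema: dict) -> dict:
    """Return a copy of a JSON Schema object in which every non-required
    property that declares a "type" also accepts null."""
    result = copy.deepcopy(schema)
    required = set(result.get("required", []))
    for name, prop in result.get("properties", {}).items():
        if name in required or "type" not in prop:
            continue
        declared = prop["type"]
        types = declared if isinstance(declared, list) else [declared]
        if "null" not in types:
            prop["type"] = ["null", *types]
    return result


# Hypothetical schema fragment, for illustration only.
merge_rule = {
    "type": "object",
    "required": ["shapeId"],
    "properties": {
        "shapeId": {"type": "string"},
        "description": {"type": "string"},
    },
}
patched = allow_null_on_optional(merge_rule)
print(patched["properties"]["description"]["type"])  # ['null', 'string']
```

Required properties are left untouched, so a required property that should also allow null still has to be edited by hand.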
Currently BatchEnd (and any other message with a null enum) will not be processed by dataflow-server since a null enum cannot be deserialized.
kafka message:
{
"bid": "bgusntnfq0sh4459m4v0",
"cid": "bgusnuvfq0sh4459m5e0",
"d": { "typ": "BatchEnd", "count": 10, "message": "", "reason": "" },
"id": "bgusnuvfq0sh4459m5eg",
"jid": "bgywpb99wwfg000gy6k0",
"rid": "",
"sid": "",
"tid": "vandelay"
}
2019-01-15T12:03:08.240309369Z fail: Dataflow.Server.ProcessorOrchestrator[0]
2019-01-15T12:03:08.240331778Z Error deserializing message **supplied above** while consuming / processing message, removing from Kafka Topic vandelay.go-between.ingestion: Error converting value "" to type 'Dataflow.Server.Models.BatchEndReason'. Path 'reason'.
2019-01-15T12:03:08.240339912Z Newtonsoft.Json.JsonSerializationException: Error converting value "" to type 'Dataflow.Server.Models.BatchEndReason'. Path 'reason'. ---> System.ArgumentException: Must specify valid information for parsing in the string.
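The failure above comes from an empty string arriving where an enum value is expected. A consumer could treat an empty or null value as "no reason given" before converting. The sketch below shows that idea in Python rather than in the server's C#; the enum members are hypothetical because the real BatchEndReason values are not listed in this issue.

```python
from enum import Enum
from typing import Optional


class BatchEndReason(Enum):
    # Hypothetical members; the real contract's values are not shown here.
    COMPLETED = "completed"
    CANCELLED = "cancelled"


def parse_reason(raw: Optional[str]) -> Optional[BatchEndReason]:
    """Map None or "" to None; convert any other value to the enum."""
    if raw is None or raw == "":
        return None
    return BatchEndReason(raw)


# The "d" object from the Kafka message above.
batch_end = {"typ": "BatchEnd", "count": 10, "message": "", "reason": ""}
print(parse_reason(batch_end["reason"]))  # None
```

Unknown non-empty strings still raise ValueError, so genuinely invalid values are not silently accepted.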
matchData is being removed because it is an internal detail and does not need to be emitted as a property on the MatchGroup Payload of a DataflowEvent.
ss-contracts-update
required by naveegoinc/stories#1467
Update gRPC contracts for replication
required by naveegoinc/stories#376
required by naveegoinc/stories#188
required by naveegoinc/uat#69
We should remove the datapoints folder to avoid confusion now that datapoints are defined under events.
Unable to deserialize the ResourceUpdated Payload from vandelay.metabase.shapes.changes in dataflow-server because the required grouping property is missing:
Error deserializing message *message below* while consuming / processing message, removing from Kafka Topic vandelay.metabase.shapes.changes: Required property 'grouping' not found in JSON. Path 'matchRule'.
{
"id": "bh333agt000000ei4gl0",
"tid": "vandelay",
"sid": "nv1:n5o.red::metabase:self/[email protected]",
"rid": "nr1:n5o.red:vandelay:metabase:shapes/bh31r78jvh50000pzjv0",
"rids": [],
"cid": "bh333agt000000ei4gkg",
"m": {},
"d": {
"typ": "ResourceUpdated",
"b": {
"typ": "MetabaseShapeResource",
"id": "bh31r78jvh50000pzjv0",
"version": 3,
"name": "SQL User",
"description": "/home/chris/Documents/sql-biannual-database-permission-overview-stonebridge.csv",
"properties": [
{
"id": "bh31r78jvh50000pzjvg",
"name": "permission",
"description": null,
"type": "String",
"isUnique": false,
"security": null,
"isNullable": false
},
{
"id": "bh31r78jvh50000pzjw0",
"name": "user",
"description": null,
"type": "String",
"isUnique": false,
"security": null,
"isNullable": false
},
{
"id": "bh31r78jvh50000pzjwg",
"name": "database",
"description": null,
"type": "String",
"isUnique": false,
"security": null,
"isNullable": false
}
],
"labels": {},
"isMdmShape": true,
"matchRule": {
"shapeId": "bh31r78jvh50000pzjv0",
"version": 1,
"mcl": 1.0,
"type": "jaro-winkler",
"settings": { "properties": ["bh31r78jvh50000pzjw0"] }
},
"mergeRule": { "version": 0, "properties": {} },
"copiedFromSchemaId": "bh31qx7jvh50000dvbw0",
"createdAt": "2019-01-21T19:23:09.674Z",
"createdBy": "jeGlKZctiebaOX1plqJyFIfHCG",
"updatedAt": "2019-01-21T19:24:06.282Z",
"updatedBy": "jeGlKZctiebaOX1plqJyFIfHCG",
"deletedAt": null,
"deletedBy": null
},
"a": {
"typ": "MetabaseShapeResource",
"id": "bh31r78jvh50000pzjv0",
"version": 4,
"name": "SQL User",
"description": "/home/chris/Documents/sql-biannual-database-permission-overview-stonebridge.csv",
"properties": [
{
"id": "bh31r78jvh50000pzjvg",
"name": "permission",
"description": "",
"type": "String",
"isUnique": false,
"security": null,
"isNullable": false
},
{
"id": "bh31r78jvh50000pzjw0",
"name": "user",
"description": null,
"type": "String",
"isUnique": false,
"security": null,
"isNullable": false
},
{
"id": "bh31r78jvh50000pzjwg",
"name": "database",
"description": null,
"type": "String",
"isUnique": false,
"security": null,
"isNullable": false
}
],
"labels": {},
"isMdmShape": true,
"matchRule": {
"shapeId": "bh31r78jvh50000pzjv0",
"version": 1,
"mcl": 1.0,
"type": "jaro-winkler",
"settings": { "properties": ["bh31r78jvh50000pzjw0"] }
},
"mergeRule": {
"version": 0,
"properties": {
"bh31r78jvh50000pzjvg": {
"propertyId": "bh31r78jvh50000pzjvg",
"connections": ["bh31qmyjvh50000dvbvg"]
},
"bh31r78jvh50000pzjw0": {
"propertyId": "bh31r78jvh50000pzjw0",
"connections": ["bh31qmyjvh50000dvbvg"]
},
"bh31r78jvh50000pzjwg": {
"propertyId": "bh31r78jvh50000pzjwg",
"connections": ["bh31qmyjvh50000dvbvg"]
}
}
},
"copiedFromSchemaId": "bh31qx7jvh50000dvbw0",
"createdAt": "2019-01-21T19:23:09.674Z",
"createdBy": "jeGlKZctiebaOX1plqJyFIfHCG",
"updatedAt": "2019-01-21T20:55:06.322Z",
"updatedBy": "jeGlKZctiebaOX1plqJyFIfHCG",
"deletedAt": null,
"deletedBy": null
},
"m": {}
}
}
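A mismatch like this can be caught before a message reaches the consumer by walking the payload against the contract's "required" lists. This is a rough sketch under stated assumptions: `missing_required` is a hypothetical helper, and the schema fragment is invented to mirror the error (the real contract is not reproduced in this issue).

```python
def missing_required(schema: dict, value: dict, path: str = "") -> list:
    """List dotted paths of required properties absent from value,
    recursing into nested objects that the schema describes."""
    problems = []
    for name in schema.get("required", []):
        if name not in value:
            problems.append(f"{path}{name}: missing")
    for name, sub_schema in schema.get("properties", {}).items():
        child = value.get(name)
        if isinstance(child, dict):
            problems.extend(missing_required(sub_schema, child, f"{path}{name}."))
    return problems


# Invented schema fragment: matchRule requires shapeId and grouping.
shape_schema = {
    "required": ["matchRule"],
    "properties": {
        "matchRule": {"required": ["shapeId", "grouping"], "properties": {}},
    },
}
# matchRule as it appears in the message above (no grouping key).
shape = {
    "matchRule": {
        "shapeId": "bh31r78jvh50000pzjv0",
        "version": 1,
        "mcl": 1.0,
        "type": "jaro-winkler",
        "settings": {"properties": ["bh31r78jvh50000pzjw0"]},
    }
}
print(missing_required(shape_schema, shape))  # ['matchRule.grouping: missing']
```

The check only reports absent keys; it does not validate types or null values.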
dq-rules
required by naveegoinc/stories#867
Needed by naveegoinc/go-between#6
Add Multi-Stage Matching
required by naveegoinc/stories#209
Enriched is not currently nullable on the Datapoint, and this is causing deserialization issues in the dataflow server.
Either this issue is valid or https://github.com/naveegoinc/stories/issues/147 needs to be revisited to change the matching service to set the enr.
fail: Dataflow.Server.ProcessorOrchestrator[0]
Error deserializing message *message below*,"ruleId":"bg81p4g9wwfg000bd37g"}} while consuming / processing message, removing from Kafka Topic vandelay.matching.groups: Required property 'enr' expects a non-null value. Path ''.
Newtonsoft.Json.JsonSerializationException: Required property 'enr' expects a non-null value. Path ''.
{
"rid": "bgnme2f4lrug318004ig",
"id": "bgnme4f4lrug318009og",
"cid": "bgnme4f4lrug318009p0",
"tid": "vandelay",
"d": {
"matchData": "16492",
"id": "bgnme2f4lrug318004ig",
"typ": "MatchGroup",
"dataPoints": [
{
"rid": "PqhZZy26L08DnSZiXVVuIArPZut6H56DUXftumo8lBA",
"jid": "bggjs8d9wwfg000yzrt0",
"m": null,
"rids": null,
"id": "bggiqekeasq6hkl1aot0",
"cid": "bggiqekeasq6hkl1aosg",
"trc": null,
"tid": "vandelay",
"bid": "bggipdceasq6hkl0rekg",
"sid": "",
"d": {
"s": "bg81p4g9wwfg000bd37g",
"a": "ups",
"enr": null,
"typ": "DataPoint",
"c": "bggjrd99wwfg0005m0xg",
"d": {
"bg81p4g9wwfg000bd380": "16492",
"bg81p4g9wwfg000bd3e0": "2008-05-31T00:00:00Z",
"bg81p4g9wwfg000bd390": "false",
"bg81p4g9wwfg000bd39g": null,
"bg81p4g9wwfg000bd3bg": null,
"bg81p4g9wwfg000bd3ag": "E",
"bg81p4g9wwfg000bd38g": "IN",
"bg81p4g9wwfg000bd3c0": "0",
"bg81p4g9wwfg000bd3dg": "JlkeqfcjPUuxHJv7sYwY6A==",
"bg81p4g9wwfg000bd3b0": "Mitchell",
"bg81p4g9wwfg000bd3d0": "<IndividualSurvey xmlns=\"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey\"><TotalPurchaseYTD>2452.34</TotalPurchaseYTD><DateFirstPurchase>2004-05-31Z</DateFirstPurchase><BirthDate>1972-12-02Z</BirthDate><MaritalStatus>M</MaritalStatus><YearlyIncome>50001-75000</YearlyIncome><Gender>M</Gender><TotalChildren>1</TotalChildren><NumberChildrenAtHome>0</NumberChildrenAtHome><Education>Graduate Degree</Education><Occupation>Professional</Occupation><HomeOwnerFlag>1</HomeOwnerFlag><NumberCarsOwned>0</NumberCarsOwned><CommuteDistance>0-1 Miles</CommuteDistance></IndividualSurvey>",
"bg81p4g9wwfg000bd3a0": "Jordan",
"bg81p4g9wwfg000bd3cg": null
}
}
}
],
"ruleId": "bg81p4g9wwfg000bd37g"
}
}
Add bulk fields for batch start and end
As a Naveego Platform user
I would like to bulk load my transactional replication data
So that I can achieve better replication performance for transactional data replications
GIVEN a transactional GR
WHEN a connection is being created or selected for a replication job
THEN only plugins flagged for bulk replication capability are available to the user
AND the bulk option is automatically enabled for that plugin
GIVEN an active bulk replication job
WHEN data is queued up for replication
THEN it is replicated using 'bulk' options for better performance
Existing plugins that need this option added
Oracle Autonomous Data Warehouse
Future Plugins that will need this option:
Snowflake
Redshift
Azure SQL Data WH
Ingest -> Kafka Topic -> job that reads kafka topic switching from regular mode to bulk load
Need ability to load plugins from private repositories or manually publish them
required by naveegoinc/stories#1696
required by naveegoinc/stories#138