Set of tools to materialize raw data, provided by Debezium CDC and landed with Kafka-Connect.
In this package there is an operation, which helps to generate landing tables for debezium sources.
Operation create_source_table
Generates the source table in database and prints the YAML for it to the output.
Arguments:
table_name
Required Type:string
. A name of a table to create.schema_name
Optional Type:string
orNone
. DefaultNone
. A name of a schema to create the table in. If empty, the schema would be taken from target. If the schema does not exist, the operation attempts to create it.role_name
Optional Type:string
orNone
. DefaultNone
. A name of a role to grant required permissions. IfNone
-- no permissions will be grant.db
Optional Type:string
orNone
. DefaultNone
. A name of a database to create the table and the schema. IfNone
-- the database from a profile is used. Taking effect only for platforms, supporting multi-database adapters.aliases
Optional Type:Dict[string,string]
orNone
. DefaultNone
. Default technical column aliases. Takes into the account keysrow_id
andloaded_at
. The values are names for the related columns. IfNone
or the keys are missed -- values are taken from varsrow_id__alias
andloaded_at__alias
. The default values for those arerolling_row_id
andsnowpipe_loaded_at
.dry_run
Optional Type:boolean
DefaultFalse
. If it should really apply the queries to the database or just prit it.
Example:
$ dbt run-operation create_source_table \
--args '{db: raw_layer, schema_name: kafka_connect, table_name: my_lending_table}'
To stage the table with an actual state the model should be defined with the following macro:
Macro to_stg_layer
Generates the model internals to stage a source with actual states.
Arguments:
source_ref
Required Type:Source
The Source definition.fields
Required Type:List[string | Dict[string,string] | Dict[string,Dict[string,boolean]]
. List of the fields not including the Primary Key field. Each field could be defined with a name (as a string), with single-key dict, containing the field name as the key and an expresssion to cast it to some type in the python format string format as the, or with dict{"is_date": true}
as value.incremental
Required Type:boolean
. The value ofis_incremental()
macro.source_engine
Required Type:string
The type of source database engine. Currently supports onlypostgresql
value.contains_prev_state
Optional Type:boolean
DefaultFalse
. If the source contains previous values. For more information see Debezium documentation.aliases
Optional Type:Dict[string,string]
orNone
. DefaultNone
. Default technical column aliases. Takes into the account keysis_deleted
,deleted_at
,ts
,row_id
andloaded_at
. The values are names for the related columns. IfNone
or the keys are missed -- values are taken from varsis_deleted__alias
,deleted_at__alias
,ts__alias
,row_id__alias
andloaded_at__alias
. The default values for those areis_deleted
, deleted_at,
ts_msrolling_row_id
andsnowpipe_loaded_at
.