Comments (7)
Hello,
Could you print d2.schema() and post it here?
What version of the org.apache.spark:spark-avro library is on your classpath?
from abris.
Hi @joserivera1990 Unfortunately, your Avro schema is invalid. Per the Avro spec, "Scale must be [...] less than or equal to the precision.", see https://avro.apache.org/docs/current/spec.html#Decimal, which is not the case here (scale 127 > precision 64).
What happens is that the logical type (decimal) is not validated when the schema is parsed and instead falls back to null. So, Spark doesn't see the Avro logical type decimal and interprets the field as a BinaryType instead of a DecimalType.
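The spec rule can be sketched in a few lines of plain Python (this is only an illustration, not Abris or Avro library code): a decimal logical type is valid only when the precision is positive and the scale is between zero and the precision, so a field with precision 64 and scale 127 is rejected, and Avro parsers then silently ignore the logical type.

```python
def is_valid_avro_decimal(precision: int, scale: int) -> bool:
    # Avro spec: precision must be positive; scale must be zero or positive
    # and less than or equal to the precision.
    return precision > 0 and 0 <= scale <= precision

print(is_valid_avro_decimal(64, 127))  # False: scale exceeds precision
print(is_valid_avro_decimal(38, 10))   # True
```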
To solve the issue, you could
- fix the schema generation process,
- or change the Avro schema on the schema registry,
- or provide a fixed reader schema to Abris using the .provideReaderSchema config method (see https://github.com/AbsaOSS/ABRiS/blob/master/documentation/confluent-avro-documentation.md#avro-to-spark for examples).
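As a sketch of the last option, one could load the field schema from the registry, patch the decimal attributes to values Spark accepts, and pass the resulting JSON string to .provideReaderSchema. The field name below is taken from this thread; the patched precision/scale values are assumptions, and the real schema contains more fields.

```python
import json

# Invalid field schema as reported in the thread (scale 127 > precision 64)
field = {
    "name": "TEST_NUMBER",
    "type": ["null", {
        "type": "bytes",
        "logicalType": "decimal",
        "precision": 64,
        "scale": 127,
    }],
    "default": None,
}

# Patch to values Spark can handle (Spark's DecimalType supports precision <= 38)
decimal_branch = field["type"][1]
decimal_branch["precision"] = 38
decimal_branch["scale"] = 10

reader_field = json.dumps(field)
print(reader_field)
```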
Hi @cerveada,
I added a print of d2.schema():
StructType(StructField(STFAMPRO,StringType,true), StructField(CHFAMPRO,StringType,true), StructField(TEST_NUMBER,BinaryType,true), StructField(TEST_NUMBER_DECIMAL,BinaryType,true), StructField(table,StringType,true), StructField(SCN_CMD,StringType,true), StructField(OP_TYPE_CMD,StringType,true), StructField(op_ts,StringType,true), StructField(current_ts,StringType,true), StructField(row_id,StringType,true), StructField(username,StringType,true))
Checking my external libraries, I have org.apache.spark:spark-avro_2.12:2.4.8 on the classpath.
Regards.
Hi @kevinwallimann,
I got your points. About changing the schema generation process: this schema is generated by Confluent's Oracle CDC connector, see
https://docs.confluent.io/kafka-connect-oracle-cdc/current/troubleshooting.html#numeric-data-type-with-no-precision-or-scale-results-in-unreadable-output
I tested the .provideReaderSchema function, setting precision: 38 and scale: 10 in the schema:
{"name":"TEST_NUMBER","type":["null",{"type":"bytes","scale":10,"precision":38,"connect.version":1,"connect.parameters":{"scale":"10"},"connect.name":"org.apache.kafka.connect.data.Decimal","logicalType":"decimal"}],"default":null}
It throws the following error: Decimal precision 128 exceeds max precision 38
Finally, I think there should be some way to get the number with a scale of 127 and a precision of 64, maybe as a string instead of a decimal. For example, I'm using the connector com.snowflake.kafka.connector.SnowflakeSinkConnector with value.converter set to io.confluent.connect.avro.AvroConverter.
In the Snowflake database, that row is saved this way:
{
"STFAMPRO": "AA",
"TEST_NUMBER": "5.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
"TEST_NUMBER_DECIMAL": "5.1500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
"current_ts": "1651059748133"
}
The problem is that I don't know how Snowflake implements the com.snowflake.kafka.connector.SnowflakeSinkConnector connector.
Thanks for your time!
Hi @joserivera1990 I see, the problem now is that your data has decimal precision 128 which is larger than the maximum that Spark supports (38). In this case, Spark uses the BinaryType as a fallback. You could try to convert the BinaryType to a human-readable format after it's already in a Spark Dataframe. I think this problem should be solved outside of Abris.
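For the conversion step, the raw bytes of an Avro decimal are simply the unscaled value in two's-complement big-endian form, so the decoding could look like the following plain-Python sketch (something one might wrap in a Spark UDF; this is not part of Abris):

```python
from decimal import Decimal

def decode_avro_decimal(raw: bytes, scale: int) -> Decimal:
    # Avro stores a decimal's unscaled value as two's-complement big-endian bytes
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Shift the decimal point left by `scale` digits
    return Decimal(unscaled).scaleb(-scale)

# 5.15 with scale 2 has unscaled value 515, encoded as b'\x02\x03'
raw = (515).to_bytes(2, byteorder="big", signed=True)
print(decode_avro_decimal(raw, 2))  # 5.15
```

Python's Decimal handles arbitrary precision, so this works even for scale 127, where a float would not.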
Just for the sake of completeness, there is a way to have your own custom logic to convert from Avro to a Spark Dataframe in Abris, see https://github.com/AbsaOSS/ABRiS#custom-data-conversions. However, it's quite involved and I wouldn't recommend that approach.
Hi @kevinwallimann, I will review your advice. Thanks for your time!
Hi everyone,
This issue was solved in Oracle CDC connector version 2.0.0 by adding the property numeric.mapping with the value best_fit_or_decimal.
https://docs.confluent.io/kafka-connect-oracle-cdc/current/configuration-properties.html
Explanation from the Confluent connector documentation:
"Use best_fit_or_decimal if NUMERIC columns should be cast to Connect's primitive type based upon the column's precision and scale. If the precision and scale exceed the bounds for any primitive type, Connect's DECIMAL logical type will be used instead."
This way, when an Oracle column is NUMERIC without precision or scale, the schema is registered with the field as double. The connector only falls back to the decimal type when the value exceeds what a double can represent.
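A quick illustration of why double is only a "best fit": an IEEE 754 double represents every integer exactly only up to 2^53 (roughly 15-16 decimal digits), after which distinct values collapse onto the same double. The sketch below uses plain Python floats, which are binary64 doubles.

```python
# Doubles (IEEE 754 binary64) represent every integer exactly only up to 2**53.
# Beyond that, distinct values round to the same double, which is why wide
# NUMERIC columns need a decimal logical type instead of a primitive double.
exact_limit = 2 ** 53

print(float(exact_limit) == float(exact_limit + 1))  # True: precision lost
print(float(exact_limit - 1) == float(exact_limit))  # False: still exact
```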
My new schema registry version is:
{"type":"record","name":"ConnectDefault","namespace":"io.confluent.connect.avro","fields":[{"name":"STFAMPRO","type":["null","string"],"default":null},{"name":"CHFAMPRO","type":["null","string"],"default":null},{"name":"TEST_NUMBER","type":["null","double"],"default":null},{"name":"TEST_NUMBER_DECIMAL","type":["null","double"],"default":null},{"name":"table","type":["null","string"],"default":null},{"name":"SCN_CMD","type":["null","string"],"default":null},{"name":"OP_TYPE_CMD","type":["null","string"],"default":null},{"name":"op_ts","type":["null","string"],"default":null},{"name":"current_ts","type":["null","string"],"default":null},{"name":"row_id","type":["null","string"],"default":null},{"name":"username","type":["null","string"],"default":null}]}
I will close this issue. Thanks everyone!