avro-kotlin / avro4k
Avro format support for Kotlin
License: Apache License 2.0
For example:
fun main() {
Avro.default.encodeToByteArray(ReferencingSealedClass.serializer(), ReferencingSealedClass(Operation.Nullary))
}
@Serializable
sealed class Operation {
@Serializable
object Nullary : Operation()
@Serializable
sealed class Unary() : Operation(){
abstract val value : Int
@Serializable
data class Negate(override val value:Int) : Unary()
}
@Serializable
sealed class Binary() : Operation(){
abstract val left : Int
abstract val right : Int
@Serializable
data class Add(override val left : Int, override val right : Int) : Binary()
@Serializable
data class Substract(override val left : Int, override val right : Int) : Binary()
}
}
@Serializable
data class ReferencingSealedClass(
val notNullable: Operation
)
While serialization and deserialization work, the ByteArray returned by encodeToByteArray does not seem to be deterministic.
For example:
@Serializable
data class TaskOptions(
val runningTimeout: Float? = null
)
fun main() {
val m = TaskOptions(2F)
val b1 = Avro.default.encodeToByteArray(TaskOptions.serializer(), m)
val b2 = Avro.default.encodeToByteArray(TaskOptions.serializer(), m)
println(b1.contentEquals(b2))
}
will print "false".
Besides being strange, this is a problem for some unit tests.
Hi,
I currently have a use case with nested subtypes like the following:
interface Vehicle
@Serializable
data class Car(val color: String) : Vehicle
@Serializable
data class Person(val name: String, val vehicle: Vehicle)
val personRecord = Avro.default.toRecord(Person.serializer(), Person("Bob", Car("blue")))
Currently this throws
Exception in thread "main" kotlinx.serialization.SerializationException: Unsupported type kotlin.Any of POLYMORPHIC
Is serialization for something like this even possible?
If so, are you planning on supporting Polymorphic types in future releases?
If not, do you know of any workarounds for this?
Thanks,
Neo
I think it would be nice to have a way of generating the data classes from existing avsc schema files. I'm assuming that's possible.
Consider the following class definition:
@Serializable
@AvroDoc("Playing cards")
@AvroAliases(["MySuit"])
enum class Suit {
SPADES, HEARTS, DIAMONDS, CLUBS;
}
@Serializable
@AvroDoc("simple record doc")
@AvroAlias("Alias")
data class SimpleRecord (
val value : Suit
)
The generated schema looks like this:
{
"type" : "record",
"name" : "SimpleRecord",
"doc" : "simple record doc",
"fields" : [ {
"name" : "value",
"type" : {
"type" : "enum",
"name" : "Suit",
"symbols" : [ "SPADES", "HEARTS", "DIAMONDS", "CLUBS" ]
}
} ],
"aliases" : [ "Alias" ]
}
The enum type does not have the defined aliases and doc. It should look like this:
{
"type" : "record",
"name" : "SimpleRecord",
"doc" : "simple record doc",
"fields" : [ {
"name" : "value",
"type" : {
"type" : "enum",
"name" : "Suit",
"doc" : "Playing cards",
"symbols" : [ "SPADES", "HEARTS", "DIAMONDS", "CLUBS" ],
"aliases" : [ "MySuit" ]
}
}],
"aliases" : [ "Alias" ]
}
Hi!
I have just encountered an issue when serializing a data class with a ByteArray field using avro4k together with kafka-avro-serializer.
The data class I tried to serialize looks like this:
@Serializable
data class File(val content: ByteArray)
Here's how I'm serializing it:
Avro.default.toRecord(serializer, data).let { record ->
return kafkaSerializer.serialize(topic, record)
}
The problem is that in the serialized record the array in the File#content field is always empty.
I traced the problem back to the way the ByteBuffer instances are created in ByteArrayEncoder.
The created buffers always have the current position set to the end of the data, i.e. position == limit.
When a serializer attempts to read from such buffers, they appear empty because ByteBuffer#hasRemaining always returns false.
This in turn results in empty arrays in serialized records.
I think this is a bug, as byte buffers created by this library should be ready to be read by consumers of the records.
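The position == limit symptom can be reproduced with the JDK alone. The following is a minimal sketch, assuming the buffer is filled with put() and never flipped; how ByteArrayEncoder actually builds its buffers is not shown in this report, and the function names are hypothetical.

```kotlin
import java.nio.ByteBuffer

// Hypothetical reproduction: after put(), position == limit, so nothing is readable.
fun filledWithoutFlip(bytes: ByteArray): ByteBuffer {
    val buffer = ByteBuffer.allocate(bytes.size)
    buffer.put(bytes) // position advances to bytes.size, which equals the limit
    return buffer
}

// Same buffer, but flipped: limit = old position, position = 0, ready for reading.
fun filledAndFlipped(bytes: ByteArray): ByteBuffer {
    val buffer = ByteBuffer.allocate(bytes.size)
    buffer.put(bytes)
    buffer.flip()
    return buffer
}

fun main() {
    val content = byteArrayOf(1, 2, 3)
    println(filledWithoutFlip(content).hasRemaining()) // false: a reader sees an empty buffer
    println(filledAndFlipped(content).remaining())     // 3
    println(ByteBuffer.wrap(content).remaining())      // 3: wrap() starts at position 0
}
```

Either flip() or ByteBuffer.wrap() leaves the buffer in a state where a consumer can read all bytes.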
I'd like to serialize the same data classes to both avro and to json using @ContextualSerialization annotation. Unfortunately, currently avro4k does not generate the correct schema.
Example for repro:
@Serializable
data class Contextual(
@ContextualSerialization val time: Instant
)
@Serializable
data class SerializableWith(
@Serializable(with = InstantSerializer::class) val time: Instant
)
fun main() {
println(Avro(serializersModule(InstantSerializer())).schema(Contextual.serializer()))
println(Avro(serializersModule(InstantSerializer())).schema(SerializableWith.serializer()))
}
The output is:
{"type":"record","name":"Contextual","fields":[{"name":"time","type":{"type":"record","name":"CONTEXT","fields":[]}}]}
{"type":"record","name":"SerializableWith","fields":[{"name":"time","type":{"type":"long","logicalType":"timestamp-millis"}}]}
I would expect the first line to be the same as the second one.
I tried to use this code to encode to a stream:
val serializer = Foo.serializer()
Avro.default.openOutputStream(serializer) {
encodeFormat = AvroEncodeFormat.Binary
}.to(outStream).write(value).flush()
and this code to decode:
val serializer = Foo.serializer()
Avro.default.openInputStream(serializer) {
decodeFormat = AvroDecodeFormat.Binary(Avro.default.schema(serializer))
}.from(inStream).next()
The underlying class that is being encoded is:
@Serializable
data class Foo(
@Serializable(with = LocalDateSerializer::class)
val d: LocalDate,
val f: Float,
val s: String
)
However, this appears to read more bytes than it writes, because the underlying system in which I'm using these classes throws an EOFException.
If I use the kotlinx-serialization default Json encoder and my own DataOutputStream and DataInputStream wrappers to writeUTF and readUTF (and a custom implementation of LocalDateSerializer), everything works fine.
I admit I haven't done a lot more investigation into what I could potentially be doing wrong, because honestly I don't have time to do it, but thought I would throw this out there. Any ideas?
It's not clear where I'm going wrong, but I'm trying to use Kafka, the Kafka Schema Registry, and Avro4k together.
I'm seeing the exception below with the code I've provided further down as a kotest test case.
I'm probably doing something wrong here. Any tips would be much appreciated.
Error deserializing key/value for partition sample-event-topic-0 at offset 0. If needed, please seek past the record to continue consumption.
org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition sample-event-topic-0 at offset 0. If needed, please seek past the record to continue consumption.
(Coroutine boundary)
at io.kotest.core.runtime.TestExecutor$executeAndWait$2$1$3.invokeSuspend(TestExecutor.kt:176)
at io.kotest.core.runtime.ReplayKt.replay(replay.kt:19)
at io.kotest.core.runtime.TestExecutor$executeAndWait$2$1.invokeSuspend(TestExecutor.kt:175)
at io.kotest.core.runtime.TestExecutor.executeAndWait-xkB6VbI(TestExecutor.kt:168)
at io.kotest.core.runtime.TestExecutor.invokeTestCase(TestExecutor.kt:150)
at io.kotest.core.runtime.TestExecutor.executeActiveTest(TestExecutor.kt:117)
at io.kotest.core.runtime.TestExecutor$intercept$2.invokeSuspend(TestExecutor.kt:74)
at io.kotest.core.runtime.TestExecutor.executeIfActive(TestExecutor.kt:86)
at io.kotest.core.runtime.TestExecutor.intercept(TestExecutor.kt:74)
at io.kotest.core.runtime.TestExecutor.execute(TestExecutor.kt:55)
at io.kotest.core.engine.SingleInstanceSpecRunner.runTest(SingleInstanceSpecRunner.kt:62)
at io.kotest.core.engine.SingleInstanceSpecRunner$execute$2.invokeSuspend(SingleInstanceSpecRunner.kt:73)
at io.kotest.core.engine.SingleInstanceSpecRunner$execute$3.invokeSuspend(SingleInstanceSpecRunner.kt:79)
at io.kotest.core.engine.SpecExecutor$runTests$run$1.invokeSuspend(SpecExecutor.kt:105)
at io.kotest.core.engine.SpecExecutor.runTests(SpecExecutor.kt:108)
at io.kotest.core.engine.SpecExecutor.execute(SpecExecutor.kt:36)
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition sample-event-topic-0 at offset 0. If needed, please seek past the record to continue consumption.
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id 9
Caused by: org.apache.kafka.common.errors.SerializationException: SampleEvent specified by the writers schema could not be instantiated to find the readers schema.
class SampleAvro4kProducerConsumerExceptionSpec : StringSpec({
val bootstrapServers = "kafka:9092"
val schemaRegistryUrl = "kafka-schema-registry:7081"
val topic = "sample-event-topic"
"consuming from an Avro4k producer causes an exception" {
val consumer = SampleEventConsumer(bootstrapServers, schemaRegistryUrl, topic)
val producer = SampleEventProducer(bootstrapServers, schemaRegistryUrl, topic)
val uuid = UUID.randomUUID().toString()
val metadata = producer.publish(uuid, SampleEvent("foo-$uuid", "bar-$uuid")).get()
println("Published event with offset ${metadata.offset()}")
// TODO: This should not throw and should consume messages.
shouldThrow<SerializationException> {
consumer.poll()
}.message shouldContain("Error deserializing key/value for partition")
}
})
class SampleEventConsumer(
bootstrapServers: String,
schemaRegistryUrl: String,
topic: String
) : AutoCloseable {
private val consumerConfig = mapOf(
ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG to bootstrapServers,
ConsumerConfig.GROUP_ID_CONFIG to UUID.randomUUID().toString(),
ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG to "true",
ConsumerConfig.AUTO_OFFSET_RESET_CONFIG to "earliest",
ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG to StringDeserializer::class.java,
ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG to SampleEvent.Deserializer::class.java,
KafkaAvroDeserializerConfig.SCHEMA_REGISTRY_URL_CONFIG to schemaRegistryUrl,
KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG to "true"
)
private val consumer = KafkaConsumer<String, SampleEvent>(consumerConfig).apply {
subscribe(listOf(topic))
}
fun poll(): ConsumerRecords<String, SampleEvent> = consumer.poll(Duration.ofSeconds(1))
override fun close() = consumer.close()
}
class SampleEventProducer(
bootstrapServers: String,
schemaRegistryUrl: String,
private val topic: String
) : AutoCloseable {
private val producerConfig = mapOf(
ProducerConfig.BOOTSTRAP_SERVERS_CONFIG to bootstrapServers,
ProducerConfig.ACKS_CONFIG to "all",
ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG to StringSerializer::class.java,
ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG to SampleEvent.Serializer::class.java,
KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG to schemaRegistryUrl
)
private val producer = KafkaProducer<String, SampleEvent>(producerConfig)
fun publish(key: String, value: SampleEvent): Future<RecordMetadata> =
producer.send(ProducerRecord(topic, key, value))
override fun close() = producer.close()
}
@Serializable
data class SampleEvent(
val foo: String,
val bar: String
) {
class Serializer : KafkaSerializer<SampleEvent>(serializer())
class Deserializer : KafkaDeserializer<SampleEvent>(serializer())
}
open class KafkaSerializer<T> private constructor(
private val serializer: SerializationStrategy<T>,
private val kafkaSerializer: KafkaAvroSerializer
) : Serializer<T> {
constructor(serializer: SerializationStrategy<T>) : this(serializer, KafkaAvroSerializer())
override fun serialize(topic: String?, data: T): ByteArray =
Avro.default.toRecord(serializer, data).let { kafkaSerializer.serialize(topic, it) }
override fun configure(configs: MutableMap<String, *>?, isKey: Boolean) = kafkaSerializer.configure(configs, isKey)
override fun close() = kafkaSerializer.close()
}
open class KafkaDeserializer<T> private constructor(
private val deserializer: DeserializationStrategy<T>,
private val kafkaDeserializer: KafkaAvroDeserializer
) : Deserializer<T> {
constructor(serializer: DeserializationStrategy<T>) : this(serializer, KafkaAvroDeserializer())
override fun deserialize(topic: String?, data: ByteArray): T =
(kafkaDeserializer.deserialize(topic, data) as GenericRecord).let { record ->
Avro.default.fromRecord(deserializer, record)
}
override fun configure(configs: MutableMap<String, *>?, isKey: Boolean) = kafkaDeserializer.configure(configs, isKey)
override fun close() = kafkaDeserializer.close()
}
The new kotlin version 1.3.70 introduced a breaking change with kotlinx.serialization 0.14.0 (reference https://github.com/Kotlin/kotlinx.serialization/blob/master/CHANGELOG.md#0200--2020-03-04).
avro4k needs to migrate to kotlinx.serialization 0.20.0 in order to support kotlin >= 1.3.70.
Hi
Upon using @AvroDefault annotation, I noticed that there is no option to set a default empty list value.
when (fieldDescriptor.kind) {
PrimitiveKind.INT -> it.toInt()
PrimitiveKind.LONG -> it.toLong()
PrimitiveKind.FLOAT -> it.toFloat()
PrimitiveKind.BOOLEAN -> it.toBoolean()
PrimitiveKind.BYTE -> it.toByte()
PrimitiveKind.SHORT -> it.toShort()
PrimitiveKind.STRING -> it
else -> throw IllegalArgumentException("Cannot use a default value for type ${fieldDescriptor.kind}")
}
Could StructureKind.LIST be supported so that the default value could be an empty [] array?
Thank you
I just stumbled upon this library and it looks great! It would be really useful to understand how one might use this with Kafka. Specifically how to handle the lifecycle of changing schemas over time.
I can do this:
@Serializable
data class Foo(
@Contextual
val d: LocalDate
)
and create a JSON serializer like this:
Json {
serializersModule = SerializersModule {
contextual(LocalDateAsLongSerializable)
}
}
However, if I try to do a similar thing with Avro4k like this:
Avro(SerializersModule {
contextual(LocalDateSerializer())
})
at runtime I get the error:
Caused by: kotlinx.serialization.SerializationException: Unsupported type kotlinx.serialization.ContextualSerializer<LocalDate> of CONTEXTUAL
at com.github.avrokotlin.avro4k.schema.SchemaForKt.schemaFor(SchemaFor.kt:180)
at com.github.avrokotlin.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:78)
at com.github.avrokotlin.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:63)
at com.github.avrokotlin.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:43)
at com.github.avrokotlin.avro4k.Avro.schema(Avro.kt:266)
at com.github.avrokotlin.avro4k.Avro.schema(Avro.kt:269)
at com.github.avrokotlin.avro4k.AvroOutputStreamBuilder.to(Avro.kt:121)
Hi,
I'm trying to use the RecordEncoder for a data class with a nullable nested data class like so:
@Serializable
data class Town(val name: String, val population: Int)
@Serializable
data class Birthplace(val name: String, val town: Town?)
However, when I run this code:
val birthplaceSchema = Avro.default.schema(Birthplace.serializer())
val b = Birthplace("sammy", Town("Hardwick", 123))
val record = Avro.default.toRecord(Birthplace.serializer(), b)
I get the error:
Exception in thread "main" org.apache.avro.AvroRuntimeException: Not a record: ["null",{"type":"record","name":"Town","namespace":"com.sksamuel.avro4k","fields":[{"name":"name","type":"string"},{"name":"population","type":"int"}]}]
After some digging, I found that it came from RecordEncoder.fieldSchema(). So I changed the function from:
override fun fieldSchema(): Schema = schema.fields[currentIndex].schema()
to:
override fun fieldSchema(): Schema = when (schema.fields[currentIndex].schema().type) {
Schema.Type.UNION -> schema.fields[currentIndex].schema().extractNonNull()
else -> schema.fields[currentIndex].schema()
}
Anyway, this change got rid of the error and I was able to properly convert the data class into a ListRecord.
I was just wondering if you would consider adding this change.
Hey @sksamuel, first of all, thanks for the great library! I really like the easy way of serialization.
I'm currently writing a Maven plugin that compiles Avro schema files to avro4k-compatible Kotlin code, as requested by #19.
I tried to support record aliases, but it seems that the definition of "@AvroAlias" does not support repeated annotations. As multiple aliases are a feature of Avro, I think this is a bug in avro4k.
I am trying to serialize as follows:
@ImplicitReflectionSerializer
class ClientService {
suspend fun sendCommand(client: Client) {
creatCommand(
"create-client", client.id, CommandStatus.Open, client,
"localhost:9092", true, "http://localhost:8081",
Acks.All, Int.MAX_VALUE, 5, Compression.Snappy, 20, BatchSize.ThirtyTwo
)
}
private suspend fun creatCommand(
topicName: String,
id: UUID,
commandStatus: CommandStatus,
message: Client,
bootstrapServers: String,
idempotence: Boolean,
schemaUrl: String,
acks: Acks,
retries: Int,
requestPerConnection: Int,
compression: Compression,
linger: Int,
batchSize: BatchSize
) {
val producer = producer(
bootstrapServers,
idempotence,
schemaUrl,
acks,
retries,
requestPerConnection,
compression,
linger,
batchSize
)
Avro.default.schema(Command.serializer())
Avro.default.schema(Client.serializer())
Avro.default.schema(Phone.serializer())
val command = Command(id, commandStatus, message)
val commandAvro = Avro.default.toRecord(Command.serializer(), command)
val record = ProducerRecord<String, GenericRecord>(topicName, id.toString(), commandAvro)
coroutineScope { launch { producer.dispatch(record) } }
}
}
And I'm getting the following error:
org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
Caused by: java.lang.NullPointerException: null of string in field email of com.rjdesenvolvimento.accesses.orchestration.client.Client in field client of com.rjdesenvolvimento.accesses.orchestration.common.command.Command
The Types table lists Set<T> as a supported type, but it does not work (or I'm doing it wrong). The following example
import com.sksamuel.avro4k.Avro
import kotlinx.serialization.Serializable
@Serializable
data class S(val t: Set<Int>)
fun main() {
val r = Avro.default.toRecord(S.serializer(), S(setOf(1)))
val back = Avro.default.fromRecord(S.serializer(), r) // this line fails
}
fails with the following error:
Exception in thread "main" kotlinx.serialization.SerializationException: class com.sksamuel.avro4k.decoder.ListDecoder can't retrieve untyped values
at kotlinx.serialization.ElementValueDecoder.decodeValue(ElementWise.kt:106)
at kotlinx.serialization.ElementValueDecoder.decodeInt(ElementWise.kt:115)
at kotlinx.serialization.internal.IntSerializer.deserialize(Primitives.kt:66)
at kotlinx.serialization.internal.IntSerializer.deserialize(Primitives.kt:62)
at com.sksamuel.avro4k.decoder.ListDecoder.decodeSerializableValue(ListDecoder.kt:60)
at kotlinx.serialization.ElementValueDecoder.decodeSerializableElement(ElementWise.kt:142)
at kotlinx.serialization.internal.ListLikeSerializer.readElement(CollectionSerializers.kt:89)
at kotlinx.serialization.internal.ListLikeSerializer.readAll(CollectionSerializers.kt:85)
at kotlinx.serialization.internal.AbstractCollectionSerializer.patch(CollectionSerializers.kt:35)
at kotlinx.serialization.internal.AbstractCollectionSerializer.deserialize(CollectionSerializers.kt:49)
at kotlinx.serialization.Decoder$DefaultImpls.decodeSerializableValue(Coders.kt:113)
at kotlinx.serialization.ElementValueDecoder.decodeSerializableValue(ElementWise.kt:91)
at kotlinx.serialization.ElementValueDecoder.decodeSerializableElement(ElementWise.kt:142)
at S$$serializer.deserialize(set.kt)
at S$$serializer.deserialize(set.kt:5)
at kotlinx.serialization.Decoder$DefaultImpls.decodeSerializableValue(Coders.kt:113)
at kotlinx.serialization.ElementValueDecoder.decodeSerializableValue(ElementWise.kt:91)
at com.sksamuel.avro4k.Avro.fromRecord(Avro.kt:222)
at SetKt.main(set.kt:15)
at SetKt.main(set.kt)
Tested on current master.
kotlinx.serialization is moving towards its 1.0.0 release with the recently released RC.
Unfortunately, there were some incompatible changes.
As Kotlin 1.4.x requires kotlinx.serialization in at least version 1.0.0-RC, it would be nice to have avro4k compatible with this version.
Currently no tag is created during the release workflow. This should be integrated to minimize maintenance effort.
Hello there,
I'm trying to update my codebase to Kotlin 1.3.70 and ran into an issue where the newer Kotlin version needs a newer kotlinx.serialization, which then needs a new avro4k.
I then noticed that there's already commits to upgrade avro4k, but no release with them. I would like to ask for a new release with these changes :)
As usual, thanks for avro4k. And stay safe!
for example:
Avro.default.encodeToByteArray(String.serializer(), "foo")
triggers
Exception in thread "main" kotlinx.serialization.SerializationException: Non-serializable class kotlin.String is not supported by class com.sksamuel.avro4k.encoder.RootRecordEncoder encoder
at kotlinx.serialization.encoding.AbstractEncoder.encodeValue(AbstractEncoder.kt:36)
at kotlinx.serialization.encoding.AbstractEncoder.encodeString(AbstractEncoder.kt:50)
at kotlinx.serialization.internal.StringSerializer.serialize(Primitives.kt:139)
at kotlinx.serialization.internal.StringSerializer.serialize(Primitives.kt:137)
at kotlinx.serialization.encoding.Encoder$DefaultImpls.encodeSerializableValue(Encoding.kt:259)
at kotlinx.serialization.encoding.AbstractEncoder.encodeSerializableValue(AbstractEncoder.kt:18)
at com.sksamuel.avro4k.Avro.toRecord(Avro.kt:220)
at com.sksamuel.avro4k.Avro$openOutputStream$builder$1$1.invoke(Avro.kt:199)
at com.sksamuel.avro4k.Avro$openOutputStream$builder$1$1.invoke(Avro.kt:104)
at com.sksamuel.avro4k.io.AvroDataOutputStream.write(AvroDataOutputStream.kt:44)
at com.sksamuel.avro4k.Avro.encodeToByteArray(Avro.kt:179)
at io.infinitic.common.MainKt.main(main.kt:18)
Hi, I would like to use the default feature for enums that is described here : https://avro.apache.org/docs/current/spec.html#Enums.
I tried to use the AvroDefault annotation, giving it the String value of an enum value, but I get a NumberFormatException when I try that.
I'd like to use this avro feature so that schemas that use enums can be backward compatible. It seems the only way to achieve this is to give the field a default value that is one of the enum values in case a reader is using an older version that does not know about added values for that enum.
I'm pretty new to avro4k, so maybe avro4k supports this feature in a way I don't know about.
Here is what I tried:
@Serializable
enum class IngredientType { VEGGIE, MEAT, }
@Serializable
data class Ingredient(val name: String, @AvroDefault("MEAT") val type: IngredientType,
val sugar: Double, val fat: Double,)
@Serializable
data class Pizza(val name: String, val ingredients: List<Ingredient>, val vegetarian: Boolean,
val kcals: Int,)
fun main() {
val schema = Avro.default.schema(Pizza.serializer())
println(schema.toString(true))
}
This is what I get:
Exception in thread "main" java.lang.NumberFormatException: Character M is neither a decimal digit number, decimal point, nor "e" notation exponential mark.
at java.base/java.math.BigDecimal.<init>(BigDecimal.java:518)
at java.base/java.math.BigDecimal.<init>(BigDecimal.java:401)
at java.base/java.math.BigDecimal.<init>(BigDecimal.java:834)
at com.sksamuel.avro4k.schema.ClassSchemaFor.convertToAvroDefault(ClassSchemaFor.kt:135)
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:114)
at com.sksamuel.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:55)
at com.sksamuel.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:43)
at com.sksamuel.avro4k.schema.ListSchemaFor.schema(SchemaFor.kt:88)
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:76)
at com.sksamuel.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:55)
at com.sksamuel.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:43)
at com.sksamuel.avro4k.Avro.schema(Avro.kt:239)
at co.da.avro.DomainKt.main(Domain.kt:18)
at co.da.avro.DomainKt.main(Domain.kt)
If this feature is not supported in avro4k, do you plan to support it in the future?
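The stack trace above shows the default string reaching the BigDecimal constructor, which cannot parse an enum symbol. The following is a minimal sketch of a conversion that checks for an enum symbol before attempting numeric parsing. FieldKind, NumberKind, EnumKind and convertDefault are hypothetical names for illustration and are not avro4k's actual types or API.

```kotlin
import java.math.BigDecimal

// Hypothetical, simplified stand-in for a field's kind; not avro4k's real descriptors.
sealed class FieldKind
object NumberKind : FieldKind()
data class EnumKind(val symbols: List<String>) : FieldKind()

// Converts the raw @AvroDefault string according to the field's kind.
fun convertDefault(kind: FieldKind, raw: String): Any = when (kind) {
    is EnumKind -> {
        // An enum default is kept as a symbol name and must be one of the declared symbols.
        require(raw in kind.symbols) { "$raw is not one of ${kind.symbols}" }
        raw
    }
    is NumberKind -> BigDecimal(raw) // throws NumberFormatException for non-numeric input
}

fun main() {
    val ingredientType = EnumKind(listOf("VEGGIE", "MEAT"))
    println(convertDefault(ingredientType, "MEAT")) // MEAT
    println(convertDefault(NumberKind, "1.5"))      // 1.5
}
```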
The InstantSerializer does not preserve the nanosecond portion of the value, leading to lossy round-trip serialization. I discovered this by accident when switching to a new computer, which apparently supports nanosecond-precision timestamps via Instant.now().
To reproduce:
@Serializable
data class Holder(
@Serializable(with = InstantSerializer::class)
val value: Instant
)
val input = Holder(Instant.now().plusNanos(177))
val serialized = Avro.default.encodeToByteArray(Holder.serializer(), input)
val deserialized = Avro.default.decodeFromByteArray(Holder.serializer(), serialized)
println(input)
println(deserialized)
prints
Holder(value=2020-09-12T23:01:16.606000177Z)
Holder(value=2020-09-12T23:01:16.606Z)
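The same loss can be shown with the JDK alone. This is a minimal sketch, assuming the value is stored as epoch milliseconds (consistent with a timestamp-millis logical type); roundTripThroughMillis is a hypothetical helper, not the library's serializer.

```kotlin
import java.time.Instant
import java.time.temporal.ChronoUnit

// Hypothetical stand-in for a millisecond-based encode/decode round trip.
fun roundTripThroughMillis(value: Instant): Instant =
    Instant.ofEpochMilli(value.toEpochMilli())

fun main() {
    val input = Instant.parse("2020-09-12T23:01:16.606000177Z")
    val output = roundTripThroughMillis(input)
    println(input)           // 2020-09-12T23:01:16.606000177Z
    println(output)          // 2020-09-12T23:01:16.606Z
    println(input == output) // false: the sub-millisecond part is gone
    // Truncating before comparing makes a round-trip assertion stable in tests.
    println(input.truncatedTo(ChronoUnit.MILLIS) == output) // true
}
```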
Is it expected that sealed classes can not be encoded/decoded directly, but instead need to be held in a wrapper class?
See this example:
@Serializable
sealed class Car {
@Serializable
data class Camry(val year: Int): Car()
@Serializable
data class Corolla(val year: Int, val color: String): Car()
}
@Serializable
data class CarWrapper(
val car: Car
)
val car = Car.Camry(2010)
val serializer = Car.serializer()
val record = Avro.default.toRecord(serializer, car)
val wrapper = CarWrapper(car)
val wrapperSerializer = CarWrapper.serializer()
val wrappedRecord = Avro.default.toRecord(wrapperSerializer, wrapper)
println(record.schema)
println(wrappedRecord.schema)
The println calls at the end show that record.schema is the schema of the Car.Camry data class, not of the Car sealed class (I expected a union schema). By contrast, wrappedRecord.schema shows the car field as a union of the sealed class's children.
{"type":"record","name":"Camry","namespace":"test.Car","fields":[{"name":"year","type":"int"}]}
{"type":"record","name":"CarWrapper","namespace":"test","fields":[{"name":"car","type":[{"type":"record","name":"Camry","namespace":"test.Car","fields":[{"name":"year","type":"int"}]},{"type":"record","name":"Corolla","namespace":"test.Car","fields":[{"name":"year","type":"int"},{"name":"color","type":"string"}]}]}]}
From poking around the test cases a bit, it seems only a wrapped class is tested: https://github.com/sksamuel/avro4k/blob/master/avro4k-core/src/test/kotlin/com/sksamuel/avro4k/encoder/SealedClassEncoderTest.kt
Is this expected behavior?
I am trying to set the default for a big decimal field as follows:
@AvroName("myclass")
@Serializable
data class myclass(
val name: String,
@Serializable(BigDecimalSerializer::class) @AvroDefault("0") val decimal: BigDecimal? = BigDecimal.ONE
)
val schema = Avro.default.schema(myclass.serializer())
println(schema.toString())
and it is producing this error. What is the correct way to do this?
Exception in thread "main" org.apache.avro.AvroRuntimeException: Unknown datum class: class java.lang.Byte
at org.apache.avro.util.internal.JacksonUtils.toJson(JacksonUtils.java:87)
at org.apache.avro.util.internal.JacksonUtils.toJsonNode(JacksonUtils.java:48)
at org.apache.avro.Schema$Field.<init>(Schema.java:558)
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:103)
at com.sksamuel.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:37)
at com.sksamuel.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:25)
at com.sksamuel.avro4k.Avro.schema(Avro.kt:218)
When will the next avro4k version be released with kotlinx.serialization version 1.0.0-RC support?
I am blocked because I upgraded to Kotlin 1.4.
Is there a way I can pull the latest build?
I have a data class like so:
@Serializable
data class Town(val name: String = "MyTown", val population: Int)
When converting this data class to a schema and printing it:
val schema = Avro.default.schema(Town.serializer())
println(schema.toString(true))
the output is:
{
"type" : "record",
"name" : "Town",
"namespace" : "com.sksamuel.avro4k",
"fields" : [ {
"name" : "population",
"type" : "int"
} ]
}
with the Town.name field missing.
Implement support for Enum type specific "fallback defaults" that can be used to support schema evolution. If a reader does not recognize the enum value in the avro message, it will fallback to the enum default. This is not to be mistaken with the default value that can be specified for fields. See https://issues.apache.org/jira/browse/AVRO-1340 and https://avro.apache.org/docs/current/spec.html#Enums for more details.
I am using:
implementation("com.sksamuel.avro4k", "avro4k-core", "0.30.0")
I have this class:
@Serializable
data class AnalysisDto(
val id: String,
val name: String,
val organizationId: String,
val questionnaire: QuestionnaireDto,
@AvroDefault(Avro.NULL)
val source: String? = null,
@Serializable(with= StringDateTimeSerializer::class)
val createdAt: LocalDateTime,
@Serializable(with= StringDateTimeSerializer::class)
val updatedAt: LocalDateTime,
@AvroDefault(Avro.NULL)
val status: String? = null
)
When I run Avro.default.schema(AnalysisDto.serializer()), I get:
{"type":"record","name":"AnalysisDto","namespace":"com.rjf.www","fields":[..removed a bunch of fields for clarity, {"name":"status","type":["null","string"]}]}
I would have thought that, because I was using the @AvroDefault(Avro.NULL) annotation, there would be a default on the status field, i.e. {"name":"status", "default": null, "type":["null","string"]}
I have not been able to find any issues in the source after debugging through it for quite a while.
Consider the following data class definition:
@Serializable
data class BarString(
val a: String,
@AvroDefault("hello")
val b: String,
@AvroDefault("null")
val nullableString : String?,
@AvroDefault("hello")
val c:String?
)
According to the Avro specification, the correct schema would be:
{
"type": "record",
"name": "BarString",
"namespace": "com.sksamuel.avro4k.schema",
"fields": [
{
"name": "a",
"type": "string"
},
{
"name": "b",
"type": "string",
"default": "hello"
},
{
"name": "nullableString",
"type": ["null","string"],
"default" : null
},
{
"name": "c",
"type": ["string","null"],
"default": "hello"
}
]
}
Please take a look at the order within the union of the "nullableString" field. The avro null type is first. This is because default values of union types can only be an instance of the first type within the union.
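The ordering rule described above can be sketched as a small helper: when the default is null, the null branch goes first; otherwise the non-null type goes first. nullableUnionBranches is a hypothetical function for illustration and is not part of avro4k.

```kotlin
// Hypothetical helper: returns the branch order of a nullable field's union
// so that the default value is an instance of the first branch.
fun nullableUnionBranches(nonNullType: String, defaultIsNull: Boolean): List<String> =
    if (defaultIsNull) listOf("null", nonNullType) else listOf(nonNullType, "null")

fun main() {
    // Field "nullableString" with default null
    println(nullableUnionBranches("string", defaultIsNull = true))  // [null, string]
    // Field "c" with default "hello"
    println(nullableUnionBranches("string", defaultIsNull = false)) // [string, null]
}
```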
Avro4k currently fails when generating the schema:
Invalid default for field nullableString: "null" not a ["null","string"]
org.apache.avro.AvroTypeException: Invalid default for field nullableString: "null" not a ["null","string"]
at org.apache.avro.Schema.validateDefault(Schema.java:1540)
at org.apache.avro.Schema.access$500(Schema.java:87)
at org.apache.avro.Schema$Field.<init>(Schema.java:521)
at org.apache.avro.Schema$Field.<init>(Schema.java:557)
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:99)
at com.sksamuel.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:36)
at com.sksamuel.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:24)
at com.sksamuel.avro4k.Avro.schema(Avro.kt:231)
at com.sksamuel.avro4k.schema.AvroDefaultSchemaTest$1.invokeSuspend(AvroDefaultSchemaTest.kt:13)
In the latest release, v1.2.0, the code for AvroJsonOutputStream hard-codes the pretty option to true, as shown below:
@OptIn(ExperimentalSerializationApi::class)
class AvroJsonOutputStream<T>(output: OutputStream,
converter: (T) -> GenericRecord,
schema: Schema) : DefaultAvroOutputStream<T>(output, converter, schema) {
... ...
override val encoder: JsonEncoder =
EncoderFactory.get().jsonEncoder(schema, output, true) // here, set pretty option to true
}
This makes the JSON output unsuitable for data processing scenarios that require a JSON string without newlines. Because the pretty option is hard-coded to true, users have no way to change it.
Why not use a workaround such as string replacement or another JSON library? replace("\n", "").replace(" ", "") is not safe, because it can corrupt the content of JSON fields.
So I suggest making the pretty option configurable and splitting AvroEncodeFormat's Json object into a PrettyJson object and a CompactJson object, as shown below.
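To make the problem with the string-replacement workaround concrete, here is a small sketch that uses only the Kotlin standard library. `naiveCompact` and `prettyJson` are names made up for this illustration.

```kotlin
// The workaround discussed above: strip every newline and every space.
fun naiveCompact(json: String): String =
    json.replace("\n", "").replace(" ", "")

// A pretty-printed document whose string value legitimately contains a space.
val prettyJson = "{\n  \"name\" : \"Ada Lovelace\"\n}"
```

`naiveCompact(prettyJson)` returns `{"name":"AdaLovelace"}`: the layout whitespace is gone, but so is the space inside the value.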
For AvroEncodeFormat
// Before
object Json : AvroEncodeFormat() {
override fun <T> createOutputStream(
output: OutputStream,
schema: Schema,
converter: (T) -> GenericRecord
) = AvroJsonOutputStream(output, converter, schema, true)
}
// After
object PrettyJson : AvroEncodeFormat() {
override fun <T> createOutputStream(
output: OutputStream,
schema: Schema,
converter: (T) -> GenericRecord
) = AvroJsonOutputStream(output, converter, schema, true)
}
object CompactJson : AvroEncodeFormat() {
override fun <T> createOutputStream(
output: OutputStream,
schema: Schema,
converter: (T) -> GenericRecord
) = AvroJsonOutputStream(output, converter, schema, false)
}
For AvroJsonOutputStream
// Before
@OptIn(ExperimentalSerializationApi::class)
class AvroJsonOutputStream<T>(output: OutputStream,
converter: (T) -> GenericRecord,
schema: Schema) : DefaultAvroOutputStream<T>(output, converter, schema) {
... ...
override val encoder: JsonEncoder =
EncoderFactory.get().jsonEncoder(schema, output, true) // here, set pretty option to true
}
// After
@OptIn(ExperimentalSerializationApi::class)
class AvroJsonOutputStream<T>(output: OutputStream,
converter: (T) -> GenericRecord,
schema: Schema, pretty: Boolean = false) : DefaultAvroOutputStream<T>(output, converter, schema) {
... ...
override val encoder: JsonEncoder =
EncoderFactory.get().jsonEncoder(schema, output, pretty) // here, pass the configurable pretty option
}
The following self referencing data class leads to a stack overflow when trying to determine the schema for the associatedProduct field:
@Serializable data class Product(val id: String, val associatedProduct: Product?)
java.lang.StackOverflowError
at kotlin.text.StringsKt__StringsKt.findAnyOf$StringsKt__StringsKt(Strings.kt:897)
at kotlin.text.StringsKt__StringsKt.access$findAnyOf(Strings.kt:1)
at kotlin.text.StringsKt__StringsKt$rangesDelimitedBy$4.invoke(Strings.kt:1170)
at kotlin.text.StringsKt__StringsKt$rangesDelimitedBy$4.invoke(Strings.kt)
at kotlin.text.DelimitedRangesSequence$iterator$1.calcNext(Strings.kt:1098)
at kotlin.text.DelimitedRangesSequence$iterator$1.hasNext(Strings.kt:1127)
at kotlin.sequences.TransformingSequence$iterator$1.hasNext(Sequences.kt:176)
at kotlin.sequences.SequencesKt___SequencesKt.joinTo(_Sequences.kt:1882)
at kotlin.sequences.SequencesKt___SequencesKt.joinToString(_Sequences.kt:1904)
at kotlin.sequences.SequencesKt___SequencesKt.joinToString$default(_Sequences.kt:1903)
at kotlin.text.StringsKt__StringsJVMKt.replace(StringsJVM.kt:76)
at kotlin.text.StringsKt__StringsJVMKt.replace$default(StringsJVM.kt:75)
at com.sksamuel.avro4k.RecordNaming.<init>(RecordNaming.kt:22)
at com.sksamuel.avro4k.RecordNaming$Companion.invoke(RecordNaming.kt:13)
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:50)
at com.sksamuel.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:37)
at com.sksamuel.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:25)
at com.sksamuel.avro4k.schema.NullableSchemaFor.schema(SchemaFor.kt:124)
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:52)
at com.sksamuel.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:37)
at com.sksamuel.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:25)
at com.sksamuel.avro4k.schema.NullableSchemaFor.schema(SchemaFor.kt:124)
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:52)
at com.sksamuel.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:37)
at com.sksamuel.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:25)
at com.sksamuel.avro4k.schema.NullableSchemaFor.schema(SchemaFor.kt:124)
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:52)
at com.sksamuel.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:37)
at com.sksamuel.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:25)
at com.sksamuel.avro4k.schema.NullableSchemaFor.schema(SchemaFor.kt:124)
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:52)
at com.sksamuel.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:37)
at com.sksamuel.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:25)
at com.sksamuel.avro4k.schema.NullableSchemaFor.schema(SchemaFor.kt:124)
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:52)
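The trace shows ClassSchemaFor and NullableSchemaFor calling each other without end. One common way to break such a cycle is to remember which records are already being defined and to emit only the record's name on the second visit, since Avro lets a named type be referenced by name. The sketch below shows that idea on a toy model. It is not avro4k's implementation, and `RecordInfo`, `renderSchema` and `productRecords` are hypothetical names.

```kotlin
// Hypothetical, simplified model of a class: field name -> type reference.
// A trailing '?' marks a nullable reference, e.g. "Product?".
data class RecordInfo(val name: String, val fields: Map<String, String>)

fun renderSchema(
    ref: String,
    records: Map<String, RecordInfo>,
    seen: MutableSet<String> = mutableSetOf()
): String {
    // Nullable reference: emit a union with "null" first.
    if (ref.endsWith("?")) {
        return "[\"null\",${renderSchema(ref.dropLast(1), records, seen)}]"
    }
    // Not a known record (e.g. "string"): emit the type name as-is.
    val record = records[ref] ?: return "\"$ref\""
    // Record already being defined: emit only its name instead of recursing again.
    if (!seen.add(ref)) return "\"$ref\""
    val fields = record.fields.entries.joinToString(",") { (name, type) ->
        "{\"name\":\"$name\",\"type\":${renderSchema(type, records, seen)}}"
    }
    return "{\"type\":\"record\",\"name\":\"${record.name}\",\"fields\":[$fields]}"
}

// The self-referencing class from the report above.
val productRecords = mapOf(
    "Product" to RecordInfo("Product", mapOf("id" to "string", "associatedProduct" to "Product?"))
)
```

`renderSchema("Product", productRecords)` terminates and gives the associatedProduct field the type `["null","Product"]`, where the inner "Product" is a reference by name rather than a nested definition.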
Since Kotlin has the concept of sealed classes, it would be good to support union types for them, similar to what avro4s does for sealed traits.
Hello there
Thanks a lot for avro4k! I noticed that the latest two releases are missing from https://mvnrepository.com/artifact/com.sksamuel.avro4k/avro4k -- can I request that they be also published there?
Thanks, and rock on!
This class:
@Serializable
data class Test(
val meta: Map<String, ByteArray>
)
triggers this error when decoding:
Exception in thread "main" java.lang.ClassCastException: java.nio.HeapByteBuffer cannot be cast to org.apache.avro.generic.GenericArray
at com.sksamuel.avro4k.decoder.MapDecoder.beginStructure(MapDecoder.kt:100)
at kotlinx.serialization.internal.AbstractCollectionSerializer.merge(CollectionSerializers.kt:29)
at kotlinx.serialization.internal.PrimitiveArraySerializer.deserialize(CollectionSerializers.kt:179)
at kotlinx.serialization.encoding.Decoder$DefaultImpls.decodeSerializableValue(Decoding.kt:234)
at kotlinx.serialization.encoding.AbstractDecoder.decodeSerializableValue(AbstractDecoder.kt:17)
at kotlinx.serialization.encoding.AbstractDecoder.decodeSerializableValue(AbstractDecoder.kt:41)
at kotlinx.serialization.encoding.AbstractDecoder.decodeSerializableElement(AbstractDecoder.kt:63)
at kotlinx.serialization.encoding.CompositeDecoder$DefaultImpls.decodeSerializableElement$default(Decoding.kt:479)
at kotlinx.serialization.internal.MapLikeSerializer.readElement(CollectionSerializers.kt:111)
at kotlinx.serialization.internal.MapLikeSerializer.readElement(CollectionSerializers.kt:85)
at kotlinx.serialization.internal.AbstractCollectionSerializer.readElement$default(CollectionSerializers.kt:51)
at kotlinx.serialization.internal.AbstractCollectionSerializer.merge(CollectionSerializers.kt:36)
at kotlinx.serialization.internal.AbstractCollectionSerializer.deserialize(CollectionSerializers.kt:43)
at kotlinx.serialization.encoding.Decoder$DefaultImpls.decodeSerializableValue(Decoding.kt:234)
at kotlinx.serialization.encoding.AbstractDecoder.decodeSerializableValue(AbstractDecoder.kt:17)
at kotlinx.serialization.encoding.AbstractDecoder.decodeSerializableValue(AbstractDecoder.kt:41)
at kotlinx.serialization.encoding.AbstractDecoder.decodeSerializableElement(AbstractDecoder.kt:63)
at io.infinitic.common.Test$$serializer.deserialize(main.kt)
at io.infinitic.common.Test$$serializer.deserialize(main.kt:46)
at kotlinx.serialization.encoding.Decoder$DefaultImpls.decodeSerializableValue(Decoding.kt:234)
at kotlinx.serialization.encoding.AbstractDecoder.decodeSerializableValue(AbstractDecoder.kt:17)
at com.sksamuel.avro4k.Avro.fromRecord(Avro.kt:230)
at com.sksamuel.avro4k.Avro$openInputStream$builder$2.invoke(Avro.kt:165)
at com.sksamuel.avro4k.io.AvroDataInputStream.next(AvroDataInputStream.kt:27)
at com.sksamuel.avro4k.io.AvroInputStream$DefaultImpls.nextOrThrow(AvroInputStream.kt:37)
at com.sksamuel.avro4k.io.AvroDataInputStream.nextOrThrow(AvroDataInputStream.kt:9)
at com.sksamuel.avro4k.Avro.decodeFromByteArray(Avro.kt:127)
For example with
val o = Test(mapOf("foo" to "bar".toByteArray()))
val serializer = Test.serializer()
val byteArray = Avro.default.encodeToByteArray(serializer, o)
val o2 = Avro.default.decodeFromByteArray(serializer, byteArray)
This happens even though the same JSON ser/de works and a correct Avro schema is generated:
{
"type" : "record",
"name" : "Test",
"namespace" : "me",
"fields" : [ {
"name" : "meta",
"type" : {
"type" : "map",
"values" : "bytes"
}
} ]
}
Although I call the BinaryFormat implementations on an instance of Avro, it then uses the factory method data to create the AvroOutputStream, which internally falls back on Avro.default.
So no matter how I configure my Avro instance with a non-default SerialModule and additional serializers, using dump falls back to the static default.
A possible fix would be to pass the Avro instance to the stream builder, with Avro.default as the parameter's default value; the Avro#dump implementation could then pass itself to the builder.
If you confirm this issue, I'll be glad to support this great project with a PR.
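The shape of the proposed fix can be sketched as follows. These classes are hypothetical stand-ins written for this illustration (`SketchAvro`, `SketchStreamBuilder`, `moduleName`); they are not avro4k's real types or signatures.

```kotlin
// Hypothetical stand-in for the configured Avro instance.
class SketchAvro(val moduleName: String) {
    // The instance passes itself on, so its own configuration is used.
    fun dump(): String = SketchStreamBuilder(avro = this).describe()

    companion object {
        val default = SketchAvro("default-module")
    }
}

// The new parameter defaults to the static instance, so existing callers keep working.
class SketchStreamBuilder(private val avro: SketchAvro = SketchAvro.default) {
    fun describe(): String = "stream using ${avro.moduleName}"
}
```

Here `SketchAvro("custom-module").dump()` reports the custom module, while `SketchStreamBuilder().describe()` still uses the default one.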
For Example:
@Serializable
data class User(
val child: User?
)
Can be serialized using Json:
val msg = User(User(null))
println(Json.encodeToString(msg)) // {"child":{"child":null}}
But it fails to serialize using Avro:
Avro.default.encodeToByteArray(User.serializer(), msg)
throws a StackOverflowError with numerous repetitions of:
at com.sksamuel.avro4k.schema.ClassSchemaFor.buildField(ClassSchemaFor.kt:70)
at com.sksamuel.avro4k.schema.ClassSchemaFor.dataClassSchema(ClassSchemaFor.kt:55)
at com.sksamuel.avro4k.schema.ClassSchemaFor.schema(ClassSchemaFor.kt:43)
at com.sksamuel.avro4k.schema.NullableSchemaFor.schema(SchemaFor.kt:123)
The first stable release of kotlinx.serialization has been published. avro4k should be upgraded to it; this probably requires no changes or only minor ones.
https://github.com/Kotlin/kotlinx.serialization/releases/tag/v1.0.0
EDIT: Oh sorry, the versions were already upgraded in this commit: b38560f by @thake. Well, perhaps it's time for a new release of avro4k? EDIT v2: perhaps it was after this issue.
P.S. I noticed that this GitHub repository is missing tags for the latest releases, or at least for those made after the project moved to GitHub Actions. So tagging is probably missing from the workflow definitions.
I have a schema that includes the below type:
{
"name" : "operation",
"type" : [ {
"type" : "record",
"name" : "OperationOnEntry",
"fields" : [ ... ]
}, {
"type" : "enum",
"name" : "Read",
"symbols" : [ "Read" ]
} ]
}
Is it possible to represent this type in Kotlin using Avro4k? I tried
@Serializable
sealed class Operation
@Serializable
data class OperationOnEntry( ... ) : Operation()
@Serializable
object Read : Operation()
However this produces
{
"name" : "operation",
"type" : [ {
"type" : "record",
"name" : "OperationOnEntry",
"fields" : [ ... ]
}, {
"type" : "record",
"name" : "Read",
"fields" : [ ]
} ]
}
Serializing maps produces far more bytes than Protobuf and CBOR. I'm not sure whether this is due to Avro itself or to schema generation. I believe this is significant, because the main reason to use binary formats is that they can be compact and safe regardless of the user content (no need to escape anything).
@Serializable
data class X(
val v: Int,
val map: Map<String, String>
)
@OptIn(ExperimentalSerializationApi::class)
fun main() {
val data = X(1, emptyMap())
ProtoBuf.encodeToByteArray(data).let { println("Protobuf: ${it.size}") }
Avro.default.encodeToByteArray(data).let { println("Avro: ${it.size}") }
Cbor.encodeToByteArray(data).let { println("Cbor: ${it.size}") }
}
Produces:
Protobuf: 2
Avro: 221
Cbor: 11
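For comparison, the Avro specification's binary encoding writes int and long values as zigzag varints and a map as a series of counted blocks ended by a zero count. The hand-rolled sketch below follows those rules for the class X above, without any schema or header. It is an illustration only, not a replacement for Avro's own encoders, and `writeLong`, `writeString` and `encodeX` are names made up here.

```kotlin
import java.io.ByteArrayOutputStream

// Avro writes int and long values as zigzag-encoded variable-length integers.
fun writeLong(out: ByteArrayOutputStream, value: Long) {
    var n = (value shl 1) xor (value shr 63)
    while ((n and 0x7FL.inv()) != 0L) {
        out.write(((n and 0x7FL) or 0x80L).toInt())
        n = n ushr 7
    }
    out.write(n.toInt())
}

// A string is its UTF-8 byte length followed by the bytes.
fun writeString(out: ByteArrayOutputStream, s: String) {
    val bytes = s.toByteArray(Charsets.UTF_8)
    writeLong(out, bytes.size.toLong())
    out.write(bytes, 0, bytes.size)
}

// Encodes the fields of X(v, map) in declaration order, without schema or header.
fun encodeX(v: Int, map: Map<String, String>): ByteArray {
    val out = ByteArrayOutputStream()
    writeLong(out, v.toLong())
    if (map.isNotEmpty()) {
        writeLong(out, map.size.toLong()) // one block holding all entries
        for ((key, value) in map) {
            writeString(out, key)
            writeString(out, value)
        }
    }
    writeLong(out, 0L) // a zero count ends the map
    return out.toByteArray()
}
```

`encodeX(1, emptyMap())` yields two bytes (0x02, 0x00), which suggests that most of the 221 bytes are not record data. One possible explanation, which is an assumption and not confirmed in this report, is that the output also carries the schema, as Avro's container file format does.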
I want to generate a schema file. Given fields like the following
@AvroDoc("For coupons only. If the reward is a coupon, this will be the id")
val coupon_id:Int? = null,
@AvroDoc("For coupons only. The start date of the coupon.")
@Serializable(with= InstantSerializer::class)
val coupon_timestamp:Instant? = null,
@AvroDoc("For coupons only. The date when the coupon should first be displayed to the user.")
@Serializable(with= InstantSerializer::class)
val coupon_display_timestamp:Instant? = null,
I would like to add "default": null to each of these in the generated schema. Unfortunately, it appears that I can't do something like:
...
@Serializable(with= InstantSerializer::class)
@AvroDoc(value=null)
val coupon_display_timestamp:Instant? = null
Because the parser considers this to be invalid.
Is this an actual issue? Short of writing my own serializer, is there another way to accomplish the same thing?
Does avro4k support reading the default value and putting it into the Avro schema? I found an issue while experimenting with schema evolution, where a newer version couldn't read data written with an older version.
Version "0.1.0"
@Serializable
data class WebhookData(val firstName: String, val lastName: String, val age: Int? = null)
{
"type" : "record",
"name" : "WebhookData",
"namespace" : "com.shoprunner.data.webhook.model",
"fields" : [ {
"name" : "firstName",
"type" : "string"
}, {
"name" : "lastName",
"type" : "string"
}, {
"name" : "age",
"type" : [ "null", "int" ]
} ]
}
Version "0.2.0"
@Serializable
data class WebhookData(val firstName: String, val lastName: String, val age: Int? = null, val favoriteColor: String? = null)
{
"type" : "record",
"name" : "WebhookData",
"namespace" : "com.shoprunner.data.webhook.model",
"fields" : [ {
"name" : "firstName",
"type" : "string"
}, {
"name" : "lastName",
"type" : "string"
}, {
"name" : "age",
"type" : [ "null", "int" ]
}, {
"name" : "favoriteColor",
"type" : [ "null", "string" ]
} ]
}
But since default values aren't set, it can't read the older data.
How can I use null as a default value? For example:
{ "name": "foobar", "type": ["null", "string"], "default": null }
I can't figure out how to use the types mentioned in that big "Types" table. For example, this simple code
import com.sksamuel.avro4k.Avro
import kotlinx.serialization.Serializable
import java.sql.Timestamp
@Serializable
data class Example(
val time: Timestamp
)
fun main() {
println(Avro.default.schema(Example.serializer()).toString())
}
fails with an error
Serializer has not been found for type 'Timestamp'. To use context serializer as fallback, explicitly annotate type or property with @ContextualSerialization
If you can tell me how to make it work, I'll open a PR with doc improvement :)
I am trying avro4k with Apache Beam and getting this error:
java.io.NotSerializableException: micro.apps.model.Person$$serializer
Full example here: https://github.com/xmlking/micro-apps/tree/develop/apps/streaming-pipeline
I'm looking for advice on whether I am doing anything wrong!
Thanks
package micro.apps.pipeline
import com.google.common.collect.ImmutableMap
import com.sksamuel.avro4k.Avro
import java.io.Serializable
import kotlin.test.Test
import micro.apps.kbeam.map
import micro.apps.model.Person
import org.apache.avro.generic.GenericRecord
import org.apache.beam.sdk.coders.AvroCoder
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO
import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage
import org.apache.beam.sdk.io.gcp.pubsub.PubsubOptions
import org.apache.beam.sdk.testing.TestPipeline
import org.apache.beam.sdk.transforms.Create
import org.junit.Rule
class PubSubProducerTest : Serializable {
@Rule
@Transient
@JvmField
val pipeline = TestPipeline.create()
@Test
fun generateTestData() {
val options = TestPipeline.testingPipelineOptions()
options.`as`(PubsubOptions::class.java).pubsubRootUrl = "http://localhost:8085"
val serializer = Person.serializer()
val schema = Avro.default.schema(serializer)
// sample data
val records: List<GenericRecord> = listOf(
Avro.default.toRecord(serializer, Person(firstName = "sumo1", lastName = "demo1", email = "[email protected]", phone = "0000000000", age = 99)),
Avro.default.toRecord(serializer, Person(firstName = "sumo2", lastName = "demo1", email = "[email protected]", phone = "1111111111", age = 99, valid = true))
) // com.sksamuel.avro4k.ListRecord
val input = pipeline.apply(Create.of(records).withCoder(AvroCoder.of(schema)))
.map("Map Avro to Pubsub message") {
val attributes = ImmutableMap.builder<String, String>()
.put("timestamp", "")
.put("fingerprint", "fingerprint")
.put("uuid", "uuid")
.build() // Collections.emptyMap()
val genericPerson = Avro.default.fromRecord(serializer, it)
val bytes = Avro.default.dump(serializer, genericPerson)
PubsubMessage(bytes, attributes)
}
input.apply("Write Message to PubSub", PubsubIO.writeMessages().to("projects/my-project-id/topics/streaming-input"))
pipeline.run(options)
}
}
@Serializable
@AvroProp("mode", "private")
data class Person(
@AvroProp("pii", "yes") @ProtoId(1) val id: String = "",
@ProtoId(2) val firstName: String,
@ProtoId(3) val lastName: String,
@AvroProp("pii", "yes") @AvroFixed(10) @ProtoId(4) val phone: String,
@AvroProp("encrypted", "yes") @ProtoId(5) val email: String,
@AvroProp("pii", "yes") @ProtoId(6) @ProtoType(ProtoNumberType.SIGNED) val age: Int,
@Transient val valid: Boolean = false // not serialized: explicitly transient
)
unable to serialize DoFnWithExecutionInformation{doFn=micro.apps.pipeline.PubSubProducerTest$generateTestData$$inlined$map$1@7a404940, mainOutputTag=Tag<output>, sideInputMapping={}, schemaInformation=DoFnSchemaInformation{elementConverters=[]}}
java.lang.IllegalArgumentException: unable to serialize DoFnWithExecutionInformation{doFn=micro.apps.pipeline.PubSubProducerTest$generateTestData$$inlined$map$1@7a404940, mainOutputTag=Tag<output>, sideInputMapping={}, schemaInformation=DoFnSchemaInformation{elementConverters=[]}}
at org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:55)
at org.apache.beam.repackaged.direct_java.runners.core.construction.ParDoTranslation.translateDoFn(ParDoTranslation.java:654)
at org.apache.beam.repackaged.direct_java.runners.core.construction.ParDoTranslation$1.translateDoFn(ParDoTranslation.java:216)
at org.apache.beam.repackaged.direct_java.runners.core.construction.ParDoTranslation.payloadForParDoLike(ParDoTranslation.java:794)
at org.apache.beam.repackaged.direct_java.runners.core.construction.ParDoTranslation.translateParDo(ParDoTranslation.java:212)
at org.apache.beam.repackaged.direct_java.runners.core.construction.ParDoTranslation.translateParDo(ParDoTranslation.java:191)
at org.apache.beam.repackaged.direct_java.runners.core.construction.ParDoTranslation$ParDoTranslator.translate(ParDoTranslation.java:128)
at org.apache.beam.repackaged.direct_java.runners.core.construction.PTransformTranslation.toProto(PTransformTranslation.java:225)
at org.apache.beam.repackaged.direct_java.runners.core.construction.ParDoTranslation.getParDoPayload(ParDoTranslation.java:737)
at org.apache.beam.repackaged.direct_java.runners.core.construction.ParDoTranslation.isSplittable(ParDoTranslation.java:754)
at org.apache.beam.repackaged.direct_java.runners.core.construction.PTransformMatchers$6.matches(PTransformMatchers.java:266)
at org.apache.beam.sdk.Pipeline$2.visitPrimitiveTransform(Pipeline.java:284)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:665)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:463)
at org.apache.beam.sdk.Pipeline.replace(Pipeline.java:262)
at org.apache.beam.sdk.Pipeline.replaceAll(Pipeline.java:212)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:170)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
at micro.apps.pipeline.PubSubProducerTest.generateTestData(PubSubProducerTest.kt:56)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
at com.sun.proxy.$Proxy5.processTestClass(Unknown Source)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:118)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:182)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:164)
at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:412)
at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.NotSerializableException: micro.apps.model.Person$$serializer
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1185)
at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1553)
at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1553)
at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
at java.base/java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:349)
at org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:51)
... 73 more
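One likely reading of this trace is that the function passed to map holds a reference to the outer serializer value, and Java serialization fails on that field because the generated serializer is not java.io.Serializable. The standard-library sketch below reproduces the mechanism with hypothetical stand-in classes (`HelperSketch`, `CapturingFn`, `LookupFn`, `canJavaSerialize`); it involves neither Beam nor avro4k.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.NotSerializableException
import java.io.ObjectOutputStream
import java.io.Serializable

// Stand-in for a generated serializer such as Person$$serializer: not java.io.Serializable.
class HelperSketch

// Holds the helper in a field, like a closure that captured a local `serializer` value.
class CapturingFn(val helper: HelperSketch) : Serializable

// Looks the helper up when invoked, so no non-serializable state is stored.
class LookupFn : Serializable {
    fun helper(): HelperSketch = HelperSketch()
}

// Returns true if Java serialization accepts the value, false on NotSerializableException.
fun canJavaSerialize(value: Any): Boolean =
    try {
        ObjectOutputStream(ByteArrayOutputStream()).use { it.writeObject(value) }
        true
    } catch (e: NotSerializableException) {
        false
    }
```

`canJavaSerialize(CapturingFn(HelperSketch()))` is false, while `canJavaSerialize(LookupFn())` is true. In the test above, the corresponding change would be to call Person.serializer() inside the map block instead of referencing the outer value; whether that resolves the Beam error is untested here.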
It looks like Avro4s has this https://github.com/sksamuel/avro4s#field-mapping
It would help us a lot if we could use it in Avro4k.
Hi @sksamuel ,
today I'm trying to upgrade my project's Kotlin version to 1.4, and avro4k is throwing an error on initialization.
I have been doing some research into kotlinx.serialization and avro4k, but I still haven't fully understood the error. Can you help me?
Basically, when the Avro class is initialized, a crash happens:
val user = Avro.default.fromRecord(UserMessageDto.serializer(), record)
and the following error
Exception in thread "consumerContext" java.lang.NoClassDefFoundError: Could not initialize class com.sksamuel.avro4k.Avro
at com.quick.tor.kafka.consumer.UserRegisterKafkaConsumerKt$consumerInsertUser$2.invokeSuspend(UserRegisterKafkaConsumer.kt:34)
at com.quick.tor.kafka.consumer.UserRegisterKafkaConsumerKt$consumerInsertUser$2.invoke(UserRegisterKafkaConsumer.kt)
at com.quick.tor.infrastructure.consumer.KafkaConsumerKt.clientConsumer(KafkaConsumer.kt:31)
at com.quick.tor.infrastructure.consumer.KafkaConsumerKt.clientConsumer$default(KafkaConsumer.kt:21)
at com.quick.tor.kafka.consumer.UserRegisterKafkaConsumerKt.consumerInsertUser(UserRegisterKafkaConsumer.kt:18)
at com.quick.tor.kafka.KafkaConsumerModuleKt$installKafkaConsumers$1.invokeSuspend(KafkaConsumerModule.kt:19)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:830)
I've also tried creating the Avro class without Avro.default (for debugging), but it gets the same error.
// exactly the same Avro.default code in companion object
val simpleModule = SerializersModule {
mapOf(
UUID::class to UUIDSerializer()
)
}
Avro(simpleModule)
my dependencies versions
ktor_version = '1.4.1'
kotlin_version = '1.4.0'
kotlinx_serialization_version = '1.4.10'
serialization_version = '1.0-M1-1.4.0-rc'
avro4k_version = '0.40.0.12-SNAPSHOT'
gson_version = '2.8.6'
// ...
// infrastructure
kafka_clients = '2.6.0'
kafka_avro_serializer = '5.3.0'
I honestly don't know exactly where else to look to understand the error.
Can you give me a hand?
Is there anything that I can do to help?
It works normally with Kotlin 1.3.72. Thanks for working on the library.
The project implementation here.
Thanks!