Comments (4)
I see a similar question in #307, but it seems to have been closed without an answer.
This works with `toBuffer`/`fromBuffer`, but fails when I use `BlockEncoder` (I will submit a test case later). It seems I need to override `_resolve()`, but how do I implement it? I see a lot of test cases using `toBuffer`/`fromBuffer`, but none using `BlockEncoder` with logical types.

It would be nice if I could just:

- override `StringType` and add attributes, but only `LogicalType` supports attributes, or
- provide the exact JSON schema I want and have the library write it verbatim, perhaps with a `rawSchema` flag.
```ts
import {Type, types} from 'avsc';

// Logical type that stores JSON values as Avro strings and exports the
// extra `sqlType` attribute alongside the schema.
class GoogleJson extends types.LogicalType {
  _export(attrs) {
    attrs.sqlType = 'JSON';
  }
  _toValue(input) {
    return JSON.stringify(input);
  }
  _fromValue(input) {
    return JSON.parse(input);
  }
}

const schema = {
  name: 'Thing',
  type: 'record',
  fields: [
    {name: 'amount', type: 'int'},
    {name: 'calc', type: {type: 'string', logicalType: 'google-json'}}
  ]
};

const thingAvroType = Type.forSchema(
  // @ts-ignore
  schema,
  {logicalTypes: {'google-json': GoogleJson}}
);

describe('GoogleJson', () => {
  it('buffer', async () => {
    const thing = {
      amount: 32,
      calc: {a: 1, b: 2}
    };
    const buf = thingAvroType.toBuffer(thing);
    const thing2 = thingAvroType.fromBuffer(buf);
    expect(thing2).toMatchObject(thing);
  });
});
```
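For reference, roughly the shape of the usage that fails for me (the file name and records here are illustrative, not my actual test case):

```ts
import * as fs from 'fs';
import {streams} from 'avsc';

// Illustrative only: streaming records of the logical type through a
// BlockEncoder; this is where things break for me.
const encoder = new streams.BlockEncoder(thingAvroType);
encoder.pipe(fs.createWriteStream('things.avro'));
encoder.write({amount: 32, calc: {a: 1, b: 2}});
encoder.end();
```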
Hi @buzzware. The best option with decorated schemas is typically to keep a copy alongside the generated type and reference it directly when the custom attributes are needed. However, this doesn't work well with `BlockEncoder`s, which currently expect a single schema or type argument. I think it would be worth extending the `BlockEncoder` API to better support this, for example via an additional `schema` option. In the meantime, here are a couple of workarounds:

- If you don't need any type options when writing records, you can pass the raw schema directly to the `BlockEncoder`, which will then be written as-is (a short sketch follows the snippet below).
- If you do need type options, it's a bit trickier, but you can use two encoders, where the first writes only the header. Something like:
```js
const crypto = require('crypto');
const {pipeline} = require('node:stream/promises');
const {streams: {BlockEncoder}} = require('avsc');

async function pipedBlockEncoder(type, schema, writable) {
  const syncMarker = crypto.randomBytes(16);
  // Header-only encoder (note the schema argument).
  const prelude = new BlockEncoder(schema, {writeHeader: true, syncMarker});
  prelude.end();
  await pipeline(prelude, writable, {end: false});
  // Data encoder (we pass in the type here, not the schema); the shared
  // sync marker keeps the header and the data blocks consistent.
  const content = new BlockEncoder(type, {writeHeader: false, syncMarker});
  content.pipe(writable);
  return content;
}
```
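And a minimal sketch of the first workaround (file name illustrative). Since the raw schema is passed, no logical types are attached, so values must already match the underlying Avro types, e.g. pre-stringified JSON:

```ts
import * as fs from 'fs';
import {streams} from 'avsc';

// `schema` is the plain JSON schema object, custom attributes included;
// the encoder writes it into the file header verbatim.
const encoder = new streams.BlockEncoder(schema);
encoder.pipe(fs.createWriteStream('things.avro'));
encoder.write({amount: 32, calc: JSON.stringify({a: 1, b: 2})});
encoder.end();
```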
Thanks @mtth, I just got it working for the first time with that, including an import into BigQuery with an auto-generated JSON column.
```ts
import {Schema, streams, Type} from 'avsc';
import {randomBytes} from 'crypto';
import {finished, pipeline} from 'node:stream/promises';
import fs = require('fs');
import BlockEncoder = streams.BlockEncoder;

describe('GoogleJson', () => {
  async function pipedBlockEncoder(type, schema, writable) {
    const syncMarker = randomBytes(16);
    // Header-only encoder (note the schema argument).
    const prelude = new BlockEncoder(schema, {writeHeader: true, syncMarker});
    prelude.end();
    await pipeline(prelude, writable, {end: false});
    // Data encoder (we pass in the type here, not the schema).
    const content = new BlockEncoder(type, {writeHeader: false, syncMarker});
    content.pipe(writable);
    return content;
  }

  it('mtth file example', async () => {
    const thing = {
      amount: 32,
      calc: JSON.stringify({a: 1, b: 2})
    };
    const testFile = '/Users/gary/Downloads/avro_test.avro';
    fs.rmSync(testFile, {force: true});
    const schema: Schema = {
      name: 'Thing',
      type: 'record',
      fields: [
        {name: 'amount', type: 'int'},
        {name: 'calc', type: {type: 'string', sqlType: 'JSON'}}
      ]
    };
    const type = Type.forSchema(schema);
    const writable = fs.createWriteStream(testFile, {encoding: 'binary'});
    const encoder = await pipedBlockEncoder(type, schema, writable);
    encoder.write(thing);
    encoder.write(thing);
    encoder.write(thing);
    encoder.end();
    await finished(writable);
    console.log('end');
  });
});
```
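For completeness, a hedged read-back sketch (assuming `createFileDecoder`'s `metadata` event, which exposes the raw file header) to confirm the `sqlType` attribute survives:

```ts
import {createFileDecoder} from 'avsc';

// Decode the file and inspect the writer schema embedded in the header.
const decoder = createFileDecoder(testFile);
decoder.on('metadata', (type, codec, header) => {
  // header.meta['avro.schema'] is a Buffer holding the schema exactly as
  // written, custom attributes (e.g. sqlType) included.
  console.log(JSON.parse(header.meta['avro.schema'].toString()));
});
decoder.on('data', (record) => console.log(record));
```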