Merge mango types with Keith Turner's lexicoders. about mango HOT 9 CLOSED

cjnolet commented on August 22, 2024

Merge mango types with Keith Turner's lexicoders.

from mango.

Comments (9)

eawagner commented on August 22, 2024

I am in favor of breaking up the way types are handled. For Accumulo normalization, I think this is a great idea.

For things such as converting to strings for JSON representation, we need another way. Currently the two are linked in the Types normalizers.

Unless these get broken up, I am not comfortable adding an extra third-party dependency simply to use the Types APIs.

from mango.

cjnolet commented on August 22, 2024

Seems like one is for normalization and one is for lexicoding. I don't mind proposing a design change and breaking these up.

from mango.

eawagner commented on August 22, 2024

OK, looking at type normalization, we have six main functions on a TypeNormalizer.

resolves -> the class it handles
getAlias -> short name for the class
asString -> readable representation (used only for json)
fromString -> inverse of asString
normalize -> lexicographically sortable representation (used only for Accumulo)
denormalize -> inverse of normalize

It seems to me that if we come up with a more generic name for asString/normalize and fromString/denormalize, we can have a common types interface that can back normalizers, encoders, serializers.

Maybe something along the lines of
resolves -> the class it handles
getAlias -> short name for the class
encode -> some string representation
decode -> inverse of encode.

Then have implementations for each type specified for serialization and another set for accumulo using Lexicoders or current normalization (or both).
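To make the proposal concrete, here is a minimal sketch of what such a common interface might look like, with one readable implementation and one sortable implementation for `Long`. The names `TypeEncoder`, `LongStringEncoder`, and `LongSortableEncoder` are hypothetical and are not taken from mango; the sortable encoding (flip the sign bit, then zero-padded hex) is only an illustrative stand-in for whatever normalization or lexicoder would actually be used.

```java
import java.util.Objects;

/** Hypothetical common types interface, following the four functions proposed above. */
interface TypeEncoder<T> {
    Class<T> resolves();       // the class it handles
    String getAlias();         // short name for the class
    String encode(T value);    // some string representation
    T decode(String encoded);  // inverse of encode
}

/** Readable representation, e.g. for JSON serialization. */
final class LongStringEncoder implements TypeEncoder<Long> {
    public Class<Long> resolves() { return Long.class; }
    public String getAlias() { return "long"; }
    public String encode(Long value) { return Long.toString(Objects.requireNonNull(value)); }
    public Long decode(String encoded) { return Long.parseLong(encoded); }
}

/**
 * Sortable representation (assumption, for illustration only): flipping the
 * sign bit makes negative values order before positive ones, and fixed-width
 * hex makes string order match numeric order.
 */
final class LongSortableEncoder implements TypeEncoder<Long> {
    public Class<Long> resolves() { return Long.class; }
    public String getAlias() { return "long"; }
    public String encode(Long value) {
        return String.format("%016x", value ^ Long.MIN_VALUE);
    }
    public Long decode(String encoded) {
        return Long.parseUnsignedLong(encoded, 16) ^ Long.MIN_VALUE;
    }
}
```

Both implementations share `resolves` and `getAlias`, so a registry could look up either set by class or alias and the caller would choose which set to use for serialization versus storage.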

from mango.

cjnolet commented on August 22, 2024

+1 Per IM conversation with Edward Wagner:

We should probably use bytes for the lexicographically sortable encoders and use strings for the pretty-print encoders. Probably could use generics to type the implementation of the encoder to string or byte array.
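A rough sketch of the generics idea: the encoder is parameterized on both the value type and the encoded form, so a sortable encoder can produce `byte[]` while a pretty-print encoder produces `String`. The names `Encoder`, `LongBytesEncoder`, `LongTextEncoder`, and `UnsignedBytes` are hypothetical, and the byte encoding shown (sign bit flipped, 8 bytes big-endian) is an assumption chosen for illustration, not the actual lexicoder implementation.

```java
import java.nio.ByteBuffer;

/** Hypothetical encoder typed on the value (T) and the encoded form (E). */
interface Encoder<T, E> {
    E encode(T value);
    T decode(E encoded);
}

/** Sortable encoder: fixed 8 bytes, big-endian, sign bit flipped. */
final class LongBytesEncoder implements Encoder<Long, byte[]> {
    public byte[] encode(Long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value ^ Long.MIN_VALUE).array();
    }
    public Long decode(byte[] encoded) {
        return ByteBuffer.wrap(encoded).getLong() ^ Long.MIN_VALUE;
    }
}

/** Pretty-print encoder: plain decimal text. */
final class LongTextEncoder implements Encoder<Long, String> {
    public String encode(Long value) { return Long.toString(value); }
    public Long decode(String encoded) { return Long.parseLong(encoded); }
}

final class UnsignedBytes {
    /** Compares byte arrays as unsigned bytes, left to right. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return Integer.compare(a.length, b.length);
    }
}
```

With the encoded type in the signature, code that builds rows can require an `Encoder<T, byte[]>` and code that renders JSON can require an `Encoder<T, String>`, so the two concerns cannot be mixed up by accident.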

from mango.

eawagner commented on August 22, 2024

Opened Issue #27 to discuss and track the changes to the types API.

from mango.

eawagner commented on August 22, 2024

One issue that I have found using the lexicoders is more of a recipes problem. We will need to be more careful with how we construct rows.

Currently we do something like longval + "\u0000" + longval, and to parse it we simply do a string split. That logic doesn't work with the lexicoders, because an encoded long will contain a "\u0000" byte whenever one of its bytes has no set bits.

This simply means that we need to be more careful when parsing rows: for example, extract 8 bytes for the first long, then another 8 for the next.
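The following sketch illustrates the point under the assumption that each long is encoded as exactly 8 bytes (sign bit flipped, big-endian). The class `FixedWidthRow` and its methods are hypothetical. A variable-width encoding would need length information or escaping instead of fixed offsets.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper for rows made of two fixed-width encoded longs. */
final class FixedWidthRow {
    /** Concatenates two encoded longs with no delimiter. */
    static byte[] buildRow(long first, long second) {
        return ByteBuffer.allocate(2 * Long.BYTES)
                .putLong(first ^ Long.MIN_VALUE)
                .putLong(second ^ Long.MIN_VALUE)
                .array();
    }

    /** Parses by position: bytes 0-7 are the first long, bytes 8-15 the second. */
    static long[] parseRow(byte[] row) {
        if (row.length != 2 * Long.BYTES) {
            throw new IllegalArgumentException("expected 16 bytes, got " + row.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(row);
        long first = buf.getLong() ^ Long.MIN_VALUE;
        long second = buf.getLong() ^ Long.MIN_VALUE;
        return new long[] { first, second };
    }

    /** Shows why a 0x00 delimiter is ambiguous: encoded values can contain it. */
    static boolean containsZeroByte(byte[] bytes) {
        for (byte b : bytes) {
            if (b == 0) return true;
        }
        return false;
    }
}
```

For example, the value 1 encodes to 0x80 followed by six 0x00 bytes and a final 0x01, so splitting on "\u0000" would cut the value apart, while reading at fixed offsets recovers both longs.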

Just something to keep in mind, but I still want to include this functionality at some point.

from mango.

cjnolet commented on August 22, 2024

Yeah, but at the same time, it would also be more efficient to split the strings at known indexes when possible, rather than having an algorithm (like StringUtils) do a potentially O(n) search through the string.

from mango.

eawagner commented on August 22, 2024

Now that these are actually included in Accumulo, does this still make sense to do?

Especially in mango?

from mango.

cjnolet commented on August 22, 2024

Can we close this? I'm in agreement that I don't think we need it.

from mango.
