
Comments (7)

kkrugler commented on July 28, 2024

Hi Chris,

thanks - I'm hoping Vivek will have some time to merge this in.

BTW, I'd like to move the cascading.avro repo from the bixolabs account over to the Scale Unlimited organization, since it's the only project left in bixolabs (I've migrated everything else).

Would post-merge be a good time to do this?

Thanks,

-- Ken

On Jan 23, 2013, at 12:14am, Chris Severs wrote:

I updated the 2.1 branch with some bugfixes and changed the unpacked api to use a different scheme for clarity. I think this should be version 2.1.1 and we should put a new jar up on conjars.

Apologies for the diffs, new IDE and some cut/paste seems to have moved everything around.

You can merge this Pull Request by running

git pull https://github.com/bixolabs/cascading.avro 2.1-bugfix

Or view, comment on, or merge it at:

https://github.com/bixolabs/cascading.avro/pull/18

Commit Summary

Fixed bug where Avro was setting shuffle information when used for input only
Making versions match on the bug branch
Added much better comments, fixed the 'record' field hardcoding in the case of packUnpack being false
Fixed issue with not getting schema from a path
Backported some changes/bugfixes from the 2.2-wip
Fixed PackedAvroScheme to use Fields.FIRST for source/sink.

File Changes

M maven-plugin/pom.xml (2)
M scheme/pom.xml (2)
M scheme/src/main/java/cascading/avro/AvroScheme.java (564)
A scheme/src/main/java/cascading/avro/PackedAvroScheme.java (109)
M scheme/src/test/java/cascading/avro/AvroSchemeTest.java (1254)
A scheme/src/test/resources/cascading/avro/test6.avsc (9)
A scheme/src/test/resources/cascading/avro/words.txt (2)

Patch Links:

https://github.com/bixolabs/cascading.avro/pull/18.patch
https://github.com/bixolabs/cascading.avro/pull/18.diff


Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr

from cascading.avro.

ccsevers commented on July 28, 2024

No objections here.

I'm almost done with the full schema generation in 2.2 as well. I just need to add a few more custom coercible type wrappers for the Avro types. I think this will work nicely for the normal fields-based API too, since the wrappers for map and list let you specify their contents, so the schema generation can do the right thing.


kkrugler commented on July 28, 2024


How might this change with Cascading 2.2 support for type information in Fields?

-- Ken


ccsevers commented on July 28, 2024

Instead of passing [int.class, long.class, Map.class, String.class, List.class, double.class], you can do something like:
AvroCoercibleMap myStringMap = new AvroCoercibleMap(Schema.Type.STRING);
AvroCoercibleList myDoubleList = new AvroCoercibleList(Schema.Type.DOUBLE);

Then in the fields API, for the class array (which needs to become a Type array instead), you pass:
[int.class, long.class, myStringMap, myDoubleList]

CoercibleType extends java.lang.reflect.Type, so this all works. The coercions can also be used to give back either the Map or the Tuple representation, depending on what you ask for.

Then we can schema gen from that using the same function that does a schema gen from the cascading Fields.
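To make the idea concrete, here is a minimal, self-contained sketch of schema generation driven by a Type array. Everything in it is an assumption for illustration: SketchMapType, SketchListType, and SchemaGenSketch are hypothetical stand-ins, not classes from cascading.avro, and the schema is assembled as plain JSON text instead of going through Avro's Schema API. The only point is that a wrapper type can carry the element type, so the generator knows what a map or list contains.

```java
import java.lang.reflect.Type;
import java.util.StringJoiner;

// Hypothetical stand-in for the proposed AvroCoercibleMap: a Type that
// remembers the Avro type name of the map's values.
final class SketchMapType implements Type {
    final String valueType;

    SketchMapType(String valueType) {
        this.valueType = valueType;
    }
}

// Hypothetical stand-in for the proposed AvroCoercibleList: a Type that
// remembers the Avro type name of the list's items.
final class SketchListType implements Type {
    final String itemType;

    SketchListType(String itemType) {
        this.itemType = itemType;
    }
}

final class SchemaGenSketch {
    // Maps one field type to an Avro schema fragment, written as JSON text.
    static String avroTypeOf(Type type) {
        if (type == int.class || type == Integer.class) return "\"int\"";
        if (type == long.class || type == Long.class) return "\"long\"";
        if (type == double.class || type == Double.class) return "\"double\"";
        if (type == String.class) return "\"string\"";
        if (type instanceof SketchMapType) {
            return "{\"type\":\"map\",\"values\":\"" + ((SketchMapType) type).valueType + "\"}";
        }
        if (type instanceof SketchListType) {
            return "{\"type\":\"array\",\"items\":\"" + ((SketchListType) type).itemType + "\"}";
        }
        throw new IllegalArgumentException("Unsupported field type: " + type);
    }

    // Builds a record schema from parallel arrays of field names and types.
    static String generateSchema(String recordName, String[] names, Type[] types) {
        if (names.length != types.length) {
            throw new IllegalArgumentException("names and types must have the same length");
        }
        StringJoiner fields = new StringJoiner(",", "[", "]");
        for (int i = 0; i < names.length; i++) {
            fields.add("{\"name\":\"" + names[i] + "\",\"type\":" + avroTypeOf(types[i]) + "}");
        }
        return "{\"type\":\"record\",\"name\":\"" + recordName + "\",\"fields\":" + fields + "}";
    }

    public static void main(String[] args) {
        Type[] types = {int.class, long.class, new SketchMapType("string"), new SketchListType("double")};
        String[] names = {"id", "count", "labels", "scores"};
        System.out.println(generateSchema("Example", names, types));
    }
}
```

With a plain Class[] the generator would only see Map.class or List.class and could not tell what the values are; the wrapper types are what make the map and array fragments possible.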

In fact, we should also add a new entry point that takes a Fields object with types attached, instead of a Fields and a Class[]. The current (Fields, Class[]) constructor would then attach the class info to the Fields and call the new constructor.
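A minimal sketch of that constructor delegation, under stated assumptions: TypedFieldsSketch is a hypothetical stand-in for a Cascading Fields object that carries types, and DelegatingSchemeSketch is a hypothetical scheme that shows only the two entry points, not any real AvroScheme behavior.

```java
import java.lang.reflect.Type;
import java.util.Arrays;

// Hypothetical stand-in for a Fields object with types attached.
final class TypedFieldsSketch {
    final String[] names;
    final Type[] types;

    TypedFieldsSketch(String[] names, Type[] types) {
        if (names.length != types.length) {
            throw new IllegalArgumentException("names and types must have the same length");
        }
        this.names = names.clone();
        // Copy into a real Type[] even if the caller passed a Class[].
        this.types = Arrays.copyOf(types, types.length, Type[].class);
    }
}

// Hypothetical scheme showing only the constructor delegation.
final class DelegatingSchemeSketch {
    final TypedFieldsSketch fields;

    // New entry point: the type information travels with the fields.
    DelegatingSchemeSketch(TypedFieldsSketch fields) {
        this.fields = fields;
    }

    // Existing-style entry point: attach the class info to the fields,
    // then hand off to the new constructor.
    DelegatingSchemeSketch(String[] names, Class<?>[] classes) {
        this(new TypedFieldsSketch(names, classes));
    }
}
```

Because Class implements java.lang.reflect.Type, a Class[] can be passed where a Type[] is expected, so existing callers would keep working while new callers can mix in wrapper types.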


Chris


kkrugler commented on July 28, 2024

This is a port of https://github.com/bixolabs/cascading.avro/pull/18

Chris - can it be closed out?


kkrugler commented on July 28, 2024

Hi Chris - can this be closed now?


ccsevers commented on July 28, 2024

Yes definitely. Closing now.
