Comments (7)
Hi Chris,
thanks - I'm hoping Vivek will have some time to merge this in.
BTW, I'd like to move the cascading.avro repo from the bixolabs account over to the Scale Unlimited organization, since it's the only project left in bixolabs (I've migrated everything else).
Would post-merge be a good time to do this?
Thanks,
-- Ken
On Jan 23, 2013, at 12:14am, Chris Severs wrote:
I updated the 2.1 branch with some bugfixes and changed the unpacked API to use a different scheme for clarity. I think this should be version 2.1.1 and we should put a new jar up on conjars.
Apologies for the noisy diffs; a new IDE and some cut/paste seem to have moved everything around.
You can merge this Pull Request by running
git pull https://github.com/bixolabs/cascading.avro 2.1-bugfix
Or view, comment on, or merge it at: https://github.com/bixolabs/cascading.avro/pull/18
Commit Summary
Fixed bug where Avro was setting shuffle information when used for input only
Making versions match on the bug branch
Added much better comments, fixed the 'record' field hardcoding in the case of packUnpack being false
Fixed issue with not getting schema from a path
Backported some changes/bugfixes from the 2.2-wip
Fixed PackedAvroScheme to use Fields.FIRST for source/sink.
File Changes
M maven-plugin/pom.xml (2)
M scheme/pom.xml (2)
M scheme/src/main/java/cascading/avro/AvroScheme.java (564)
A scheme/src/main/java/cascading/avro/PackedAvroScheme.java (109)
M scheme/src/test/java/cascading/avro/AvroSchemeTest.java (1254)
A scheme/src/test/resources/cascading/avro/test6.avsc (9)
A scheme/src/test/resources/cascading/avro/words.txt (2)
Patch Links:
https://github.com/bixolabs/cascading.avro/pull/18.patch
https://github.com/bixolabs/cascading.avro/pull/18.diff
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
from cascading.avro.
No objections here.
I'm almost done with the full schema gen in 2.2 as well. I just need to put in some more custom coercible type wrappers for the Avro types. I think this will work nicely for the normal fields-based API as well, since the wrappers for map and list allow you to specify the contents so the schema gen can do the right thing.
from cascading.avro.
On Jan 23, 2013, at 10:45am, Chris Severs wrote:
No objections here.
I'm almost done with the full schema gen in 2.2 as well. Just need to put in some more custom coercible type wrappers for the avro types. I think this will work nicely for the normal fields based API as well since the wrappers for map and list allow you to specify the contents so the schema gen can do the right thing.
How might this change with Cascading 2.2 support for type information in Fields?
-- Ken
from cascading.avro.
Instead of passing [int.class, long.class, Map.class, String.class, List.class, double.class] you can do something like:
AvroCoercibleMap myStringMap = new AvroCoercibleMap(Schema.Type.STRING);
AvroCoercibleList myDoubleList = new AvroCoercibleList(Schema.Type.DOUBLE);
Then, in the fields API, you pass the following in place of the class array (which needs to become a Type array instead):
[int.class, long.class, myStringMap, myDoubleList]
CoercibleType implements Type so this all works. The coercions can also be used to give back either the Map or Tuple representation depending on what you ask for.
Then we can schema gen from that using the same function that does a schema gen from the cascading Fields.
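To make the wrapper idea concrete, here is a minimal self-contained sketch. It does not use the real Cascading or Avro classes: CoercibleMapSketch, CoercibleListSketch and SchemaGenSketch are hypothetical names invented for illustration, and the element type is carried as a plain Avro type name instead of a Schema.Type. It only shows how a wrapper that remembers its contents lets a schema generator emit a complete map or array type.

```java
import java.lang.reflect.Type;

/** Hypothetical wrapper: a map type that remembers the Avro type of its values. */
final class CoercibleMapSketch implements Type {
    final String valueType; // Avro primitive name, e.g. "string"

    CoercibleMapSketch(String valueType) {
        this.valueType = valueType;
    }
}

/** Hypothetical wrapper: a list type that remembers the Avro type of its items. */
final class CoercibleListSketch implements Type {
    final String itemType; // Avro primitive name, e.g. "double"

    CoercibleListSketch(String itemType) {
        this.itemType = itemType;
    }
}

/** Illustrative schema generator; not the cascading.avro implementation. */
final class SchemaGenSketch {
    /** Maps one field type to an Avro schema fragment in JSON. */
    static String avroTypeOf(Type type) {
        if (type == int.class || type == Integer.class) return "\"int\"";
        if (type == long.class || type == Long.class) return "\"long\"";
        if (type == double.class || type == Double.class) return "\"double\"";
        if (type == String.class) return "\"string\"";
        if (type instanceof CoercibleMapSketch) {
            return "{\"type\":\"map\",\"values\":\""
                    + ((CoercibleMapSketch) type).valueType + "\"}";
        }
        if (type instanceof CoercibleListSketch) {
            return "{\"type\":\"array\",\"items\":\""
                    + ((CoercibleListSketch) type).itemType + "\"}";
        }
        // A bare Map.class or List.class lands here: the contents are unknown.
        throw new IllegalArgumentException("No Avro mapping for " + type);
    }

    /** Builds a record schema from parallel arrays of field names and types. */
    static String recordSchema(String recordName, String[] names, Type[] types) {
        if (names.length != types.length) {
            throw new IllegalArgumentException("names and types must have the same length");
        }
        StringBuilder json = new StringBuilder();
        json.append("{\"type\":\"record\",\"name\":\"").append(recordName)
            .append("\",\"fields\":[");
        for (int i = 0; i < names.length; i++) {
            if (i > 0) json.append(',');
            json.append("{\"name\":\"").append(names[i])
                .append("\",\"type\":").append(avroTypeOf(types[i])).append('}');
        }
        return json.append("]}").toString();
    }
}
```

Passing a Type array such as { int.class, new CoercibleMapSketch("string") } yields a record whose second field is a map of strings, which a plain Map.class could not express.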
In fact we should just make a new entry point as well that takes a Fields object with types attached instead of a Fields and a Class[]. Then the current Fields, Class[] constructor will append the class info to the Fields and call the new entry constructor.
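The delegating entry point could look roughly like the sketch below. TypedFieldsSketch stands in for a Cascading Fields object with types attached and AvroSchemeSketch for the scheme; both are hypothetical, self-contained stand-ins, not the real classes or their signatures.

```java
import java.lang.reflect.Type;

/** Hypothetical stand-in for a Fields object that carries type information. */
final class TypedFieldsSketch {
    final String[] names;
    final Type[] types;

    TypedFieldsSketch(String[] names, Type[] types) {
        if (names.length != types.length) {
            throw new IllegalArgumentException("names and types must have the same length");
        }
        this.names = names.clone();
        this.types = types.clone();
    }
}

/** Hypothetical scheme showing only the constructor delegation. */
final class AvroSchemeSketch {
    final TypedFieldsSketch fields;

    /** New entry point: field names and types arrive together. */
    AvroSchemeSketch(TypedFieldsSketch fields) {
        this.fields = fields;
    }

    /** Existing-style entry point: attach the classes to the names, then delegate. */
    AvroSchemeSketch(String[] names, Class<?>[] classes) {
        // Class implements Type, so a Class<?>[] can be passed where Type[] is expected.
        this(new TypedFieldsSketch(names, classes));
    }
}
```

With this shape, schema generation only ever has to handle the typed form, and callers of the older two-argument constructor keep working unchanged.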
Chris
from cascading.avro.
This is a port of https://github.com/bixolabs/cascading.avro/pull/18
Chris - can it be closed out?
from cascading.avro.
Hi Chris - can this be closed now?
from cascading.avro.
Yes definitely. Closing now.
from cascading.avro.