Comments (8)

visr commented on June 24, 2024

"The first step is definitely changing MetaPoint into Meta{Point} to have a Meta{Any} type."

This sounds like a good move to me. It would be breaking, though I guess we could temporarily define const MetaPoint = Meta{Point}. But if we are going to make changes to Meta it would be good to tackle #48 as well.

EDIT: probably the name Meta is a bad idea though, since that is already a defined module.
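
A minimal sketch of the parametric-wrapper idea in plain Julia, without GeometryBasics. The names `WithMeta`, `withmeta`, `widen_geometry`, `Pt` and `Poly` are hypothetical stand-ins and not part of any package; the wrapper is deliberately not called `Meta`, since `Base.Meta` is already a module.

```julia
# Stand-in geometry types; real code would use GeometryBasics.Point and Polygon.
struct Pt
    x::Int
    y::Int
end

struct Poly
    exterior::Vector{Pt}
end

# Hypothetical parametric wrapper: G is the geometry type, M the metadata type.
struct WithMeta{G, M<:NamedTuple}
    main::G
    meta::M
end

# Convenience constructor that collects keyword arguments into a NamedTuple.
withmeta(geom; kwargs...) = WithMeta(geom, (; kwargs...))

# Temporary alias so that code written against the old name keeps working.
const MetaPoint = WithMeta{Pt}

# Rewrap an element so that its geometry parameter is `Any`.
widen_geometry(f::WithMeta{G, M}) where {G, M} = WithMeta{Any, M}(f.main, f.meta)

p = withmeta(Pt(3, 1); city="Abuja", rainfall=1221.2)
q = withmeta(Poly([Pt(3, 1), Pt(4, 4), Pt(2, 4), Pt(3, 1)]); city="Borongan", rainfall=4114.0)

# Heterogeneous geometries, one concrete element type with typed metadata.
MetaT = typeof(p.meta)
features = [widen_geometry(p), widen_geometry(q)]
```

With a single type constructor, the heterogeneous case only changes the first parameter, so the metadata stays concretely typed.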

from geometrybasics.jl.

piever commented on June 24, 2024

Came here from JuliaArrays/StructArrays.jl#135 (cc: @Sov-trotter).
I'm slowly wrapping my head around this use case, and I realize that what I said there may be a bit misleading: you do not want to do the "struct of arrays, array of structs" transform on the geometries themselves, but just store an array of custom structs (possibly heterogeneous).

In that case, you can use StructArray just like any other table, so the following would work:

StructArray(
    geometry=[Point(3, 1), Polygon(Point{2, Int}[(3, 1), (4, 4), (2, 4), (1, 2), (3, 1)])],
    city=["Abuja", "Borongan"],
    rainfall=[1221.2, 4114.0],
)

GeometryBasics uses custom types (MetaPoint, MetaPolygon), so I guess one should change to a Meta{T} type so that in the heterogeneous case the overall array can be of eltype Meta{Any}. We'd need to check that collect_structarray (which already implements widening as needed) widens correctly to Meta{Any}. Otherwise, I think I could finally add the option to do "custom widening" upon collection: I've wanted that for a while and it should be relatively straightforward.

Sov-trotter commented on June 24, 2024

Yeah! The above method works. We have done a similar working implementation here.
There's one problem with this approach. We construct the geometries up front (in the Base.read() methods in Shapefile.jl), then have to break them apart with meta() (metadata only) and MetaFree() (geometries only) before putting them into a StructArray. What we actually want is to put intact meta-geometries (metadata + geometries) into the StructArray as a vector.

@visr pointed out that breaking down "meta-geometries" this way (so that they cannot be iterated over directly) deviates from the basic GeometryBasics idea, and that it might create problems later when we try to put the data into Makie (plotting) or perform spatial operations on it.

So we might keep it as a last resort in case no other generalization works.

piever commented on June 24, 2024

The only tricky thing is that widening over a custom type is a bit ill-defined, as in general it's impossible to know how the parameters should change. The first step is definitely changing MetaPoint into Meta{Point} to have a Meta{Any} type.

Then we can see to what extent the widening of StructArrays works. One possible solution would be to allow custom widening. Alternatively, one could do a flattening of the structure (into a named tuple with geometry and metadata) on the fly while iterating. Then, once all the relevant vectors are created, one can easily transform the columns into a StructArray{Meta{T}} (with essentially no runtime cost).

Sov-trotter commented on June 24, 2024

@piever suggests that automatic widening for custom types is tricky, while nesting/unnesting on the fly is much easier.

using GeometryBasics, StructArrays

function maketable(iter)
    unnested_iter = Base.Generator(iter) do geom_meta
        geom = getfield(geom_meta, :main) # well, the public accessor for this
        metadata = getfield(geom_meta, :meta)
        (; geometry=geom, metadata...) # I think the GeometryBasics name for this field is `:position`
    end
    soa = fieldarrays(StructArray(unnested_iter))
    return meta(soa.geometry; Base.tail(soa)...)
end

point1 = meta(Point(2, 1), city="Delhi", rainfall=121.1)
point2 = meta(Point(2, 1), city="Delhi", rainfall=120)

maketable([point1, point2])

The above example is pretty effective for heterogeneous features/geometries, even when the metadata types are inconsistent.
This method doesn't work if soa.geometry widens to Vector{Any}. That would need a small refactor in GeometryBasics where MetaPoint, MetaPolygon, etc. become Meta{Point}, Meta{Polygon}, etc., so that it can return a Meta{Any}.

But I am unsure whether it is worth changing the @meta_type definition only for the sake of heterogeneous geometries.
Again thanks to @piever: when I mentioned this concern to him, he immediately came up with a solution. How about having a metageometry type that holds metadata together with a geometry of type Any, e.g. AnyMeta (we can obviously name it better xD)? This way we preserve the original homogeneous types while introducing a heterogeneous one. It can be declared easily as AnyMeta using the @meta_type macro.
Here's a working example:

point1 = meta(Point(2, 1), city="Delhi", rainfall=121.1)

polygon2 = PolygonMeta(Point{2, Int}[(5, 1), (3, 3), (4, 8), (1, 2), (5, 1)], city="Delhi", rainfall=44)

sa = maketable([point1, polygon2])
2-element AnyMeta{Any,Array{Any,1},(:city, :rainfall),Tuple{Array{String,1},Array{Real,1}}}:
 [2, 1]
 Polygon{2,Int64,Point.....}     

sa.any
2-element Array{Any,1}:
 [2, 1]
 Polygon{......}

sa.rainfall
2-element Array{Real,1}:
 121.1
  44

What do you think, @visr, @SimonDanisch?

Sov-trotter commented on June 24, 2024

Now that things are getting a bit clearer, we have come up with a different approach for handling metadata and are slowly working towards it, experimenting with StructArrays along the way. What we currently aim to do is put geometry and metadata separately in a Feature struct, and create an iterable StructArray of Feature structs.
Something like this works well.

using StructArrays, GeometryBasics

struct Feature{Geom, NamedTuple}
    geometry::Geom
    properties::NamedTuple
end

p1 = Point(2, 1)
p2 = Point(3, 2)

sa = StructArray([Feature(Point(1, 0), (city = "Delhi", rainfall = 121)),
                Feature(MultiPoint([p1, p2]), (city = "Goa", rainfall = 1211.1)),
                Feature(Point(1.0, 2.2), (city = "Mumbai", rainfall = 1300))])

But here we leave the NamedTuple untyped, which hurts speed in the case of homogeneous types.
It'd be nice if @piever and others could suggest something here. Given

struct Feature{Geom, Names, Types}
    geometry::Geom
    properties::NamedTuple{Names, Types}
end

is there a way to have a StructArray of type StructArray{Feature{Any, String, Float64}}?
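
As a side note on how that type is spelled, here is a small Base-only sketch. It assumes the three-parameter `Feature` above; `CityFeature` is a hypothetical alias, and plain tuples stand in for real geometries. The `Names` parameter has to be the tuple of field names, so the concrete element type with an untyped geometry and typed properties would look roughly like this:

```julia
# Same layout as the three-parameter `Feature` above.
struct Feature{Geom, Names, Types}
    geometry::Geom
    properties::NamedTuple{Names, Types}
end

# Geometry is left as `Any`; the properties are fully typed.
# `CityFeature` is a hypothetical alias used only for this sketch.
const CityFeature = Feature{Any, (:city, :rainfall), Tuple{String, Float64}}

# Tuples stand in for Point / MultiPoint geometries here.
features = CityFeature[
    CityFeature((1, 0), (city = "Delhi", rainfall = 121.0)),
    CityFeature([(2, 1), (3, 2)], (city = "Goa", rainfall = 1211.1)),
]

# Property access is type-stable even though the geometry is not.
rain = [f.properties.rainfall for f in features]
```

Whether StructArrays can collect directly into this element type is a separate question; the sketch only shows that the type itself is concrete.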

visr commented on June 24, 2024

I'd like to add that Feature here is nothing more than the typed Meta approach suggested, but being worked out outside GeometryBasics for now, in visr/GeoJSONTables.jl#3.

For construction, in most cases it'd be easiest to construct the StructArray from vectors:

StructArray(
    geometry=[Point(3, 1), Polygon(Point{2, Int}[(3, 1), (4, 4), (2, 4), (1, 2), (3, 1)])],
    city=["Abuja", "Borongan"],
    rainfall=[1221.2, 4114.0],
)

"Then, once all the relevant vectors are created, one can easily transform the columns into a StructArray{Meta{T}} (with essentially no runtime cost)."

How can we do this, given the StructArray defined above?

piever commented on June 24, 2024

I also feel that the Meta approach is a bit extreme, and something like your Feature could be safer.
Normally, if you want to create a new StructArray with the same columns but different eltype, you would just do:
StructArray{NewElType}(fieldarrays(oldstructarray)), which shares the columns, so there is no runtime cost. With Feature, it depends somewhat on whether you want to store things nested or not nested.

Nested approach

struct Feature{Geom, Props}  # `Props` is a type parameter; this name avoids shadowing Base.NamedTuple
    geometry::Geom
    properties::Props
end
sa = StructArray(
    geometry=[Point(3, 1), Polygon(Point{2, Int}[(3, 1), (4, 4), (2, 4), (1, 2), (3, 1)])],
    city=["Abuja", "Borongan"],
    rainfall=[1221.2, 4114.0],
)
geom = sa.geometry
metadata = StructArray(Base.tail(fieldarrays(sa)))
type = Feature{eltype(geom), eltype(metadata)}
feature_vec = StructArray{type}((geom, metadata))

I call this "nested" because the second column of feature_vec is itself a StructArray.

Non-nested approach

This is a bit trickier, because you want the layout of the StructArray to be unnested, unlike the layout of Feature. For that, you would need to follow the example here, overloading getproperty, createinstance, and staticschema.
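
To illustrate just the getproperty part, here is a rough Base-only sketch. `FlatFeature` is a hypothetical name, and a tuple stands in for the geometry. The StructArrays hooks (staticschema, createinstance) are left out on purpose, since their exact signatures depend on the StructArrays version.

```julia
# Hypothetical struct for this sketch: stored nested, accessed flat.
struct FlatFeature{Geom, Props<:NamedTuple}
    geometry::Geom
    properties::Props
end

# Expose the metadata keys as if they were top-level fields.
# `getfield` is used internally to avoid recursing into this method.
function Base.getproperty(f::FlatFeature, name::Symbol)
    name === :geometry && return getfield(f, :geometry)
    return getproperty(getfield(f, :properties), name)
end

# Report the flattened names, geometry first.
Base.propertynames(f::FlatFeature) = (:geometry, keys(getfield(f, :properties))...)

f = FlatFeature((3, 1), (city = "Abuja", rainfall = 1221.2))
f.city       # looked up inside `properties`
f.geometry   # the stored geometry
```

One consequence of this design is that a metadata key called `geometry` would be hidden by the real field, so such keys would need to be rejected or renamed.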
