Comments (9)
An example notebook has been made by Thijs. The next step is to implement the Morton ordering as a function.
from stmtools.
So, a function for reordering should be made part of the stm extension to xarray (in stmtools.git: stmtools/stm.py).
Ideally, only evaluate the point coordinates to reduce the strain on memory (delayed processing).
It could be that any reordering operation on an xarray object will have to evaluate all the point attributes. In that case, we may also have to implement some sort of redirection array (with only x, y, and the index in the original array).
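A minimal sketch of such a redirection array, using plain NumPy rather than the stmtools API. The helper name `build_redirection` and the placeholder key function are hypothetical; only the coordinates are evaluated to compute the permutation, which can then be applied to any other attribute later.

```python
import numpy as np


def build_redirection(x, y, key_func):
    """Return the permutation that sorts points by key_func(x, y).

    Hypothetical helper: only the coordinates are needed here, so the
    other point attributes do not have to be loaded to compute the order.
    """
    keys = key_func(np.asarray(x), np.asarray(y))
    # A stable sort keeps the original relative order of points with equal keys.
    return np.argsort(keys, kind="stable")


x = np.array([3, 0, 2, 1])
y = np.array([0, 0, 0, 0])

# Placeholder key; a Morton code or geohash would be used in practice.
order = build_redirection(x, y, lambda xs, ys: xs + ys)

# order[i] is the index in the original array of the point at sorted position i.
amplitude = np.array([30.0, 0.0, 20.0, 10.0])
sorted_amplitude = amplitude[order]
```

The same `order` array can be reused for every attribute, so the permutation is computed once from the coordinates only.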
I looked at a few lightweight Morton ordering Python tools.
A very generic and simple one is trevorprater/pymorton. This one has two disadvantages, though:
- If you want to order lat-lon pairs, the output is a base-4 string.
- More importantly, you cannot specify the precision. This is always set to 32 or 64 bits depending on your system.
There are several geohashing Python tools. The one that is currently most popular is https://pypi.org/project/python-geohash/
This package is less generic in that it expects lat-lon pairs (which is fine for our purpose), but it does allow setting the precision, and it outputs the hash in base-32. Additionally, the functional part is implemented in C++ as opposed to Python.
Unfortunately, it is poorly documented, but this need not be a problem because of its limited scope and straightforward functionality.
I will have to check whether computation time could be a limiting factor for either tool.
Once the ordering hash/index is computed, the sorting can be done with the `sortby` function.
If the single column of the ordering hash/index is too big to persist in memory, we can first write the ordering index to disk using the old chunks, then reload the whole dataset lazily, and finally sort by the lazy index.
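A small sketch of the sorting step with xarray's `Dataset.sortby`, on an in-memory toy dataset. The dimension names `space` and `time`, the variable `amplitude`, and the coordinate name `order` are assumptions for illustration; the write-to-disk and lazy-reload steps are not shown.

```python
import numpy as np
import xarray as xr

# Toy dataset: 4 points along "space", 2 epochs along "time" (assumed names).
ds = xr.Dataset(
    data_vars={"amplitude": (("space", "time"), np.arange(8.0).reshape(4, 2))},
    # Precomputed ordering index (e.g. a Morton code) stored as a coordinate.
    coords={"order": ("space", np.array([2, 0, 3, 1]))},
)

# Sort all variables along "space" by the values of the "order" coordinate.
ds_sorted = ds.sortby("order")
```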
We also decided to (initially) sort by image (pixel) coordinates.
This has the advantage of being a local coordinate system (less precision is needed to fully specify each point)
and makes producing an integer hash more intuitive.
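To illustrate how an integer hash follows from pixel coordinates, here is a sketch of a 2D Morton code computed by bit interleaving in NumPy. It is not the function that was added to stmtools; the names `morton_code` and `_spread_bits` are hypothetical, and the sketch assumes non-negative row and column indices below 2**16.

```python
import numpy as np


def _spread_bits(v):
    """Insert a zero bit between each of the lower 16 bits of v."""
    v = np.asarray(v).astype(np.uint32) & np.uint32(0x0000FFFF)
    v = (v | (v << np.uint32(8))) & np.uint32(0x00FF00FF)
    v = (v | (v << np.uint32(4))) & np.uint32(0x0F0F0F0F)
    v = (v | (v << np.uint32(2))) & np.uint32(0x33333333)
    v = (v | (v << np.uint32(1))) & np.uint32(0x55555555)
    return v


def morton_code(row, col):
    """Interleave the bits of row and col into one 32-bit integer.

    Assumes 0 <= row, col < 2**16. Row bits go to the odd bit positions,
    column bits to the even bit positions.
    """
    return (_spread_bits(row) << np.uint32(1)) | _spread_bits(col)


rows = np.array([0, 0, 1, 1, 2])
cols = np.array([0, 1, 0, 1, 3])
codes = morton_code(rows, cols)
```

Sorting the points by `codes` (for instance with `np.argsort`) then gives a Morton (Z-order) traversal of the pixel grid.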
Also, we briefly discussed the timing of the sorting procedure.
Ideally this should be done immediately after pixel selection to prevent writing data chunks that will have to be overwritten after sorting.
However, we also need to be able to work with pre-existing data that is already chunked.
Maybe this means there should be two sorting procedures, or at least two ways of commencing the sorting.
Example delayed function: stm.py:enrich_from_polygon -> xr.map_blocks(...)
Better yet: sarxarray/stack.py:_get_phase(...) -> da.apply_gufunc(...)
Note: apply_gufunc expects the function itself, the 'signature', the function arguments, and then the meta or dtype for the output of the function (the `meta` or `output_dtypes` keyword).
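A minimal sketch of that call pattern with `dask.array.apply_gufunc`. The function `_toy_key` is a hypothetical stand-in for a real ordering function, not the code in sarxarray or stmtools; the point is only the argument order and the lazy result.

```python
import dask.array as da
import numpy as np


def _toy_key(row, col):
    # Hypothetical element-wise key, standing in for a real ordering hash.
    return row * 2 + col


rows = da.from_array(np.array([0, 1, 2, 3]), chunks=2)
cols = da.from_array(np.array([1, 0, 1, 0]), chunks=2)

# Arguments: the function, the signature, the inputs, then the output dtype.
# "(),()->()" means two scalar inputs map to one scalar output per element.
keys = da.apply_gufunc(_toy_key, "(),()->()", rows, cols, output_dtypes=rows.dtype)

# Nothing is evaluated until compute() is called.
result = keys.compute()
```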
Function added via #56. Documentation needs to be added (#57).