
eyros's Introduction

peermaps

peer to peer cartography

This tool streams raw OpenStreetMap data from p2p networks so that you can perform ad-hoc extracts for arbitrary bounding boxes. Because you are pulling the data from a p2p network (and helping to host it), you also don't need to worry about http quotas or rate limiting.

example

Stream data inside arbitrary WSEN extents from the network:

$ peermaps data -155.064270 18.9136925 -154.8093872 19.9 | head
<?xml version='1.0' encoding='UTF-8'?>
<osm version="0.6" generator="osmconvert 0.8.4" timestamp="2016-11-28T01:59:58Z">
  <bounds minlat="18.9136925" minlon="-155.06427" maxlat="19.9" maxlon="-154.8093872"/>
  <node id="88994815" lat="19.7317131" lon="-155.0533157" version="3" timestamp="2012-01-19T21:23:51Z" changeset="10441415" uid="574654" user="Tom_Holland"/>
  <node id="88994817" lat="19.7312758" lon="-155.0533179" version="3" timestamp="2012-01-19T21:23:51Z" changeset="10441415" uid="574654" user="Tom_Holland"/>
  <node id="88994826" lat="19.7319167" lon="-155.0460457" version="3" timestamp="2012-01-19T21:23:51Z" changeset="10441415" uid="574654" user="Tom_Holland"/>
  <node id="88994829" lat="19.7329599" lon="-155.0463189" version="3" timestamp="2012-01-19T21:23:51Z" changeset="10441415" uid="574654" user="Tom_Holland"/>
  <node id="88994832" lat="19.7333033" lon="-155.0454221" version="3" timestamp="2012-01-19T21:23:51Z" changeset="10441415" uid="574654" user="Tom_Holland"/>
  <node id="88994836" lat="19.7336513" lon="-155.0450981" version="4" timestamp="2012-01-20T23:02:03Z" changeset="10451586" uid="574654" user="Tom_Holland"/>
  <node id="88994868" lat="19.7341231" lon="-155.0447835" version="3" timestamp="2012-01-20T23:02:03Z" changeset="10451586" uid="574654" user="Tom_Holland"/>

install

requirements:

  • node and npm
  • ipfs

Install the prerequisites, then install the peermaps command:

npm install -g peermaps

Run the ipfs daemon somewhere (in a screen for example):

ipfs daemon

Now you can use the peermaps command.

usage

peermaps data W,S,E,N {OPTIONS}

  Print all data inside the W,S,E,N extents.

  -f      Output format: osm (default), o5m, pbf, csv.
  -n      Network: ipfs (default)
  --show  Print the generated command instead of running it.

peermaps files W,S,E,N

  Print the files from the archive that overlap with the W,S,E,N extents.

  -n      Network: ipfs (default)

peermaps read FILE

  Print the content of FILE from the archive.

  -n      Network: ipfs (default)

peermaps address

  Print the address of the peermaps archive for the given network.

  -n      Network: ipfs (default)

peermaps generate INFILE {OPTIONS}

  Generate a peermaps archive at OUTDIR for INFILE.

  -o OUTDIR   Default: ./mapdata
  -t MAXSIZE  Files must be no greater than MAXSIZE. Default: 1M
  --xmin      Minimum longitude (west). Default: -180
  --xmax      Maximum longitude (east). Default: 180
  --ymin      Minimum latitude (south). Default: -90
  --ymax      Maximum latitude (north). Default: 90
  --xcount    Number of longitude divisions per branch. Default: 4
  --ycount    Number of latitude divisions per branch. Default: 4
  --nproc     Number of converter processes to spawn. Default: (`nproc`-1)

  Example:
    peermaps generate planet-latest.osm.pbf -o ~/data/planet -t 5M

  Note: this operation may take days for planet-sized inputs.

mirror

Help us mirror the archive! If you have a computer with ~38 GB of storage and some bandwidth to spare, you can run:

ipfs pin add QmXJ8KkgKyjRxTrEDvmZWZMNGq1dk3t97AVhF1Xeov3kB4

For now there is only one archive hash. In the future, there will be more archives and an update mechanism.

todo

  • generate and host vector tiles on p2p networks
  • dat/hyperdrive support
  • archive update mechanism
  • torrent/webtorrent support?
  • p2p web tile viewer
  • make the generate step much faster by patching osmconvert.c

eyros's People

Contributors

yoshuawuyts


eyros's Issues

latency spikes during batch write

Batch writes may take a second or longer for very large datasets. Instead, writes could be committed immediately to durable storage and then pushed into the LSM forest by a background thread, as in the sketch below.
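
A minimal sketch of that write path (illustrative only, not eyros's actual internals), using std threads and an mpsc channel: the caller appends the batch to a durable log and returns, while a background thread later folds it into the tree.

use std::fs::OpenOptions;
use std::io::Write;
use std::sync::mpsc;
use std::thread;

// One serialized batch of records; a stand-in for the real batch type.
struct Batch(Vec<u8>);

fn main() -> std::io::Result<()> {
    let (tx, rx) = mpsc::channel::<Batch>();

    // Background thread: drain batches and fold them into the LSM forest,
    // off the writer's critical path (merge_into_forest is a placeholder).
    let merger = thread::spawn(move || {
        for batch in rx {
            merge_into_forest(&batch.0);
        }
    });

    // Writer path: append to a durable log, sync, hand off, return fast.
    let mut log = OpenOptions::new().create(true).append(true).open("batch.log")?;
    let batch = Batch(b"...serialized records...".to_vec());
    log.write_all(&batch.0)?;
    log.sync_data()?;
    tx.send(batch).expect("merger thread alive");

    drop(tx);
    merger.join().unwrap();
    Ok(())
}

fn merge_into_forest(_bytes: &[u8]) {
    // In the real database this would push the records into the LSM forest.
}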

Query database created with ingest

I don't know if it's a bug or if I'm doing something wrong. I create an eyros database with ingest like this (both commands use eyros 4.6.1):

cargo +nightly run --release -- ingest --pbf ../../data/berlin-latest.osm.pbf --edb ../edb/

And then try to query it via:

cargo +nightly run --release --example query -- ../edb -180,-90,180,90

I get this error:

thread 'main' panicked at 'range start index 559651500 out of range for slice of length 24054', /eyros/src/bytes/from.rs:154:22

overwrite deleted data

Right now the delete feature marks documents as deleted, but it never overwrites them later. There should also be an option to overwrite the data with zeros or noise on deletion, roughly like the sketch below.
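
A rough sketch at the file level, with a hypothetical helper name (not part of eyros's API):

use std::fs::OpenOptions;
use std::io::{Seek, SeekFrom, Write};

// Overwrite `len` bytes at `offset` in `path` with zeros once the record
// there has been marked deleted. Hypothetical helper for illustration only.
fn scrub(path: &str, offset: u64, len: usize) -> std::io::Result<()> {
    let mut f = OpenOptions::new().write(true).open(path)?;
    f.seek(SeekFrom::Start(offset))?;
    f.write_all(&vec![0u8; len])?;
    f.flush()
}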

slow query times and unbounded growth in block sizes

After stress-testing by generating a db with 70 million records, queries get really sluggish. The branches are fairly balanced, but some blocks (found by logging calls to read_block()) get really large. This is probably because data fragments are never rebuilt into branches while merging.

Missing License?

I've noticed some of the other projects in peermaps have an MIT License, but this project doesn't have any licensing information. Maybe a license should be added?

mixed types

Provide a default implementation for storing scalar and interval types alongside each other in the same db. An example use-case is storing points and polygons without needing to create separate databases.
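
As a sketch of the idea (hypothetical types, not the crate's actual API), each axis of a coordinate could be either a scalar or an interval, so points and bounding boxes can live in the same tree:

// Hypothetical coordinate type: each axis is either a point (scalar)
// or an interval (min, max).
#[derive(Clone, Debug)]
enum Coord<T> {
    Scalar(T),
    Interval(T, T),
}

impl<T: PartialOrd + Copy> Coord<T> {
    // Does this coordinate overlap a query range [min, max] on the same axis?
    fn overlaps(&self, min: T, max: T) -> bool {
        match self {
            Coord::Scalar(x) => *x >= min && *x <= max,
            Coord::Interval(lo, hi) => *hi >= min && *lo <= max,
        }
    }
}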

Errors while compiling: error[E0425]: cannot find function `available_concurrency` in module `std::thread`

Hello,

I get 3 errors while compiling eyros.
It appears that available_concurrency has been renamed to available_parallelism: rust-lang/rust@6cc91cb

   Compiling eyros v4.6.0
error[E0425]: cannot find function `available_concurrency` in module `std::thread`
   --> C:\...\eyros-4.6.0\src\tree.rs:530:34
    |
530 |           let nproc = std::thread::available_concurrency().map(|n| n.get()).unwrap_or(1);
    |                                    ^^^^^^^^^^^^^^^^^^^^^ not found in `std::thread`
...
719 |   #[cfg(feature="2d")] impl_tree![Tree2,Branch2,Node2,MState2,get_bounds2,build_data2,
    |  ______________________-
720 | |   (P0,P1),(0,1),(usize,usize),(None,None),2
721 | | ];
    | |__- in this macro invocation
    |
    = note: this error originates in the macro `impl_tree` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0425]: cannot find function `available_concurrency` in module `std::thread`
   --> C:\...\eyros-4.6.0\src\tree.rs:530:34
    |
530 |           let nproc = std::thread::available_concurrency().map(|n| n.get()).unwrap_or(1);
    |                                    ^^^^^^^^^^^^^^^^^^^^^ not found in `std::thread`
...
722 |   #[cfg(feature="3d")] impl_tree![Tree3,Branch3,Node3,MState3,get_bounds3,build_data3,
    |  ______________________-
723 | |   (P0,P1,P2),(0,1,2),(usize,usize,usize),(None,None,None),3
724 | | ];
    | |__- in this macro invocation
    |
    = note: this error originates in the macro `impl_tree` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0425]: cannot find function `available_concurrency` in module `std::thread`
   --> C:\...\eyros-4.6.0\src\tree.rs:530:34
    |
530 |           let nproc = std::thread::available_concurrency().map(|n| n.get()).unwrap_or(1);
    |                                    ^^^^^^^^^^^^^^^^^^^^^ not found in `std::thread`
...
725 |   #[cfg(feature="4d")] impl_tree![Tree4,Branch4,Node4,Mstate4,get_bounds4,build_data4,
    |  ______________________-
726 | |   (P0,P1,P2,P3),(0,1,2,3),(usize,usize,usize,usize),(None,None,None,None),4
727 | | ];
    | |__- in this macro invocation
    |
    = note: this error originates in the macro `impl_tree` (in Nightly builds, run with -Z macro-backtrace for more info)

For more information about this error, try `rustc --explain E0425`.
error: could not compile `eyros` due to 3 previous errors
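
The fix is a one-line rename at src/tree.rs:530; available_parallelism() returns the same Result<NonZeroUsize, _> shape, so the rest of the expression can stay as it is:

// src/tree.rs:530 — available_concurrency() was stabilized under the name
// available_parallelism(); the surrounding code is unchanged.
let nproc = std::thread::available_parallelism().map(|n| n.get()).unwrap_or(1);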

async

Use the new romio async implementation. This should add some extra performance for parallel I/O.

variable sized values

Support variable-sized payloads. Eventually it would be good to have variable-sized points too but value payloads are more important for the peermaps roadmap.

variable sized payloads with u16 or u32 length

This is likely already possible with a custom serialization implementation, but it would be nice to have some examples. 8 bytes of length data is overkill for variable-sized payloads that have a known maximum size.
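
A hedged sketch of what such a serialization could look like, as standalone helpers (not wired into eyros's value traits), using a 2-byte little-endian length prefix:

// Encode a payload with a u16 little-endian length prefix instead of a
// full 8-byte length, for values with a known maximum size of 65535 bytes.
fn encode_u16_len(payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= u16::MAX as usize, "payload too large for u16 length");
    let mut out = Vec::with_capacity(2 + payload.len());
    out.extend_from_slice(&(payload.len() as u16).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

// Decode a u16-length-prefixed payload, returning the payload slice and the
// number of bytes consumed, or None if the buffer is too short.
fn decode_u16_len(buf: &[u8]) -> Option<(&[u8], usize)> {
    if buf.len() < 2 { return None; }
    let len = u16::from_le_bytes([buf[0], buf[1]]) as usize;
    if buf.len() < 2 + len { return None; }
    Some((&buf[2..2 + len], 2 + len))
}

With a u32 prefix the same scheme covers payloads up to 4 GiB at the cost of two extra bytes per value.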

investigate query perf from js

The batch write performance in the wasm build of eyros could be improved, but the query performance seems to diverge even more from the Rust version. Perhaps this is because batches get sent over the wasm bridge in larger chunks than query results, which stream in one by one?

data block compactness optimization

When incoming data has poor locality, the resulting data blocks (groups of records) tend to span overly large intervals, significantly reducing the quality of the partitioning for each level of the tree. Pre-filtering and post-write optimization steps can improve the quality of the block intervals at the expense of some write performance.
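
One illustrative pre-filtering step (an example of the idea, not necessarily what the project would adopt) is to sort each incoming batch by a Z-order (Morton) key so that records landing in the same data block are spatially close:

// Interleave the bits of two 32-bit grid coordinates into a 64-bit Z-order key.
fn morton_key(x: u32, y: u32) -> u64 {
    fn spread(v: u32) -> u64 {
        let mut v = v as u64;
        v = (v | (v << 16)) & 0x0000_ffff_0000_ffff;
        v = (v | (v << 8))  & 0x00ff_00ff_00ff_00ff;
        v = (v | (v << 4))  & 0x0f0f_0f0f_0f0f_0f0f;
        v = (v | (v << 2))  & 0x3333_3333_3333_3333;
        v = (v | (v << 1))  & 0x5555_5555_5555_5555;
        v
    }
    spread(x) | (spread(y) << 1)
}

// Sort a batch of (x, y, payload) records so nearby records end up in the
// same data block when written out sequentially.
fn sort_batch(records: &mut Vec<(u32, u32, Vec<u8>)>) {
    records.sort_by_key(|(x, y, _)| morton_key(*x, *y));
}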

Error trying to read/open db: block too small for length field (rust integration)

Hi!

Great work with the project :)

I have been using it for a personal project and I'm facing the following error when trying to open the DB after about 1 GB of data has been inserted:

Compat { error: ErrorMessage { msg: "block too small for length field" }

About the usage: I don't make any special calls to close the DB. I just reopen the same DB with the DB::open_from_path method each time the app is launched.

The P, V in the db inserts are in the following format:

type P = (f64, f64, u8, i64);
type V = (u8, u8);
Row::Insert(point, value);

The data is just test data, nothing important, but I would like to know whether I'm doing something wrong and how it can be prevented.

Br,
J

block cache

Finish the block cache for reads (LRU) + writes (hash map). Hopefully this will speed up both cold reads and queries once the cache is primed.
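
A toy sketch of that layout, assuming the third-party lru crate for the read side; the names and the cloning in get() are purely for illustration:

use lru::LruCache;               // assumes the `lru` crate (0.12-style API)
use std::collections::HashMap;
use std::num::NonZeroUsize;

// Illustrative block cache: an LRU for blocks read from storage, plus a
// plain HashMap for dirty blocks waiting to be flushed.
struct BlockCache {
    reads: LruCache<u64, Vec<u8>>, // offset -> block bytes
    writes: HashMap<u64, Vec<u8>>, // dirty blocks keyed by offset
}

impl BlockCache {
    fn new(capacity: usize) -> Self {
        Self {
            reads: LruCache::new(NonZeroUsize::new(capacity).expect("nonzero capacity")),
            writes: HashMap::new(),
        }
    }

    // Prefer unflushed writes over the read cache; blocks are cloned here
    // only to keep the sketch simple.
    fn get(&mut self, offset: u64) -> Option<Vec<u8>> {
        if let Some(block) = self.writes.get(&offset) {
            return Some(block.clone());
        }
        self.reads.get(&offset).cloned()
    }

    // Record a block fetched from storage.
    fn insert_read(&mut self, offset: u64, block: Vec<u8>) {
        self.reads.put(offset, block);
    }

    // Stage a block write; it stays here until flushed to storage.
    fn insert_write(&mut self, offset: u64, block: Vec<u8>) {
        self.writes.insert(offset, block);
    }
}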

Deletion example

It's not obvious how to delete rows. It would be nice to have an example of deletion.
P.S. Thank you for the lib

wasm version

This database already ought to compile to wasm, but we should also have a good API for using eyros in the browser on top of browser storage.

No longer compiles on Nightly

Unfortunately I think a recent nightly change removed the unstable backtrace method from std::error::Error, and now eyros no longer builds:

error[E0407]: method `backtrace` is not a member of trait `std::error::Error`
  --> /Users/alex/.cargo/registry/src/github.com-1ecc6299db9ec823/eyros-4.6.2/src/error.rs:31:3
   |
31 | /   fn backtrace(&'_ self) -> Option<&'_ Backtrace> {
32 | |     Some(&self.backtrace)
33 | |   }
   | |___^ not a member of trait `std::error::Error`

For more information about this error, try `rustc --explain E0407`.
error: could not compile `eyros` due to previous error

merge databases

Merge multiple databases together. This should work without requiring the presence of the raw data, so that results from multiple computers can be combined without transferring the data file (big); only the tree data (small) plus ranges for the data blocks (unknown size, probably somewhere in between) need to be transferred.

Error when opening current example

When opening the current example, the map fails with the following error in the Brave browser. Tried with both URLs:
https://ipfs.io/ipfs/QmS24zmgDz2jFdakvd6aT6sRXSGRXWJaB62aPTbvmpguBB/
ipfs://bafybeibwvqwjcptcsl5zm4gedug5mlfsltvcirxffm2zgjr5hudxct6jmy/

Uncaught (in promise) RangeError: WebAssembly.Compile is disallowed on the main thread, if the buffer size is larger than 4KB. Use WebAssembly.compile, or compile on a worker thread.
    at module.exports ((index):3208)
    at module.exports ((index):2701)
    at onDone ((index):6630)
    at notifyProgress ((index):11787)
    at onReadyStateChange ((index):11521)
    at XMLHttpRequest.xhr.onreadystatechange ((index):11471)

documentation

However Rust normally does it (rustdoc), plus the same info in the readme.

stuck in a loop

Under some conditions the database hangs while writing and one of the tree files grows unbounded in size. I suspect this is some edge case in determining whether to build a new branch.

Question: Will this repository be updated?

I see that the crate for eyros is at 4.6.2 but this repository reflects 4.6.1. The ./pkg/package.json also appears out of sync with npm. Will this repo be seeing any further commits?

I've been following the project for a while and trying things out as they become available on GitHub & npm. If there is another place to follow along, I would be interested to know where it is!
