Comments (2)
I already wrote a code snippet to efficiently (in terms of runtime) index all edges in a specific time window (i.e. for get_snapshot, in my understanding):
import torch
from torch_geometric.utils import cumsum  # assumption: PyG's cumsum, which prepends a 0 (unlike torch.cumsum)

data_min_t = data.t.min().item()
data.t = data.t - data_min_t
unique_t, t_counts = data.t.unique(return_counts=True)
# The gap between each consecutive pair of distinct timestamps tells us how many ptr entries
# (the next timestamp plus all missing timestamps before it) share the same offset
missing_steps = unique_t[1:] - unique_t[:-1]
# Create a pointer that can be indexed with a timestamp and points to the position in the edge_index where this timestamp starts
data.ptr = torch.repeat_interleave(
    cumsum(t_counts),
    torch.cat(
        [
            torch.ones(1, dtype=torch.int, device=data.t.device),
            missing_steps,
            torch.ones(1, dtype=torch.int, device=data.t.device),
        ]
    ),
)
Here I assume that data is a TemporalData object from PyG and that the edge index is sorted by time. I also assume the timestamps are integers and remap them to start at 0 for simplicity. We could also keep the original timestamps, but this could waste a lot of memory since we would store a ptr entry for potentially many timestamps that never occur.
With the code above, get_snapshot could look as follows:
def get_snapshot(start, end):
    # Returns all edges with start <= t < end (timestamps already shifted to start at 0)
    return data[data.ptr[start]:data.ptr[end]]
We could also (if we do not want to reindex but still start at the minimum) do this:
def get_snapshot(start, end):
    # start and end are given in original (unshifted) timestamps
    return data[data.ptr[start - data_min_t]:data.ptr[end - data_min_t]]
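To make the pointer idea concrete, here is a minimal pure-Python sketch of the same technique on a toy edge list. Plain lists stand in for the PyG tensors, and the names build_ptr, get_snapshot (with its extra arguments) and the toy edges are hypothetical, chosen only for illustration. It is a sketch under the same assumptions as above (integer timestamps, edges sorted by time), not the actual implementation.

```python
def build_ptr(sorted_t):
    """Return (min_t, ptr); ptr[t - min_t] is the index of the first entry with timestamp >= t."""
    min_t = sorted_t[0]
    ptr = []
    for pos, t in enumerate(sorted_t):
        # Every shifted timestamp up to t that has no entry yet starts at pos.
        while len(ptr) <= t - min_t:
            ptr.append(pos)
    ptr.append(len(sorted_t))  # sentinel so that end = max_t + 1 is a valid bound
    return min_t, ptr


def get_snapshot(edges, min_t, ptr, start, end):
    """Return all edges with start <= t < end using two pointer lookups."""
    return edges[ptr[start - min_t]:ptr[end - min_t]]


# Hypothetical toy data: (source, target, timestamp), sorted by timestamp.
edges = [("a", "b", 3), ("b", "c", 3), ("a", "c", 5),
         ("c", "d", 8), ("d", "a", 8), ("b", "d", 8)]
min_t, ptr = build_ptr([t for _, _, t in edges])
print(ptr)  # [0, 2, 2, 3, 3, 3, 6]
print(get_snapshot(edges, min_t, ptr, 3, 6))  # the three edges with t = 3, 3, 5
```

Note that ptr has one entry per integer timestamp between the minimum and the maximum (plus one sentinel), which is where the memory cost for sparse timestamps comes from.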
By trading off runtime for memory efficiency, we could also search for the correct pointers in the sorted timestamp tensor t. If the snapshot is only taken once, this is probably preferable, but if we want to use this as an iterator with a rolling time window, the first approach could save a lot of time for large datasets.
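The search-based alternative could be sketched as follows, again in pure Python with a list standing in for the sorted timestamp tensor. The helper name snapshot_bounds is hypothetical. On tensors, torch.searchsorted with its default right=False should play the role of bisect_left here; that mapping is my assumption about how one would port this, not something tested in this thread.

```python
from bisect import bisect_left


def snapshot_bounds(sorted_t, start, end):
    """Return (lo, hi) such that sorted_t[lo:hi] holds exactly the timestamps with start <= t < end."""
    lo = bisect_left(sorted_t, start)  # first position with t >= start
    hi = bisect_left(sorted_t, end)    # first position with t >= end
    return lo, hi


sorted_t = [3, 3, 5, 8, 8, 8]
lo, hi = snapshot_bounds(sorted_t, 5, 9)
print(sorted_t[lo:hi])  # [5, 8, 8, 8]
```

Each lookup costs two binary searches, O(log n) in the number of edges, and needs no extra memory, whereas the pointer table answers each query with two constant-time index operations after a one-time build.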
from pathpyg.
So there is a PyG method TemporalData.snapshot(...) which should work the same way as our intended get_snapshot(...), but it was not working for me when I tried it. Now I found the reason why: pyg-team/pytorch_geometric#3230
I.e. snapshot(...) is implemented in Data and not TemporalData. TemporalData will not be supported for much longer and will be deprecated in the future. This is also the reason why sort_by_time and other time-related methods did not work for me before when tested with TemporalData, because TemporalData just inherited them from the implementation in Data.
Long story short: we can do something like this:
def get_snapshot(start, end):
    return data.snapshot(start, end)  # data needs to be a PyG Data object, not a TemporalData object!