boathit / deepgtt

DeepGTT: Learning Travel Time Distributions with Deep Generative Model

Languages: Jupyter Notebook 75.74%, Python 14.02%, Julia 10.25%
Topics: travel-time, deep-generative-models, vae

deepgtt's Introduction

DeepGTT

This repository holds the code used in our WWW-19 paper: Learning Travel Time Distributions with Deep Generative Model.

Requirements

  • Ubuntu OS (16.04 and 18.04 are tested)
  • Julia >= 1.0
  • Python >= 3.6
  • PyTorch >= 0.4 (both 0.4 and 1.0 are tested)

Please refer to the source code for the packages required in both Julia and Python. You can install the Julia packages from the shell as follows:

julia -e 'using Pkg; Pkg.add("HDF5"); Pkg.add("CSV"); Pkg.add("DataFrames"); Pkg.add("Distances"); Pkg.add("StatsBase"); Pkg.add("JSON"); Pkg.add("Lazy"); Pkg.add("JLD2"); Pkg.add("ArgParse")'

Dataset

The dataset contains 1 million+ trips collected by 13,000+ taxi cabs over 5 days. It is a subset of the dataset used in the paper, but it suffices to reproduce results very close to those reported there.

git clone https://github.com/boathit/deepgtt

cd deepgtt && mkdir -p data/h5path data/jldpath data/trainpath data/validpath data/testpath

Download the dataset and put the extracted *.h5 files into deepgtt/data/h5path.

Data format

Each h5 file contains the n trips of one day. Each trip has three fields: lon (longitude), lat (latitude), and tms (timestamp). You can read an h5 file with the readtripsh5 function in Julia. If you want to use your own data, you can also refer to readtripsh5 to see how to dump your trajectories into the required hdf5 files.
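As a rough illustration of the per-trip structure described above, the sketch below models one trip as three aligned sequences and checks basic consistency before the data is dumped. The `Trip` class and `validate_trip` helper are hypothetical names used only for illustration; the actual on-disk HDF5 layout is defined by readtripsh5 and is not reproduced here. Treating tms as numeric, non-decreasing timestamps is also an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trip:
    """Hypothetical in-memory view of one trip (not the repository's own type)."""
    lon: List[float]  # longitudes, in degrees
    lat: List[float]  # latitudes, in degrees
    tms: List[float]  # timestamps (assumed numeric and non-decreasing)


def validate_trip(trip: Trip) -> bool:
    """Return True if the three fields are non-empty, aligned and plausible."""
    n = len(trip.lon)
    if n == 0 or len(trip.lat) != n or len(trip.tms) != n:
        return False
    if any(not -180.0 <= x <= 180.0 for x in trip.lon):
        return False
    if any(not -90.0 <= y <= 90.0 for y in trip.lat):
        return False
    # Assumption: GPS points are ordered by time.
    return all(a <= b for a, b in zip(trip.tms, trip.tms[1:]))
```

A check of this kind can catch misaligned fields in your own trajectories before they are written to the hdf5 files.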

Preprocessing

Map matching

First, set up the map server and the matching server by referring to barefoot.

Then match the trips:

cd deepgtt/harbin/julia

julia -p 6 mapmatch.jl --inputpath ../data/h5path --outputpath ../data/jldpath

where 6 is the number of CPU cores available on your machine.

Generate training, validation and test data

julia gentraindata.jl --inputpath ../data/jldpath --outputpath ../data/trainpath

cd .. && mv data/trainpath/150106.h5 data/validpath && mv data/trainpath/150107.h5 data/testpath
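The same hold-out split can be sketched in Python. The helper name `hold_out_days` is hypothetical; the default file names 150106.h5 and 150107.h5 are taken from the command above, and the sketch assumes the training directory already holds one h5 file per day.

```python
import shutil
from pathlib import Path


def hold_out_days(data_dir, valid_file="150106.h5", test_file="150107.h5"):
    """Move one day's file to validpath and another to testpath.

    Hypothetical helper mirroring the `mv` commands; it assumes that
    data_dir/trainpath already contains one h5 file per day.
    """
    data_dir = Path(data_dir)
    moves = {valid_file: data_dir / "validpath", test_file: data_dir / "testpath"}
    for name, target_dir in moves.items():
        source = data_dir / "trainpath" / name
        if not source.is_file():
            raise FileNotFoundError(f"expected {source} to exist")
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(source), str(target_dir / name))
    # Return the remaining training files so the caller can check the split.
    return sorted(p.name for p in (data_dir / "trainpath").glob("*.h5"))
```

Calling `hold_out_days("data")` from the directory that contains data/ leaves the remaining days in trainpath for training.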

Training

To run the Python code, make sure you have set up the road network PostgreSQL server by referring to the map server setup in barefoot. The road network server (see this file) provides the road segment features for the model.

cd deepgtt/harbin/python

python train.py -trainpath ../data/trainpath -validpath ../data/validpath -kl_decay 0.0 -use_selu -random_emit

Testing

python estimate.py -testpath ../data/testpath

Reference

@inproceedings{www19xc,
  author    = {Xiucheng Li and
               Gao Cong and
               Aixin Sun and
               Yun Cheng},
  title     = {Learning Travel Time Distributions with Deep Generative Model},
  booktitle = {Proceedings of the 2019 World Wide Web Conference on World Wide Web,
               {WWW} 2019, San Francisco, California, May 13-17, 2019},
  year      = {2019},
}

deepgtt's Issues

Where can I get the whole dataset?

Hello. As you mentioned in the paper, the whole dataset was collected by 13,000 taxis during 28 days in a provincial capital city in China, but you only provide 5 days. I wish to develop a model for traffic prediction for my thesis project and would like to use this dataset. Where can I get the whole dataset? I'd appreciate it if you could help me.

The performance of DeepGTT

I have tested your code on the DiDi Chengdu dataset. However, the performance (MAE 6.12) is worse than that of DeepTTE (MAE 1.68), whereas your paper reports better results than DeepTTE. I have checked the code several times and found no errors in the preprocessing or the model. Can you provide the experimental code for the Chengdu DiDi data?

The "road" id in map matching results

Hi Xiucheng,

Thanks for sharing your work. Recently, I tried to rerun your code and encountered one problem. After I built the road network of Harbin, I found that the table "bfmap_ways" has a total of 8497 rows, which means "gid" has a maximum value of 8497. However, consider some of the map matching results, for example,
"Dict{String,Any}("heading" => "forward","frac" => 0.7377781424331177,"route" => "LINESTRING (126.62966262426313 45.691261524537865, 126.62976270000001 45.6908025, 126.6300707 45.6908319, 126.62975139256004 45.692285544498645)","road" => 15054)"

Here the "road" id should correspond to gid in "bfmap_ways", but it is clearly larger than 8497. So I would like to ask how you handle this problem.

Kind Regards,
Jilin

Training Data Format

Hi Xiucheng,

Thanks for the great work. One thing I want to ask: I would like to skip the map matching step and use my own dataset. Could you please give me some suggestions about the training/validation/testing data format? Thanks.

Kind regards,

Sean

Compare with DeepTTE

Thank you for making your code public. I also have one question. I tested the DeepGTT model on the public Chengdu taxi data, and its travel time prediction results are clearly worse than DeepTTE's. Although I have tested your model parameters and checked the preprocessing methods carefully, the performance is still poor. I will build a GitHub repository next week to compare the two methods, and you are welcome to check my code.

Usage in different area

Hello Xiucheng!
We have been trying to run DeepGTT on the DiDi dataset of Chengdu, but have encountered an error when running the train.py file: mat1 dim 1 must match mat2 dim 0.

Would you happen to know what may be causing this error and how we can proceed?

To adapt to Chengdu instead of Harbin, we have followed the readme, with the following differences:

  • Map-matched using Barefoot, with the import.bash file changed to cover Chengdu. We converted the DiDi files from .txt to .h5 for use (converting the GPS coordinates from GCJ to WGS in the process).
  • Ran gentrain.py with the hyper-parameters.JSON file modified to fit Chengdu.

Apologies for the rather vague request; we have not changed much besides the above. We are a bit stumped and are having a hard time figuring out how to proceed from this error.

Thank you very much in advance,

Regards Emil

Question about the code

This paper introduces a simple and intuitive deep generative model for estimating the travel time distribution based on the trajectory and real-time traffic conditions. The idea is intriguing. However, the code for the route recovery experiment, which compares the sparse route recovery algorithm STRS with the travel times learned by DeepGTT, has not been released. Would you mind sharing the route recovery code with me?
