
rajatsen91 / deepglo

166 stars · 47 forks · 5.32 MB

This repository contains code for the paper "Think Globally, Act Locally: A Deep Neural Network Approach to High-Dimensional Time Series Forecasting" (https://arxiv.org/abs/1905.03806), along with scripts to reproduce the results in the paper.

License: Other

Python 98.61% Shell 1.39%

deepglo's Issues

Initializing Factors.....

Hi, I have a question.
If I run `python3.5 run_scripts/run_pems.py --normalize True`, the output stops at:

(228, 12672)
Initializing Factors.....

This goes on for days. Why, and what should I do? Thanks.

how to properly preprocess the raw data?

Hey guys, really impressive work and thanks for sharing the code.

We're trying to use DeepGLO on datasets other than the four used in the paper, and we got stuck at the preprocessing stage. It would be great if you could share any specifications or scripts describing how to properly preprocess the raw data from the public datasets used in the paper.

It seems there are significant differences between the original data and the processed data (e.g. electricity.npy). For example, I downloaded the raw electricity data from https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014 and resampled and filled NAs as follows.

# resample 15-minute readings into hourly totals, then zero-fill gaps
df = raw_data.resample('1H', label='left', closed='right').sum()
df.fillna(0, inplace=True)

The last 10 data points of the first series, "MT_001", in the original dataset look like this:

2014-09-07 14:00:00    63.451777
2014-09-07 15:00:00    60.913706
2014-09-07 16:00:00    58.375635
2014-09-07 17:00:00    62.182741
2014-09-07 18:00:00    77.411168
2014-09-07 19:00:00    36.802030
2014-09-07 20:00:00    13.959391
2014-09-07 21:00:00    46.954315
2014-09-07 22:00:00    65.989848
2014-09-07 23:00:00    65.989848

On the other hand, the last 10 data points of the first series in electricity.npy look like the following. The values are clearly very different from the original time series:

array([3.8071, 3.8071, 5.0761, 6.3452, 6.3452, 7.6142, 7.6142, 7.6142,
       7.6142, 7.6142])

Maybe I've missed something here.
It would be really helpful if you could share how electricity.npy is produced from the raw data.

[question] please tell me how to generate electricity.npy

Thanks for releasing your source code.

I have a question about how the dataset was generated.
When we download the electricity dataset from the link in your paper,
there is a text file (LD2011_2014.txt) with readings every 15 minutes from 2011-01-01 to 2015-01-01.

So, I wonder how this .txt file was converted to the .npy file in your Google Drive.
Please tell me how to generate electricity.npy.

Best Regards.
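The exact pipeline behind electricity.npy is not documented in the repo, so the following is only a minimal sketch of one plausible conversion, assuming the UCI file's format (semicolon separators, comma decimal marks, timestamps in the first column). A tiny inline sample stands in for LD2011_2014.txt:

```python
import io
import numpy as np
import pandas as pd

# Hypothetical sketch -- not the authors' actual preprocessing.
# A small inline sample stands in for LD2011_2014.txt.
sample = (
    ';MT_001;MT_002\n'
    '2011-01-01 00:15:00;1,25;2,50\n'
    '2011-01-01 00:30:00;1,25;2,50\n'
    '2011-01-01 00:45:00;1,25;2,50\n'
    '2011-01-01 01:00:00;1,25;2,50\n'
)
df = pd.read_csv(io.StringIO(sample), sep=';', decimal=',',
                 index_col=0, parse_dates=True)
# aggregate the four 15-minute readings of each hour into one hourly value
hourly = df.resample('1h', label='left', closed='right').sum()
arr = hourly.to_numpy().T   # shape (n_series, n_timesteps), as in the .npy
np.save('electricity.npy', arr)
print(arr.shape)  # (2, 1)
```

Whether the published .npy used `sum()`, `mean()`, or some rescaling of the hourly values is exactly the open question in these issues; the sketch only shows the mechanics of going from the .txt to a series-by-time array.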

Queries on this Model

  1. I used the Python file below
    https://github.com/intel-analytics/analytics-zoo/blob/master/pyzoo/zoo/zouwu/examples/run_electricity.py
    to process traffic.npy, which was taken from this link:
    https://github.com/rajatsen91/deepglo/blob/master/datasets/download-data.sh
    But the .npy files don't have any headers and contain only array data. Also, from the Python file we could not tell which parameters it uses for training. Could you explain this in more detail, or point to any available documentation?

  2. When we executed training for the traffic model from https://github.com/rajatsen91/deepglo/tree/master/
    with `python run_scripts/run_traffic.py --normalize True`, it threw the error "RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx" because no NVIDIA graphics card is available on my system. Is there any other way to train this model?
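The prediction path in DeepGLO.py does take a `cpu` argument, but whether the training scripts can run without CUDA is unclear; they may need their `.cuda()` calls patched. A generic PyTorch device-fallback sketch (a toy model stands in for DeepGLO; this is an assumption, not a repo feature) looks like:

```python
import torch

# Generic PyTorch fallback sketch: pick the GPU when a CUDA driver is
# present, otherwise fall back to the CPU. Patching DeepGLO's .cuda()
# calls to use such a device is one possible workaround (an assumption).
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = torch.nn.Linear(4, 1).to(device)   # toy model standing in for DeepGLO
x = torch.randn(8, 4, device=device)
y = model(x)
print(y.shape)  # torch.Size([8, 1])
```

Training on CPU will be much slower than on a GPU, but it avoids the "Found no NVIDIA driver" error entirely.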

which predictions are the final predictions?

In the paper, the authors state that the final global predictions can be made as F X(te), where X(te) is forecasted with the local model. Are these final global predictions also the DeepGLO predictions, i.e. the predictions of the proposed model?
The confusion is that the code reports both a global WAPE and a plain WAPE; which one is the DeepGLO WAPE?

[question] creating global covariates

Hi, thanks for sharing your code.

I have a question about creating global covariates.

In prediction, the global covariates are calculated as F * TX(X), as stated in the paper.

deepglo/DeepGLO/DeepGLO.py

Lines 619 to 630 in 54e0644

yc = self.predict_global(
    ind=ind,
    last_step=last_step,
    future=future,
    cpu=cpu,
    normalize=False,
    bsize=bsize,
)
if self.period is None:
    ycovs = np.zeros(shape=[yc.shape[0], 1, yc.shape[1]])
if self.forward_cov:
    ycovs[:, 0, 0:-1] = yc[:, 1::]

However, in training, the global covariates seem to be generated by TX from the input sequence directly, instead of from the factorized F and X (i.e. F*X). Is there a reason for this?

https://github.com/rajatsen91/deepglo/blob/54e0644d764f1ead65d4203b72c8634e2f6ea25e/DeepGLO/DeepGLO.py#L510-520

Best Regards.
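To make the distinction asked about here concrete, below is a small NumPy sketch (toy names and sizes, not DeepGLO's actual code): when the factorization Y ≈ F·X is only approximate, a covariate built from the raw input Y differs from one rebuilt from the factors F and X.

```python
import numpy as np

# Toy illustration (not DeepGLO code): with an approximate factorization
# Y ~ F @ X, a covariate taken from the raw input Y is not the same as
# one reconstructed from the factors F and X.
n, k, t = 5, 2, 8                            # series, rank, timesteps
rng = np.random.default_rng(0)
F = rng.normal(size=(n, k))
X = rng.normal(size=(k, t))
Y = F @ X + 0.1 * rng.normal(size=(n, t))    # input with factorization error

cov_from_input = Y          # what training appears to use
cov_from_factors = F @ X    # what prediction uses (F * TX(X) in the paper)
print(np.allclose(cov_from_input, cov_from_factors))  # False
```

The gap between the two is exactly the factorization residual, which is presumably why the question matters in practice.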

Piece of Code

Hey,
Could you tell me what this line does?
self.val_index = np.random.randint(0, n - self.vbsize - 5)
In each batch the indexing fails.
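One plausible explanation (an assumption; the names `n` and `vbsize` are taken from the quoted line): `np.random.randint(0, high)` raises `ValueError` when `high <= 0`, which happens whenever the series length is not comfortably larger than the validation batch size. A minimal sketch with an explicit guard:

```python
import numpy as np

# Sketch of why the quoted line can fail: np.random.randint(0, high)
# raises ValueError when high <= 0, i.e. whenever n <= vbsize + 5.
def pick_val_index(n, vbsize):
    high = n - vbsize - 5
    if high <= 0:
        raise ValueError(f'series too short: n={n}, vbsize={vbsize}')
    return np.random.randint(0, high)

print(pick_val_index(100, 20))  # a random start index in [0, 75)
```

If this is the cause, either shortening `vbsize` or using longer series should make the sampling succeed.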

Factor training losses not contracting

Hi, this is an impressive project, and thanks for sharing it with the community!

I've been trying to train the model on my test data, which has about 70 series, each with about 2300 timesteps.

However, in the final stage, the Recovery Loss in Rolling Validation gets bigger and bigger each round, and early-stops at 0.308, which leads to much worse wape and wape_global metrics than the baseline:
{'wape': 0.39331427, 'mape': 0.36864823, 'smape': 0.4937316, 'mae': 5.487852, 'rmse': 8.775283, 'nrmse': 0.47228432, 'wape_global': 0.582235, 'mape_global': 0.56812644, 'smape_global': 0.84549224, 'mae_global': 8.123833, 'rmse_global': 11.685119, 'nrmse_global': 0.47228432, 'baseline_wape': 0.11834013, 'baseline_mape': 0.11296055, 'baseline_smape': 0.11496856}

Could you provide some insight into how I can improve the training and get better results?

Thanks!!!

===========================================================

Last round of Recovery Loss stats:
GLO: rolling_validation(): Current window wape: 0.5014139
GLO: recover_future_X(): Recovery Loss(0/100000): 1.002367615699768
GLO: recover_future_X(): Recovery Loss(1000/100000): 0.628299355506897
GLO: recover_future_X(): Recovery Loss(2000/100000): 0.4282535910606384
GLO: recover_future_X(): Recovery Loss(3000/100000): 0.3461550176143646
GLO: recover_future_X(): Recovery Loss(4000/100000): 0.3201618790626526
GLO: recover_future_X(): Recovery Loss(5000/100000): 0.310817688703537
GLO: recover_future_X(): Recovery Loss(6000/100000): 0.3080223500728607
GLO: recover_future_X(): Recovery Loss(7000/100000): 0.3077664375305176
GLO: rolling_validation(): Current window wape: 0.45383096

In addition, Factorization Loss F, Factorization Loss X, and Validation Loss ended around (0.214, 0.205, 0.294) before early stopping, while the Temporal Loss hovered around 0.017.

Training of Xseq and Yseq progressed down to (training loss, validation loss) of (0.074, 0.052) and (0.054, 0.021), respectively.
