signaflo / java-timeseries Goto Github PK

193.0 193.0 49.0 4.5 MB

Time series analysis in Java

License: MIT License

Java 100.00%

forecasting forecasts java time-series timeseries

java-timeseries's Issues

Save trained model

Hi, thanks for your implementation; it has been very useful to me.

I just want to know whether there is a way to save an ARIMA model to disk for future predictions.

Tutorials, Examples, Documentation page

It would be nice if there were formal documentation somewhere, with a "quick start" and so on.

I'm not a Java programmer, and with that kind of documentation other non-Java programmers could quickly see how to use the library.

What I mean by documentation is not formal API documentation but rather tutorials and examples.

I would like to use the library to implement ARIMA models for some data.

Arima forecast requires complete time series

Currently, one needs to provide the complete time series in order to build an ARIMA model and subsequently obtain forecasts. In a streaming application, where the time series may be of unbounded length, this approach becomes infeasible.

Let me briefly introduce my use case:
I'm trying to use this framework in an application where I consume a time series in a streaming fashion. Using window-based aggregation, I downsample the series to exactly one value per 15 minutes. Given already fitted coefficients, I need to forecast one step ahead.
When evaluating the forecast method, I have access to the last p time series values and the last q errors¹ (but not the whole time series).

What I propose is a method that allows forecasting through a "partial" ARIMA model given by

model(TimeSeries lastP_Observations, TimeSeries lastQ_Errors, ArimaCoefficients coeffs)

¹ The errors are basically obtained by joining the time series stream with the time shifted forecast stream and calculating the differences.
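
The proposal above can be sketched in plain Java, without using the library. This is only an illustration under stated assumptions: the series is already stationary (d = 0), the MA terms enter with a plus sign, and the class name, method name, and `intercept` parameter are hypothetical, not part of java-timeseries.

```java
// Hypothetical helper, not part of java-timeseries.
final class PartialArmaForecaster {

    // One-step-ahead ARMA forecast from the last p observations and last q errors.
    // Both arrays are ordered oldest first, so the last element belongs to time t.
    // Assumed model: y[t+1] = intercept + sum(ar[i] * y[t-i]) + sum(ma[j] * e[t-j]) + e[t+1]
    static double forecastNext(double[] lastObservations, double[] lastErrors,
                               double[] ar, double[] ma, double intercept) {
        if (lastObservations.length < ar.length || lastErrors.length < ma.length) {
            throw new IllegalArgumentException("Need at least p observations and q errors.");
        }
        double forecast = intercept;
        for (int i = 0; i < ar.length; i++) {
            forecast += ar[i] * lastObservations[lastObservations.length - 1 - i];
        }
        for (int j = 0; j < ma.length; j++) {
            forecast += ma[j] * lastErrors[lastErrors.length - 1 - j];
        }
        // The future error e[t+1] has expectation zero, so it does not appear here.
        return forecast;
    }
}
```

Only p + q numbers need to be kept between calls, which is what makes this shape attractive for a streaming consumer. For a model with d > 0, the differencing would have to be undone by the caller.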

Scala: math package conflicting with scala.math

It's very common for Scala developers to use the scala.math package without the scala prefix, as it's imported by default. Naming your package math conflicts with this, such that simply adding java-timeseries as a dependency will break the build. Maybe the package should be prefixed with your own vendor prefix to avoid these conflicts?

Maximum step reductions, 25

Hello,
When I try to predict with ARIMA on some time series, I get this WARN:

40 [main] WARN com.github.signaflo.math.optim.BFGS - Maximum step reductions, 25

and the prediction is not calculated and appears as NaN. For example:

I have a time series with 20 items. When I try to predict one value with ARIMA(1,0,0), the warning appears. However, if I change to 21 items, it doesn't.

My code:

        timeSeries = Ts.newAnnualSeries(1975, DoubleFunctions.arrayFrom(data));
        modelOrder = ArimaOrder.order(p, d, q);
        model = Arima.model(timeSeries, modelOrder);
        Forecast forecast = model.forecast(1);
        TimeSeries prediccion = forecast.pointEstimates();
        System.out.println(prediccion.asList().get(0));

My TS

1.339910000000000082e+03
1.569730000000000018e+03
1.751859999999999900e+03
1.965369999999999891e+03
2.246050000000000182e+03
2.495900000000000091e+03
2.724000000000000000e+03
2.650000000000000000e+03
2.679699999999999818e+03
2.794699999999999818e+03
2.828500000000000000e+03
3.521099999999999909e+03
3.931500000000000000e+03
4.941399999999999636e+03
5.388399999999999636e+03
5.766199999999999818e+03
5.937899999999999636e+03
5.627399999999999636e+03
5.982699999999999818e+03
5.931600000000000364e+03

Any ideas?

Thanks.

What if my time series has missing values

Is there a way to use this model if my time series has some values missing?

Implement causality, stationarity, and invertibility checks

Clients should be able to see whether the model they built is causal, stationary, and invertible.

Positive time series value predictions

Hi,
Is there any way to restrict predictions to positive values? The time series I'm working with can only take positive values, but the ARIMA forecast sometimes gives me negative values.

ARIMA is unstable on incomplete seasons

I am making ARIMA predictions on monthly data with yearly seasonality. With 24 months of history, the prediction is as good as can be expected with so little data, but if I use 25 months of history, the prediction is very unstable (slightly modifying one measurement can result in a massive trend change).

Example input: 10 | 11 | 12 | 8 | 6 | 4 | 5 | 6 | 6 | 5 | 4 | 3 | 2 | 2.5 | 3 | 2 | 1 | 0.5 | 0.1 | 0 | 1 | 0.2 | 1 | 0 | 0.3 | 0.4 | 0.2
with the following ARIMA parameters:
p = 0
d = 1
q = 1
aP = 0
aD = 1
aQ = 1,
the forecasts are increasing (although the history is clearly decreasing). Changing the second value (the 11) to a 13 or a 9 makes the predictions seem correct.

When I perform the same prediction with only the first 24 values, the prediction is not as sensitive, and changing one input value only marginally affects the prediction.

See screenshot:

I suspect this is due to this line in ArimaModel.java:
this.differencedSeries = observations.difference(1, order.d()).difference(seasonalFrequency, order.D());
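
To make the effect of that line concrete, here is a minimal plain-Java sketch of lagged differencing. It assumes that `difference(lag, times)` shortens the series by `lag * times` values, which matches the usual definition but is not confirmed here against the library source; the class below is hypothetical.

```java
// Hypothetical stand-in for the differencing step, not the library's code.
final class Differencing {

    // Lagged difference: out[t] = x[t + lag] - x[t]. The result is lag values shorter.
    static double[] difference(double[] x, int lag) {
        if (lag < 1 || lag >= x.length) {
            throw new IllegalArgumentException("lag must be in [1, length - 1].");
        }
        double[] out = new double[x.length - lag];
        for (int t = 0; t < out.length; t++) {
            out[t] = x[t + lag] - x[t];
        }
        return out;
    }
}
```

Under that assumption, applying d = 1 at lag 1 and then D = 1 at lag 12 leaves 24 - 1 - 12 = 11 values from 24 months and 12 values from 25 months, so in both cases the model is estimated from very few differenced observations.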

help with regression model

Hi, first of all I would like to thank you for the library you developed for Java; it was a great help to me in understanding how ARIMA and regression models work.
I would like to know how to implement a regression model, because I couldn't find it on the project wiki.
Best regards.

predict by day

Hello, after reading the API documentation (https://javadoc.io/doc/com.github.signaflo/timeseries/0.4), I found that TimeSeries only supports weekly, monthly, or quarterly predictions, but I want to build a model that predicts by day. My data changes over the weekend (it is low on weekends and high on workdays). Is there any other way to predict my data using ARIMA? Would using TimeSeries.from(double[] d) give the best prediction? Waiting for your reply, @signaflo.

Need some help getting started with your ARIMA API! Thanks (*￣︶￣)

Hi, @signaflo!
Sorry to bother you. I have to say you've done wonderful work, and it has helped me a lot with ARIMA. As you know, there is very little material about ARIMA in Java, let alone implementations of the model. Your work has benefited me a lot.
However, I still have some problems and I sincerely ask for your help. My first question: I could not find the main function showing how to use the ARIMA model to predict (I mean the specific process), so can you tell me how to use your API to run an example?
My second question: is there a function for checking the stability of the time series, which is the foundation for using ARIMA?
I would appreciate your reply if you can spare some time to help me.
Good day!

ArimaCoefficients need refactoring and public access

The ArimaCoefficients is a simple structure containing arrays of doubles representing the AR and MA parameters in addition to the mean/drift terms.

There are a few issues. One is that we need to differentiate between a coefficient that is fixed and known, and therefore has no standard error, and a coefficient that either has been or is in the process of being estimated, and therefore should have an associated uncertainty measure.

Another issue is that the only way to obtain information about standard errors of the coefficients in the Arima class is by getting a flat array. Knowing which coefficient corresponds to which standard error in the array is then a matter of guesswork.

Finally, the coefficients in the Arima class are not public, which has been a problem for many users.

To fix these issues, the following tasks should be completed:

Make coefficient objects public, so they can be accessed from the Arima class.
Create a coefficient object hierarchy to represent model coefficients.
The hierarchy should allow us to differentiate between fixed and estimated standard errors.

It's possible that changes should be made to the Arima class itself. For example, an ARIMA model whose coefficients are fixed is different from an ARIMA model whose coefficients are estimated through an optimization routine, but it is unclear whether that difference needs to be explicitly represented in the code structure.
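
One possible shape for such a hierarchy is sketched below. All type and method names are hypothetical and chosen only for illustration; nothing here reflects a design the maintainers have settled on.

```java
// Hypothetical coefficient hierarchy, for discussion only.
interface Coefficient {
    double value();
    boolean isEstimated();
}

// A coefficient that is fixed and known in advance, so it carries no standard error.
final class FixedCoefficient implements Coefficient {
    private final double value;

    FixedCoefficient(double value) {
        this.value = value;
    }

    @Override
    public double value() {
        return value;
    }

    @Override
    public boolean isEstimated() {
        return false;
    }
}

// A coefficient produced by an estimation routine, paired with its own standard error.
final class EstimatedCoefficient implements Coefficient {
    private final double value;
    private final double standardError;

    EstimatedCoefficient(double value, double standardError) {
        this.value = value;
        this.standardError = standardError;
    }

    @Override
    public double value() {
        return value;
    }

    @Override
    public boolean isEstimated() {
        return true;
    }

    public double standardError() {
        return standardError;
    }
}
```

Keeping the standard error on the coefficient object itself would remove the need to match positions in a flat array of standard errors.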

How to resample a time series data?

Hi there, I have time series data that I want to resample at a given time interval using two methods, i.e. by close and by average.
Is there any way to achieve this, similar to the functionality in Python pandas?

dataframe.resample('3T').mean()

If it is possible, how do I do it? If not, are there any suggestions?
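
Independent of the library, both resampling methods can be sketched in plain Java. The sketch assumes timestamps are epoch seconds sorted in ascending order and that buckets are aligned to multiples of the bucket width; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of java-timeseries.
final class Resampler {

    // Start of the fixed-width bucket that contains the given timestamp.
    private static long bucketStart(long epochSecond, long bucketSeconds) {
        return Math.floorDiv(epochSecond, bucketSeconds) * bucketSeconds;
    }

    // Average of all values falling into each bucket, keyed by bucket start.
    static Map<Long, Double> resampleMean(long[] epochSeconds, double[] values, long bucketSeconds) {
        Map<Long, double[]> sums = new TreeMap<>(); // bucket start -> {sum, count}
        for (int i = 0; i < values.length; i++) {
            double[] acc = sums.computeIfAbsent(
                    bucketStart(epochSeconds[i], bucketSeconds), k -> new double[2]);
            acc[0] += values[i];
            acc[1] += 1.0;
        }
        Map<Long, Double> means = new TreeMap<>();
        for (Map.Entry<Long, double[]> entry : sums.entrySet()) {
            means.put(entry.getKey(), entry.getValue()[0] / entry.getValue()[1]);
        }
        return means;
    }

    // Last ("close") value in each bucket; relies on the input being sorted by time.
    static Map<Long, Double> resampleClose(long[] epochSeconds, double[] values, long bucketSeconds) {
        Map<Long, Double> closes = new TreeMap<>();
        for (int i = 0; i < values.length; i++) {
            closes.put(bucketStart(epochSeconds[i], bucketSeconds), values[i]);
        }
        return closes;
    }
}
```

A three-minute interval, as in the pandas call above, corresponds to `bucketSeconds = 180`. The resulting values could then be passed to the library as a plain `double[]`.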

ArrayIndexOutOfBoundsException

We are getting an ArrayIndexOutOfBoundsException when using an ARIMA model with yearly seasonality (TimePeriod.oneYear()). Is there a minimum number of points for training?
(We tried many different data sets and it always ends with this error.) Here is the stack trace:

java.lang.ArrayIndexOutOfBoundsException: -1809947671

at com.github.signaflo.timeseries.model.arima.ArimaKalmanFilter.inclu2(ArimaKalmanFilter.java:331)
at com.github.signaflo.timeseries.model.arima.ArimaKalmanFilter.getInitialStateCovariance(ArimaKalmanFilter.java:273)
at com.github.signaflo.timeseries.model.arima.ArimaKalmanFilter.initializePredictedCovariance(ArimaKalmanFilter.java:167)
at com.github.signaflo.timeseries.model.arima.ArimaKalmanFilter.<init>(ArimaKalmanFilter.java:68)
at com.github.signaflo.timeseries.model.arima.ArimaModel.kalmanFit(ArimaModel.java:288)
at com.github.signaflo.timeseries.model.arima.ArimaModel.access$500(ArimaModel.java:61)
at com.github.signaflo.timeseries.model.arima.ArimaModel$OptimFunction.at(ArimaModel.java:685)
at com.github.signaflo.math.optim.BFGS.<init>(BFGS.java:85)
at com.github.signaflo.timeseries.model.arima.ArimaModel.<init>(ArimaModel.java:126)
at com.github.signaflo.timeseries.model.arima.ArimaModel.<init>(ArimaModel.java:80)
at com.github.signaflo.timeseries.model.arima.Arima.model(Arima.java:64)
at com.yahoo.digits.druid.forecastquery.model.ArimaModel.train(ArimaModel.java:102)
at com.yahoo.digits.druid.forecastquery.model.ArimaModelTest.testArimaModel(ArimaModelTest.java:43)

Consider replacing OffsetDateTime with library Time class

OffsetDateTime is a critical but annoying piece of the library. I don't want external users of the library to have to mess with it, though there may be good reasons for keeping it around.

I would prefer to wrap it in a simple Time class that delegates the small chunk of behavior we need from it. If anything, we could get a much prettier toString representation and make it much easier to create an instance of the class.

Attention: there might be some mistake in the code.

Hi, thanks for your work on java-timeseries! I have been reading this open-source project recently and found that there might be a mistake in the code. Please see the class ArimaCoefficients.

There may be errors in the calculations of the methods expandArCoefficients(...) and expandMaCoefficients(...).

For example,

double[] arCoeffs = new double[] {1, 3, 5};
double[] sarCoeffs = new double[] {2};
int seasonalFrequency = 2;

double[] expandArcoeffs = new double[arCoeffs.length + sarCoeffs.length * seasonalFrequency]; 
expandArcoeffs = expandArCoefficients(arCoeffs, sarCoeffs, seasonalFrequency);

With the method expandArCoefficients(...), the array expandArcoeffs is equal to
{1.0, 2.0, -2.0, -6.0, -10.0}.
In fact, by simple polynomial multiplication, the array expandArcoeffs should be equal to
{1.0, 5.0, 3.0, -6.0, -10.0}.
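
The expected array can be checked by hand. Assuming the usual convention that an AR polynomial is written as $1 - \phi_1 B - \phi_2 B^2 - \dots$ (which is also what `arC[0] = -1.0` in the code proposed further down implies), the product of the non-seasonal and seasonal polynomials for this example is:

```latex
\begin{aligned}
\phi(B)\,\Phi(B^{2}) &= (1 - B - 3B^{2} - 5B^{3})(1 - 2B^{2}) \\
&= 1 - B - 3B^{2} - 5B^{3} - 2B^{2} + 2B^{3} + 6B^{4} + 10B^{5} \\
&= 1 - B - 5B^{2} - 3B^{3} + 6B^{4} + 10B^{5}
\end{aligned}
```

Reading off the negated coefficients of $B^1$ through $B^5$ gives 1, 5, 3, -6, -10, which agrees with the second array.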

Furthermore, if arCoeffs.length >= seasonalFrequency, then the method expandArCoefficients(...) may produce wrong results.
And if maCoeffs.length >= seasonalFrequency, then the method expandMaCoefficients(...) may produce wrong results.

In my opinion, the correct code may be as follows.

static double[] expandArCoefficients(final double[] arCoeffs, final double[] sarCoeffs,
                                         final int seasonalFrequency) {
        double[] arC = new double[arCoeffs.length+1];
        double[] sarC = new double[sarCoeffs.length+1];
        double[] arSarCoeffs = new double[arCoeffs.length + sarCoeffs.length * seasonalFrequency];
        double[] arSarC = new double[arSarCoeffs.length+1];

        arC[0] = -1.0;
        sarC[0] = -1.0;
        System.arraycopy(arCoeffs, 0, arC, 1, arCoeffs.length);
        System.arraycopy(sarCoeffs, 0, sarC, 1, sarCoeffs.length);

        // Note that we take into account the interaction between the seasonal and non-seasonal coefficients,
        // which arises because the model's ar and seasonal ar polynomials are multiplied together.
        for (int i = 0; i < arC.length; i++) {
            for (int j = 0; j < sarC.length; j++) {
                arSarC[i + j* seasonalFrequency] += -arC[i] * sarC[j];
            }
        }
        System.arraycopy(arSarC, 1, arSarCoeffs, 0, arSarCoeffs.length);
        return arSarCoeffs;
    }

    // Expand the moving average coefficients by combining the non-seasonal and seasonal coefficients into a single
    // array, which takes advantage of the fact that a seasonal MA model is a special case of a non-seasonal
    // MA model with zero coefficients at the non-seasonal indices.
    static double[] expandMaCoefficients(final double[] maCoeffs, final double[] smaCoeffs,
                                         final int seasonalFrequency) {

        double[] maC = new double[maCoeffs.length+1];
        double[] smaC = new double[smaCoeffs.length+1];
        double[] maSmaCoeffs = new double[maCoeffs.length + smaCoeffs.length * seasonalFrequency];
        double[] maSmaC = new double[maSmaCoeffs.length+1];

        maC[0] = 1.0;
        smaC[0] = 1.0;
        System.arraycopy(maCoeffs, 0, maC, 1, maCoeffs.length);
        System.arraycopy(smaCoeffs, 0, smaC, 1, smaCoeffs.length);

        // Note that we take into account the interaction between the seasonal and non-seasonal coefficients,
        // which arises because the model's ma and seasonal ma polynomials are multiplied together.
        for (int i = 0; i < maC.length; i++) {
            for (int j = 0; j < smaC.length; j++) {
                maSmaC[i + j * seasonalFrequency] += maC[i] * smaC[j];
            }
        }
        System.arraycopy(maSmaC, 1, maSmaCoeffs, 0, maSmaCoeffs.length);
        return maSmaCoeffs;
    }

I hope I misunderstood your code.

Please forgive my poor English and code.
Thank you again for your open source. It's excellent.

Best wishes!

ADF CHECK

I think there should be an ADF check of the time series before starting the ARIMA process.
Does this library support this?

How to calculate AIC

Hi, @signaflo!
Sorry to bother you.
While trying to use your code, I ran into some problems and I sincerely ask for your help. I don't know how you calculate the AIC. Are there any docs to help me better understand your code?
I would appreciate your reply if you can spare some time to help me.
Good day!
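
For reference, the textbook definition is AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The sketch below uses a Gaussian log-likelihood computed from residuals; this is an assumption made for illustration, and it is not confirmed here how the library obtains its likelihood or which parameters it counts in k. The class name is hypothetical.

```java
// Hypothetical helper illustrating the textbook AIC, not the library's implementation.
final class InformationCriteria {

    // Gaussian log-likelihood with the variance set to its maximum-likelihood estimate SSE / n.
    static double gaussianLogLikelihood(double[] residuals) {
        int n = residuals.length;
        double sumOfSquares = 0.0;
        for (double r : residuals) {
            sumOfSquares += r * r;
        }
        double variance = sumOfSquares / n;
        return -0.5 * n * (Math.log(2.0 * Math.PI) + Math.log(variance) + 1.0);
    }

    // AIC = 2k - 2 ln(L); smaller values indicate a better trade-off between fit and complexity.
    static double aic(double logLikelihood, int numParameters) {
        return 2.0 * numParameters - 2.0 * logLikelihood;
    }
}
```

AIC values are only comparable between models fitted to the same data with the same differencing.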

Loop in BFGS.java#127 can fall into infinite loop.

See discussion in PR #3

Code

while (!(Double.isFinite(functionValue) &&
                           functionValue < priorFunctionValue + C1 * stepSize * slopeAt0) && !stop) {

can fall into an infinite loop.
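
One common way to guarantee termination is to cap the number of step reductions. The following is a minimal standalone sketch of a backtracking line search with the same sufficient-decrease test; the cap of 25 mirrors the "Maximum step reductions, 25" warning reported in another issue, while the halving factor, the value of C1, and all names are assumptions rather than the library's code.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical bounded backtracking line search, not the code in BFGS.java.
final class BacktrackingLineSearch {
    static final double C1 = 1e-4;          // sufficient-decrease constant (assumed value)
    static final int MAX_REDUCTIONS = 25;   // hard cap so the loop always terminates

    // phi(a) is the objective along the search direction; slopeAt0 is phi'(0), expected negative.
    // Returns an accepted step size, or NaN if no trial step satisfied the condition.
    static double search(DoubleUnaryOperator phi, double slopeAt0, double initialStep) {
        double valueAt0 = phi.applyAsDouble(0.0);
        double step = initialStep;
        for (int trial = 0; trial < MAX_REDUCTIONS; trial++) {
            double value = phi.applyAsDouble(step);
            if (Double.isFinite(value) && value < valueAt0 + C1 * step * slopeAt0) {
                return step;
            }
            step *= 0.5; // shrink the step and try again
        }
        return Double.NaN;
    }
}
```

Returning NaN forces the caller to handle the failure explicitly, for example by stopping the optimizer with the last finite parameter values.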

Many errors upon initial build

I don't think this is supposed to happen; when I try to build java-timeseries-master after forking and downloading it, I get a ton of build errors before I've actually touched anything.

Hello Signaflo, for one weekly time series it takes 18-20 seconds to compute the results. Is there any way to reduce the computation time?

Convert non-invertible MA to invertible MA during parameter estimation

Any non-invertible MA model can be converted to an invertible one, except when the roots lie exactly on the unit circle. The algorithm for doing so first needs to be discovered and explained, and then implemented.

No issue

Define the concept of an ArimaProcess in the context of a streaming flow of data

In the Streaming branch, we're using a StreamingSeries that implements the Flow.Processor interface. The idea is to be able to have parallel observation of this series by multiple different time series models that can each be updated upon the receipt of a new series observation.

I feel that the easiest way down this path is to not focus on model estimation yet, but to start with an ARIMA process with known coefficients to see in what ways this process changes as new data comes in.

Testing Problem:the prediction results might be incorrect

@signaflo Hi
Sorry to bother you.
I ran a Main.java following your wiki, and the program runs. However, after checking the results I found something strange: the predictions after 2013-04-01 are all far too small compared to those before. I want to make sure there is no problem in my Main.java. Below I paste my Main.java and the results so that you can check.

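Here is the Main.java:

import com.github.signaflo.timeseries.TestData;
import com.github.signaflo.timeseries.TimeSeries;
import com.github.signaflo.timeseries.forecast.Forecast;
import com.github.signaflo.timeseries.model.arima.Arima;
import com.github.signaflo.timeseries.model.arima.ArimaOrder;

public class Main2 {

    public static void main(String[] args) {
        TimeSeries timeSeries = TestData.debitcards;

        ArimaOrder modelOrder = ArimaOrder.order(2, 4, 6, 0, 1, 1);

        Arima model = Arima.model(timeSeries, modelOrder);

        System.out.println(model.aic()); // Get and display the model AIC

        Forecast forecast = model.forecast(12);

        System.out.println(forecast);
    }
}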

Here is the result:

| Date | Forecast | Lower 95.0% | Upper 95.0% |
| 2013-01-01T00:00 | 144328.7145 | 144222.6794 | 144434.7496 |
| 2013-02-01T00:00 | 890130.1238 | 889636.4510 | 890623.7965 |
| 2013-03-01T00:00 | 3562769.397 | 3561322.414 | 3564216.379 |
| 2013-04-01T00:00 | 1.095551038 | 1.095211425 | 1.095890651 |
| 2013-05-01T00:00 | 2.833212465 | 2.832518799 | 2.833906131 |
| 2013-06-01T00:00 | 6.473400261 | 6.472112782 | 6.474687739 |
| 2013-07-01T00:00 | 1.347106513 | 1.346883862 | 1.347329164 |
| 2013-08-01T00:00 | 2.604723814 | 2.604359301 | 2.605088327 |
| 2013-09-01T00:00 | 4.745488526 | 4.744917359 | 4.746059693 |
| 2013-10-01T00:00 | 8.230283109 | 8.229419808 | 8.231146410 |
| 2013-11-01T00:00 | 1.369331700 | 1.369205099 | 1.369458301 |
| 2013-12-01T00:00 | 2.198720933 | 2.198540000 | 2.198901866 |

computation time and memory issue

Hey Jacob,

When I'm forecasting a weekly time series, it takes a long time to compute with the ML strategy using the BFGS optimizer. I read that L-BFGS can solve this memory and computation-time problem. Is that possible? If so, will the output be the same or not?

If I want the predicted values to all be greater than 0, what should I do?

day time predict

double[] series = new double[]{1, 46, 8, 3, 4, 6, 9, 2, 16, 3};
TimeSeries timeSeries = TimeSeries.from(TimePeriod.oneDay(), series);

The product of lag and times must be less than or equal to the length of the series, but 1 * 365 = 365 is greater than 9

Decide what Type of item a StreamingSeries should emit to subscribers

Should the StreamingSeries emit a Double, an Observation, or an entire static TimeSeries?

Allow simulation of possibly infinite time series

Hi,
I use your library in a streaming environment to generate an ARIMA based time series. The simulated time series is supposed to be very long (possibly infinite).

Unfortunately, the whole series is simulated in advance and requires an array of size n. It would be nice to be able to get an "iterator"-based time series that retrieves only the latest Y_t, calculated on request. Space complexity should then reduce to max(p,q), right?

Is that something you would consider useful and possibly implement?
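
The idea can be sketched with a standard `java.util.Iterator` that keeps only the last p values and last q errors. This is an illustration under assumptions: a zero-mean ARMA(p, q) process without differencing, Gaussian errors, and MA terms entering with a plus sign. The class is hypothetical and not part of java-timeseries.

```java
import java.util.Iterator;
import java.util.Random;

// Hypothetical on-demand ARMA simulator using O(p + q) memory.
final class ArmaSimulator implements Iterator<Double> {
    private final double[] ar;
    private final double[] ma;
    private final double[] pastValues; // pastValues[0] is the most recent value
    private final double[] pastErrors; // pastErrors[0] is the most recent error
    private final double sigma;
    private final Random random;

    ArmaSimulator(double[] ar, double[] ma, double sigma, long seed) {
        this.ar = ar.clone();
        this.ma = ma.clone();
        this.pastValues = new double[ar.length]; // start-up values are taken as zero
        this.pastErrors = new double[ma.length];
        this.sigma = sigma;
        this.random = new Random(seed);
    }

    @Override
    public boolean hasNext() {
        return true; // the simulated series never ends
    }

    @Override
    public Double next() {
        double error = sigma * random.nextGaussian();
        double value = error;
        for (int i = 0; i < ar.length; i++) {
            value += ar[i] * pastValues[i];
        }
        for (int j = 0; j < ma.length; j++) {
            value += ma[j] * pastErrors[j];
        }
        push(pastValues, value);
        push(pastErrors, error);
        return value;
    }

    // Shifts the buffer by one position and stores the newest entry at index 0.
    private static void push(double[] buffer, double newest) {
        if (buffer.length == 0) {
            return;
        }
        System.arraycopy(buffer, 0, buffer, 1, buffer.length - 1);
        buffer[0] = newest;
    }
}
```

Because the start-up values are zero, the first few draws are affected by initialization; discarding an initial burn-in segment is a common remedy.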

Add Box-Cox transformation parameter to ARIMA models and forecasts

The ARIMA models should be able to automatically apply a supplied one parameter Box-Cox transformation when fitting the model.

When the model is forecast, the Box-Cox transformation should be reversed in order to get forecasts on the scale of the original data.

For basic use cases and explanation, see here: https://www.otexts.org/fpp/2/4
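
A minimal sketch of the one-parameter transformation and its inverse is shown below. It is standalone Java written for illustration, with hypothetical names, and it assumes strictly positive data.

```java
// Hypothetical helper for the one-parameter Box-Cox transformation.
final class BoxCox {

    // w = log(y) when lambda is 0, otherwise (y^lambda - 1) / lambda. Requires y > 0.
    static double transform(double y, double lambda) {
        if (y <= 0.0) {
            throw new IllegalArgumentException("Box-Cox requires strictly positive data.");
        }
        return lambda == 0.0 ? Math.log(y) : (Math.pow(y, lambda) - 1.0) / lambda;
    }

    // Inverse transformation, mapping values back to the original scale.
    static double inverse(double w, double lambda) {
        return lambda == 0.0 ? Math.exp(w) : Math.pow(lambda * w + 1.0, 1.0 / lambda);
    }
}
```

With lambda = 0 the back-transformed values are always positive, which is one way to address the positive-forecast requests raised in other issues: fit the model to the transformed series and apply `inverse` to the point forecasts and interval bounds.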

how to get the best p d q

Hi, @signaflo!
Sorry to bother you.
I want to ask whether there is any way to get the best p, d, q, or a list of ACF and PACF values.
I would appreciate your reply if you can spare some time to help me.
Good day!
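
Order selection often starts from the sample autocorrelations. The sketch below computes them in plain Java using the usual estimator (lagged cross-products divided by the total sum of squares); it is a hypothetical helper and says nothing about what the library itself exposes.

```java
// Hypothetical helper computing the sample autocorrelation function.
final class SampleAcf {

    // Returns r[0..maxLag], where r[k] is the sample autocorrelation at lag k and r[0] is 1.
    // A constant series has zero variance and yields NaN values.
    static double[] acf(double[] x, int maxLag) {
        if (maxLag < 0 || maxLag >= x.length) {
            throw new IllegalArgumentException("maxLag must be in [0, length - 1].");
        }
        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= x.length;

        double denominator = 0.0;
        for (double v : x) {
            denominator += (v - mean) * (v - mean);
        }

        double[] r = new double[maxLag + 1];
        for (int k = 0; k <= maxLag; k++) {
            double numerator = 0.0;
            for (int t = k; t < x.length; t++) {
                numerator += (x[t] - mean) * (x[t - k] - mean);
            }
            r[k] = numerator / denominator;
        }
        return r;
    }
}
```

A slowly decaying ACF usually suggests differencing (d > 0), while a sharp cut-off after lag q points towards an MA(q) component.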

RMSE calculation

xt1.txt
Hello Signaflo,
I have attached one year of data. After prediction, when I calculate the RMSE, I get values above 10. Can you please guide me?
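
For reference, RMSE is the square root of the mean squared difference between actual and forecast values. A minimal standalone sketch, with hypothetical names:

```java
// Hypothetical helper computing the root mean squared error.
final class ForecastError {

    static double rmse(double[] actual, double[] forecast) {
        if (actual.length != forecast.length || actual.length == 0) {
            throw new IllegalArgumentException("Arrays must be non-empty and of equal length.");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double error = actual[i] - forecast[i];
            sumOfSquares += error * error;
        }
        return Math.sqrt(sumOfSquares / actual.length);
    }
}
```

RMSE is expressed in the units of the data, so whether a value above 10 is large depends on the scale of the series.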

If I want all predicted values to be greater than 0, how should I process the data?

Why are getters in ArimaCoefficients package-private?

It would be nice to be able to inspect the ARIMA coefficients after fitting. Unfortunately, the getter methods in the above mentioned class are package-private. Is there a reason for that?

NPE

Got NPE on simple code:

public static void main(String[] args) {
        ArimaOrder arimaOrder = ArimaOrder.order(0, 1, 1, Arima.Constant.INCLUDE);
        Arima model = Arima.model(TimeSeries.from(new double[]{
            100.1,
            200.4,
            300.2,
            400.5,
            500.1,
            600.7,
            700.7,
            800.7
        }), arimaOrder);
        Forecast forecast = model.forecast(5);
        System.out.println(forecast);
    }

After searching:
https://github.com/signaflo/java-timeseries/blob/master/timeseries/src/main/java/com/github/signaflo/data/regression/MultipleLinearRegressionModel.java#L254

You don't check the returned boolean value, which can be false...

Add Coefficient and Parameter concepts

The current way of using coefficients is ugly.

A coefficient is either a known or estimated property of a process or model. It should have an uncertainty score associated with it. If the coefficient is pre-determined, the uncertainty score should be zero. Otherwise, the uncertainty score should be greater than zero.

A parameter is a numeric property of a process or model that is unknown and may vary. Once it becomes known or no longer varies, it becomes a coefficient.

Hannan-Rissanen algorithm

This repo: https://github.com/Workday/timeseries-forecast says it is an implementation of the Hannan-Rissanen algorithm for additive ARIMA models.

I was wondering what algorithm this repo uses?

How do I run a seasonal ARIMA model?

I know my (p,d,q)(P,D,Q)_M (seven) coefficients, where should I put them? This library has only 6 coefficients as far as I could see.

This library has a nice sample code in README: https://github.com/Workday/timeseries-forecast

The TimeUnit enum may be pointless

The TimeUnit enum is basically a wrapper around Java's native ChronoUnit enum, but it adds the concept of a Quarter and has two methods "frequencyPer(other time unit)" and "totalDuration()".

I'm beginning to think all of this should be moved to the TimePeriod class. The TimePeriod class should hold a reference to a TemporalUnit instead of a TimeUnit.

The TimePeriod class also has a frequencyPer(other time period) and totalSeconds() [Note that the totalDuration method in TimeUnit returns the duration in seconds, so it's the same thing]. So at this point it appears the TimeUnit enum is basically just adding an unnecessary layer of redirection.

I'm going to remove it unless I get a compelling argument to keep it.

Can't access ArimaCoefficients outside of the package

@signaflo I am trying to build ARIMA models with separate training and test phases. I cannot access the fields of the ArimaCoefficients class from outside the java-timeseries package in order to store them and use them later. Is it possible for you to change that?

Thanks in advance

Would it be possible to cite this repository?

I used the latest release in my work to produce results that I plan to publish. Would it be possible to assign a DOI to this repository?
Thanks

Here is how to use this library - Quickstart Tutorial

Download the latest version of Eclipse for Java.
Download the zip file of the project, unzip it, and import it into Eclipse as a Gradle project. It will download many jar files, and after the process finishes there will be approximately 25 errors. To fix those errors, follow the next steps.
Download the library lombok.jar from https://projectlombok.org/download and place it in the Eclipse folder.
Open a command prompt, navigate to the Eclipse directory, and run the following command:
Drive:\eclipse>java -jar lombok-x.xx.jar (use the exact name of the Lombok jar file)
Click the Install/Update button, then exit the installer.
Exit and restart Eclipse.
Select the project folder and click Clean. This will rebuild the project and the errors will be gone.
To use the library, create a new Java project in Eclipse, go to the Java build path of the project, and select the "Projects" tab. Click Add and select the three existing projects: the project of this library, the project named "math", and the project named "time-series".
To test the new project, create a new Java class with a main function and use the following code in it:

package testProj;

import com.github.signaflo.timeseries.TestData;
import com.github.signaflo.timeseries.TimeSeries;
import com.github.signaflo.timeseries.forecast.Forecast;
import com.github.signaflo.timeseries.model.Model;
import com.github.signaflo.timeseries.model.arima.Arima;
import com.github.signaflo.timeseries.model.arima.ArimaOrder;

public class MainFile {

    public static void main(String[] args) {
        TimeSeries timeSeries = TestData.debitcards;
        ArimaOrder modelOrder = ArimaOrder.order(0, 1, 1, 0, 1, 1); // Note that intercept fitting will automatically be turned off
        Arima model = Arima.model(timeSeries, modelOrder);
        Forecast forecast = model.forecast(1); // To specify the alpha significance level, add it as a second argument.

        System.out.println(forecast);

        System.out.println(forecast.pointEstimates().mean());
    }
}

Something is wrong

When I run java-timeseries with the following data, something is wrong.

double[] sales = new double[] {3.0, 3.0, 7.0, 2.0, 2.0, 1.0, 0.0, 3.0, 4.0, 3.0, 2.0, 3.0, 6.0, 1.0,
                               0.0, 3.0, 4.0, 2.0, 2.0, 0.0, 1.0};
long season = 8L;

TimePeriod day = TimePeriod.oneDay();
TimeSeries series = TimeSeries.from(day, sales);
TimePeriod timePeriod = new TimePeriod(TimeUnit.DAY, season);
ArimaOrder order = ArimaOrder.order(3, 1, 2, 1, 1, 2);
Arima model = Arima.model(series, order, timePeriod);

Forecast forecast = model.forecast(7);
TimeSeries forecastValues = forecast.pointEstimates();
double[] forecastValuesArray = forecastValues.asArray();

// Loop variable renamed so that it does not clash with the TimeSeries variable above.
for (double value : forecastValuesArray) {
    System.out.println(value);
}

Then the printed results are all NaN.

Thanks!
