An open source library for Fuzzy Time Series in Python

Home Page: http://pyfts.github.io/pyFTS/

License: GNU General Public License v3.0

Python 99.80% Makefile 0.09% Batchfile 0.11%

fuzzy-sets fts time-series forecasting forecasting-models fuzzy-time-series probabilistic-forecasting data-science interval time-series-analysis

pyfts's Introduction

pyFTS - Fuzzy Time Series for Python

What is pyFTS Library?

This package is intended for students, researchers, data scientists, or anyone who wants to explore Fuzzy Time Series methods. These methods provide simple, easy-to-use, computationally cheap and human-readable models, suitable for everyone from statistical laypeople to experts.

This project is under continuous improvement and contributors are welcome.

How to reference pyFTS?

Silva, P. C. L. et al. pyFTS: Fuzzy Time Series for Python. Belo Horizonte. 2018. DOI: 10.5281/zenodo.597359. Url: http://doi.org/10.5281/zenodo.597359

How to install pyFTS?

pyFTS was developed and tested with Python 3.6. To install pyFTS using pip:

pip install -U pyFTS

Or pull directly from the GitHub repo:

pip install -U git+https://github.com/PYFTS/pyFTS

What are Fuzzy Time Series (FTS)?

Fuzzy Time Series (FTS) are non-parametric methods for time series forecasting based on Fuzzy Theory. The original method was proposed by [1] and later improved by many researchers. The general approach of the FTS methods, based on [2], is listed below:

Data preprocessing: Apply the data transformation functions in pyFTS.common.Transformations, such as differencing, Box-Cox, scaling and normalization.
Universe of Discourse Partitioning: This is the most important step. Here, the range of values of the numerical time series Y(t) is split into overlapping intervals, and a Fuzzy Set is created for each interval. This step is performed by the pyFTS.partitioners module and its classes (for instance GridPartitioner, EntropyPartitioner, etc.). The main parameters are:

the number of intervals
the fuzzy membership function (from pyFTS.common.Membership)
the partitioning scheme (GridPartitioner, EntropyPartitioner [3], FCMPartitioner, CMeansPartitioner, HuarngPartitioner [4])

Check out the Jupyter notebook notebooks/Partitioners.ipynb for sample code.

Data Fuzzification: Each data point of the numerical time series Y(t) is translated to a fuzzy representation (usually one or more fuzzy sets), producing a fuzzy time series F(t).
Generation of Fuzzy Rules: In this step the temporal transition rules are created. These rules depend on the method and its characteristics:

order: the number of time lags used on forecasting
weights: the weighted models introduce weights on fuzzy rules for smoothing [5],[6],[7]
seasonality: seasonal models [8]
steps ahead: the number of steps ahead to predict. Almost all standard methods are based on one-step-ahead forecasting
forecasting type: Almost all standard methods are point-based, but pyFTS also provides interval and probabilistic forecasting methods.

Forecasting: The forecasting step takes a sample (with a minimum length equal to the model's order) and generates a fuzzy output (one or more fuzzy sets) for the next time step.
Defuzzification: This step transforms the fuzzy forecast into a real number.
Data postprocessing: The inverse of the transformations applied in the data preprocessing step.
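
The steps above can be sketched in plain Python. This is a minimal, from-scratch illustration of a first-order model in the spirit of [2], not pyFTS code: the function names (grid_partition, trimf, fuzzify, learn_rules, forecast), the 10% padding of the universe of discourse, and the toy series are all assumptions made for the example. Preprocessing and postprocessing are omitted.

```python
from collections import defaultdict

def grid_partition(data, npart, pad=0.1):
    """Partitioning: npart overlapping triangular sets (left, center, right)."""
    lo, hi = min(data), max(data)
    margin = pad * (hi - lo)              # assumed padding around the data range
    lo, hi = lo - margin, hi + margin
    step = (hi - lo) / (npart + 1)
    return [(lo + i * step, lo + (i + 1) * step, lo + (i + 2) * step)
            for i in range(npart)]

def trimf(x, left, center, right):
    """Triangular membership degree of x."""
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)

def fuzzify(x, sets):
    """Fuzzification: index of the set with maximum membership."""
    return max(range(len(sets)), key=lambda i: trimf(x, *sets[i]))

def learn_rules(data, sets):
    """Rule generation: group consequents by antecedent (A_i -> A_j, A_k, ...)."""
    labels = [fuzzify(x, sets) for x in data]
    rules = defaultdict(set)
    for lhs, rhs in zip(labels, labels[1:]):
        rules[lhs].add(rhs)
    return dict(rules)

def forecast(x, sets, rules):
    """Forecasting + defuzzification: mean of the consequent set centers."""
    lhs = fuzzify(x, sets)
    rhs = rules.get(lhs, {lhs})           # unseen antecedent: fall back to itself
    return sum(sets[i][1] for i in rhs) / len(rhs)

data = [10, 12, 15, 13, 11, 12, 16, 14, 11, 13]   # made-up toy series
sets = grid_partition(data, npart=5)
rules = learn_rules(data, sets)
next_value = forecast(data[-1], sets, rules)      # one-step-ahead point forecast
```

The model's order here is 1 because each rule looks at a single lag; higher-order models would use tuples of lagged labels as the antecedent.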

Usage examples

There is nothing better than good code examples to get started. Check out the demo Jupyter Notebooks of the methods implemented in pyFTS!

A Google Colab example can also be found here.

MINDS - Machine Intelligence And Data Science Lab

This tool is the result of a collective effort of the MINDS Lab, headed by Prof. Frederico Gadelha Guimaraes. Some of the research on FTS developed with pyFTS:

2020
- ORANG, Omid; Solar Energy Forecasting With Fuzzy Time Series Using High-Order Fuzzy Cognitive Maps. IEEE World Congress On Computational Intelligence 2020 (WCCI).
- ALYOUSIFI, Y; FAYE, Othman M; SOKKALINGAM, I; SILVA, P. Markov Weighted Fuzzy Time-Series Model Based on an Optimum Partition Method for Forecasting Air Pollution. International Journal of Fuzzy Systems, 2020. http://doi.org/10.1007/s40815-020-00841-w
- SILVA, Petrônio CL et al. Forecasting in Non-stationary Environments with Fuzzy Time Series. https://arxiv.org/abs/2004.12554
- SILVA, Petrônio CL et al. Distributed Evolutionary Hyperparameter Optimization for Fuzzy Time Series. IEEE Transactions on Network and Service Management, 2020. http://doi.org/10.1109/TNSM.2020.2980289
- ALYOUSIFI, Yousif et al. Predicting Daily Air Pollution Index Based on Fuzzy Time Series Markov Chain Model. Symmetry, v. 12, n. 2, p. 293, 2020. http://doi.org/10.3390/sym12020293
2019
- SILVA, Petrônio C. L. Scalable Models of Fuzzy Time Series for Probabilistic Forecasting. PhD Thesis. https://doi.org/10.5281/zenodo.3374641
- SADAEI, Hossein J. et al. Short-term load forecasting by using a combined method of convolutional neural networks and fuzzy time series. Energy, v. 175, p. 365-377, 2019. http://doi.org/10.1016/j.energy.2019.03.081
- SILVA, Petrônio CL et al. Probabilistic forecasting with fuzzy time series. IEEE Transactions on Fuzzy Systems, 2019. http://doi.org/10.1109/TFUZZ.2019.2922152
- SILVA, Petrônio C. L.; LUCAS, Patrícia de O.; GUIMARÃES, Frederico Gadelha. A Distributed Algorithm for Scalable Fuzzy Time Series. In: International Conference on Green, Pervasive, and Cloud Computing. Springer, Cham, 2019. p. 42-56. http://doi.org/10.1007/978-3-030-19223-5_4
- SILVA, Petrônio Cândido de Lima et al. A New Granular Approach for Multivariate Forecasting. In: Latin American Workshop on Computational Neuroscience. Springer, Cham, 2019. p. 41-58. http://doi.org/10.1007/978-3-030-36636-0_4
- ALVES, Marcos Antonio et al. Otimização Dinâmica Evolucionária para Despacho de Energia em uma Microrrede usando Veículos Elétricos. In: Anais do 14º Simpósio Brasileiro de Automação Inteligente. Campinas: GALOÁ. 2019. http://doi.org/10.17648/sbai-2019-111524
- LUCAS, Patrícia de O.; SILVA, Petrônio C. L.; GUIMARAES, Frederico G. Otimização Evolutiva de Hiperparâmetros para Modelos de Séries Temporais Nebulosas. In: Anais do 14º Simpósio Brasileiro de Automação Inteligente. Campinas: GALOÁ. 2019. http://doi.org/10.17648/sbai-2019-111141
2018
- ALVES, Marcos Antônio et al. An extension of nonstationary fuzzy sets to heteroskedastic fuzzy time series. In: ESANN. 2018.
2017
- SEVERIANO, Carlos A. et al. Very short-term solar forecasting using fuzzy time series. In: 2017 IEEE international conference on fuzzy systems (FUZZ-IEEE). IEEE, 2017. p. 1-6. http://doi.org/10.1109/FUZZ-IEEE.2017.8015732
- SILVA, Petrônio C. L.; et al. Probabilistic forecasting with seasonal ensemble fuzzy time-series. In: XIII Brazilian Congress on Computational Intelligence, Rio de Janeiro. 2017. http://doi.org/10.21528/CBIC2017-54
- COSTA, Francirley R. B.; SILVA, Petrônio C. L.; GUIMARAES, Frederico G. REGRESSÃO LINEAR APLICADA NA PREDIÇÃO DE SERIES TEMPORAIS FUZZY. Simpósio Brasileiro de Automação Inteligente (SBAI), 2017.
2016
- SILVA, Petrônio C. L.; SADAEI, Hossein Javedani; GUIMARAES, Frederico G. Interval forecasting with fuzzy time series. In: 2016 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE, 2016. p. 1-8. http://doi.org/10.1109/SSCI.2016.7850010

References

[1] Q. Song and B. S. Chissom, “Fuzzy time series and its models,” Fuzzy Sets Syst., vol. 54, no. 3, pp. 269–277, 1993.
[2] S.-M. Chen, “Forecasting enrollments based on fuzzy time series,” Fuzzy Sets Syst., vol. 81, no. 3, pp. 311–319, 1996.
[3] C. H. Cheng, R. J. Chang, and C. A. Yeh, “Entropy-based and trapezoidal fuzzification-based fuzzy time series approach for forecasting IT project cost,” Technol. Forecast. Social Change, vol. 73, no. 5, pp. 524–542, Jun. 2006.
[4] K. H. Huarng, “Effective lengths of intervals to improve forecasting in fuzzy time series,” Fuzzy Sets Syst., vol. 123, no. 3, pp. 387–394, Nov. 2001.
[5] H.-K. Yu, “Weighted fuzzy time series models for TAIEX forecasting,” Phys. A Stat. Mech. its Appl., vol. 349, no. 3, pp. 609–624, 2005.
[6] R. Efendi, Z. Ismail, and M. M. Deris, “Improved weight Fuzzy Time Series as used in the exchange rates forecasting of US Dollar to Ringgit Malaysia,” Int. J. Comput. Intell. Appl., vol. 12, no. 1, p. 1350005, 2013.
[7] H. J. Sadaei, R. Enayatifar, A. H. Abdullah, and A. Gani, “Short-term load forecasting using a hybrid model with a refined exponentially weighted fuzzy time series and an improved harmony search,” Int. J. Electr. Power Energy Syst., vol. 62, pp. 118–129, 2014.
[8] C.-H. Cheng, Y.-S. Chen, and Y.-L. Wu, “Forecasting innovation diffusion of products using trend-weighted fuzzy time-series model,” Expert Syst. Appl., vol. 36, no. 2, pp. 1826–1832, 2009.

pyfts's Issues

Constructor of FTS functions:

Example:

model = chen.ConventionalFTS(partitioner=fs)

Returns the following statement:

TypeError: __init__() missing 1 required positional argument: 'name'

adjust number of lags

In your tutorial, A short tutorial on Fuzzy Time Series, the example uses a lag of t-1. How do we adjust the number of lags using pyFTS?

Regards,

Look ahead bias in performance measure?

I tried running the notebook Chen - ConventionalFTS.ipynb and saw good results, but I am a bit skeptical that they are this good. Following the logic of the bchmk.sliding_window_benchmarks code, as well as the plots you are showing, I suspect you might be comparing the predicted T+1 time series with the current T time series.

To be more specific: given test data of length L, T[0,1,...L], model.predict produces a T+1 value for each value supplied, so the prediction has the same length as the given test data.

However, when you plot or check the performance, you cannot directly compare the test data with the predictions. E.g. your code computes, in Measures.py line 396, rmse(datum, forecast)

The correct measure should probably be rmse(datum[1:], forecast[:-1])

Also, for your notebook plot, if you shift the prediction by -1 steps, you will see a different plot. It will be similar to most time series models, which show prediction = lag(1) + noise, which we hope to overcome.

Let me know if I have misunderstood the code/logic ...

BTW, nice work. I am still trying it out and hope to show that I can use it in my project...
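
The alignment concern can be illustrated without the library. The sketch below assumes, as described in this issue, that forecasts[i] is the prediction for data[i + order]; the helper names and the toy series are made up for the example. It uses a "forecast" with no skill at all (a copy of the data, i.e. persistence) to show how an unshifted comparison can report a perfect score.

```python
import math

def rmse(targets, forecasts):
    """Root mean squared error of two equally long sequences."""
    if len(targets) != len(forecasts):
        raise ValueError("sequences must have the same length")
    squared = [(t - f) ** 2 for t, f in zip(targets, forecasts)]
    return math.sqrt(sum(squared) / len(squared))

def aligned_rmse(data, forecasts, order=1):
    """Compare forecasts[i] with data[i + order] (assumed alignment)."""
    targets = data[order:]
    return rmse(targets, forecasts[:len(targets)])

data = [10.0, 12.0, 15.0, 13.0, 11.0, 12.0]   # made-up toy series
persistence = list(data)   # persistence[i] = data[i], used as the forecast for data[i + 1]

unshifted = rmse(data, persistence)          # forecast for t+1 compared with value at t
shifted = aligned_rmse(data, persistence)    # forecast for t+1 compared with value at t+1
```

If a model's unshifted score is much better than its aligned score, the model is probably behaving close to lag(1) plus noise.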

error on "Silva,_Sadaei,_Guimaraes_ProbabilisticWeightedFTS.ipynb" file

I tried running the file under the section "Partitioning optimization" but I encountered an error.
Also I cannot read the file "pwfts_taiex_partitioning.csv".

PEP 8 compliance

At the moment none of the code is compliant with the PEP 8 Style Guide (https://www.python.org/dev/peps/pep-0008/). A global renaming of classes, variables and methods is needed to make it compliant.

Error

hello
i'm running the Song - ConventionalFTS.ipynb file and i keep facing this error :

2018-08-07 20:14:53 pycos - version 4.7.7 with IOCP I/O notifier
2018-08-07 20:14:53 dispy - dispy client version: 4.9.1
2018-08-07 20:14:53 dispy - Storing fault recovery information in "_dispy_20180807201453"
2018-08-07 20:14:53 dispy - dispy client at 192.168.1.100:51347
2018-08-07 20:14:53 dispy - Started HTTP server at ('0.0.0.0', 8181)

what should i do with it ?

Plot of Forecasted vs Actual misrepresenting the fit by not inserting None at right index?

Hello Petronio, I noticed that Chen's conventional result looks “suspiciously” good, as @wangtieqiao shared in #25. It appears that the plot is comparing the predicted t+1 values with the current time series T.
Interestingly, I encountered a similar issue while working with the library and a pwfts model of order 1. My validation plot (see image below) shows a great fit of my predicted values to the actual values (too great, I'd say), but the RMSE calculated using Measures.get_point_statistics turned out to be unexpectedly high at 2.32 compared to my other fit.

Here is my code:

def Cash_in(train_set, valid_set):
    rows = []
    fig, ax = plt.subplots(nrows=1, ncols=1, figsize=[12, 8])
    y_val = pd.Series(valid_set['Scale_Montant'])
    y_train = pd.Series(train_set['Scale_Montant'])
    ax.plot(y_val.values, label ='Validation',color='black')
    for method in [pwfts.ProbabilisticWeightedFTS]:
        for partitions in [Grid.GridPartitioner]:
            for npart in [4]:
                for order in [1]:
                    part = partitions(data=y_train.values, npart=npart, transformation=diff)
                    model = method(order=order, partitioner=part)
                    model.append_transformation(diff)
                    model.name = model.shortname + str(partitions).replace('>', '').replace('<', '').replace('class', '') +str(npart) + str(order)
                    model.fit(y_train.values)
                    # Validation forecast    
                    forecasted_values_valid = model.predict(y_val.values)

                    # Plot the fitted values of the train set against the actual
                    ax.plot(np.array(forecasted_values_valid), label =  str(model.shortname) + str(partitions) + str(npart) + " partitions" + str(order)+ ' order', color= 'blue')
                    ax.set_title('Validation')            

                    # Performance measure on the validation set 
                    rmse_v, mape_v, u_v = Measures.get_point_statistics(y_val.values, model)
                    rows.append([model.shortname, str(partitions).replace('>', '').replace('<', '').replace('class', ''), npart, order, rmse_v, mape_v, u_v])
                        
                    handles, labels = ax.get_legend_handles_labels()
                    lgd = ax.legend(handles, labels, loc=1, bbox_to_anchor=(1, 1))

                    plt.show()

    result_cash_in = pd.DataFrame(rows, columns=['Model', 'partitions_techniques','#_partitions', 'order','RMSE_Valid', 'MAPE_Valid', 'U_Valid'])
    pd.set_option('max_colwidth', None)
    return result_cash_in, forecasted_values_valid

Weekly_Cash_in_models, forecasts_df_valid= Cash_in(train_set, valid_set)

To investigate further, I manually computed the forecasted value on my validation set using this formula coming from #25:

`mse = np.mean((Forecast_valid[:-1] - y_valid[model.order:])**2) 
rmse = np.sqrt(mse)`

Surprisingly, computing it by hand using my forecast array as is gave me a different RMSE of 1.04. Trying to figure out what was going on, I decided to prepend None to my forecast using:

                    for k in np.arange(order):
                        forecasted_values_valid.insert(0,None)

which effectively shifted the forecast array one position to the right. After doing this, I recalculated the RMSE and got a value of 2.34, much closer to the 2.32 from get_point_statistics.
It turns out that the issue was caused by me not inserting None into the forecast array at the index [model.order:] (in my case, at index = 1). I didn't insert None for an order of 1 because I was following an example from one of your notebooks: [Link to the notebook].

for order in np.arange(1,4):
  part = Grid.GridPartitioner(data=y, npart=10)
  model = hofts.HighOrderFTS(order=order, partitioner=part)
  model.fit(y)
  forecasts = model.predict(y)
  if order > 1:
    for k in np.arange(order):
      forecasts.insert(0,None)

I've noticed some "contradictory" information while going through various notebooks and the pyFTS tutorial ([Link to the tutorial](https://sbic.org.br/lnlm/wp-content/uploads/2021/12/vol19-no2-art3.pdf)). Some sources suggest that we need to insert None from [model.order:] even at order 1.
As I understand it, the parameter "order" represents the minimum number of lags used to predict the next observation. With an order of 1, you can't have a predicted value for the first observation of the array, since the first observation is used to predict the second one.

When examining one of the notebooks (picture below), it seems that not assigning None to the observations at positions [model.order:] causes the plot of fitted values to shift by 2 to the left, rendering the graph invalid. I believe this could be what is happening in the Chen Conventional notebook.

[Screenshot, 2023-07-28]

I would greatly appreciate some clarification on why some notebooks recommend assigning None at the position [model.order:], while others do not. It's a bit confusing, especially when different examples use different manipulations on the same model.

Thank you for your time, and I'm looking forward to resolving this confusion of mine.

help on 'interval' and 'distribution', and residual analysis

I'd like to know how to interpret the model results for 'interval' and 'distribution' forecasts, and for residual analysis.
Regards,

Question about the get point statistics function in measurement

How to calculate rmse and mape

Hello,
I want to calculate RMSE and MAPE (not SMAPE) using Measures. I have 3 methods, each with orders 1, 2 and 3. How can I calculate those errors? I cannot do it directly because the forecasts contain None values, and the number of None values depends on the order. I tried to align the data, but it doesn't work for all methods. For example:
test - [104.9 105.7 116.4 106.5 100.4]
forecasts - [None, 102.19000000000001, 102.19000000000001, 112.00750000000002, 102.19000000000001]

So I tried deleting the first values to get rid of the None values; however, that only works for the HighOrderFTS and WeightedHighOrderFTS methods.

In the case of the ProbabilisticWeightedFTS method, I noticed that the forecasts have one more value:

Probabilistic FTS:

test - [104.9 105.7 116.4 106.5 100.4]
forecasts - [None, 112.01067654320988, 112.73816977752367, 115.4650413689881, 113.23877302970163, 99.0]

How to deal with that ?
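
One library-independent way to handle both cases is to pair each test value with the forecast at the same index, skip the None padding, and ignore any forecast beyond the end of the test data. The sketch below is an illustration under the assumption that the None padding already aligns forecasts[i] with test[i], and that the extra trailing PWFTS value is the forecast for the step after the last test point; the helper names are hypothetical and are not part of pyFTS's Measures module.

```python
import math

def paired(test, forecasts):
    """Pair test[i] with forecasts[i]; zip drops trailing extras, the filter drops None."""
    return [(y, f) for y, f in zip(test, forecasts) if f is not None]

def rmse(pairs):
    """Root mean squared error over (actual, forecast) pairs."""
    return math.sqrt(sum((y - f) ** 2 for y, f in pairs) / len(pairs))

def mape(pairs):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((y - f) / y) for y, f in pairs) / len(pairs)

# Values copied from the issue text above
test = [104.9, 105.7, 116.4, 106.5, 100.4]
hofts_forecasts = [None, 102.19000000000001, 102.19000000000001,
                   112.00750000000002, 102.19000000000001]
pwfts_forecasts = [None, 112.01067654320988, 112.73816977752367,
                   115.4650413689881, 113.23877302970163, 99.0]

hofts_pairs = paired(test, hofts_forecasts)
pwfts_pairs = paired(test, pwfts_forecasts)   # the trailing 99.0 has no target yet
hofts_rmse, hofts_mape = rmse(hofts_pairs), mape(hofts_pairs)
pwfts_rmse, pwfts_mape = rmse(pwfts_pairs), mape(pwfts_pairs)
```

Because every method is scored on the same (actual, forecast) pairs, the metrics stay comparable across orders as long as the models being compared share the same number of leading None values.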

Unable to get data

I'm trying to run the example, but when I run df = Enrollments.get_dataframe()
I get this error: urllib.error.URLError: <urlopen error [Errno 11004] getaddrinfo failed>
Thank you very much.

Defining the membership function

Hello,
I am performing a fuzzy time series analysis and I am trying to use grid partitioning with trapezoidal mf, however, when I ran:
fs = Grid.GridPartitioner(data = ts, npart = 20, mf = mf.trapmf)
I am still getting a triangle fuzzy set, how should I define the mf in the grid partitioning?
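
For reference, the difference between the two shapes can be checked numerically: a trapezoidal set has a plateau of full membership, while a triangular set reaches 1 only at its peak. The sketch below defines both functions from scratch; the signatures are assumptions made for this example and are not the ones in pyFTS.common.Membership. Which keyword the partitioner actually reads for the membership function is not confirmed here, so the GridPartitioner docstring is worth checking.

```python
def trimf(x, a, b, c):
    """Triangular membership: 0 at a, 1 at the peak b, 0 at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapmf(x, a, b, c, d):
    """Trapezoidal membership: 0 at a, plateau of 1 on [b, c], 0 at d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Evaluate both shapes over the same support [0, 10]
points = [2, 4, 5, 6, 8]
triangle = [trimf(x, 0, 5, 10) for x in points]
trapezoid = [trapmf(x, 0, 3, 7, 10) for x in points]
```

Evaluating a fitted set at a few points around its center in the same way shows quickly whether the requested shape was applied.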

Partitioner.fuzzyfy cannot handle parameters mode='vector' and method='maximum'

I invoke method as follows:

cpu.partitioner.fuzzyfy(data_point, mode='sets', method='maximum', alpha_cut=0.0)

Then, the output as follows:

['cpu.busy0', 'cpu.busy0', 'cpu.busy1', 'cpu.busy1', 'cpu.busy1', 'cpu.busy1']

But, I only want their values.

Question on generators parameter

Hi,

I'd like to ask how to use generators parameter for multivariate forecasting. I have read your tutorial: https://towardsdatascience.com/a-short-tutorial-on-fuzzy-time-series-part-ii-with-an-case-study-on-solar-energy-bda362ecca6d and you used date as exogenous data. Specifically my question is if I have two exogenous data for example wind and temperature, what will be my generator? My question may be vague, I'm sorry. Great library, thank you for updating.

[MAJOR ISSUE] Error in the Calculation of Inverse Transformations for model.predict()

Hi Petronio, I'm Rein, Master of AI student.

I wanted to use the PyFTS module to perform a specific subset of my project.

However, from the FTS class, under the predict method, you have the following line at the end of the method used to apply the inverse of the transformation applied to the data during forecasting:

One thing I noticed here is that you fed the test data itself into the params argument of the apply_inverse_transformations() function.

since, with a max_lag of 1 (first order differencing), data[self.max_lag - 1:] is equal to data.

if not self.is_multivariate:
            kw['type'] = type
            ret = self.apply_inverse_transformations(ret, params=[data[self.max_lag - 1:]], **kw)

But then, following the downstream operations, when I looked into the Differential class (/transformations/differential.py), I noticed that the default calculation used to perform the inverse transformation is the following:

if steps_ahead == 1:
            if type == "point":
                inc = [data[t] + param[t] for t in np.arange(0, n)]

where data here is the forecasted data y'(t), param is the test data itself, and inc is supposed to be the output inverse-transformed forecast.

I would like to ask clarification regarding the correctness of this logic. The value range of data is very small, hence, what happens is that the output inverse-transformed forecast is primarily dominated by the value of the test data itself. That is, if in the cited codeblock above, we remove data[t] from [data[t] + param[t] for t in np.arange(0, n)], we would still get a "convincing forecast" because you're practically returning the ground truth values as 'forecasted values'.

This figure comes from the A_short_tutorial_on_Fuzzy_Time_Series _ Part_II_(with_an_case_study_on_Solar Energy) notebook that uses the Chen model. As shown, the forecasted data (in green) is able to capture the test data almost exactly because the test data itself was injected into the output forecast logic.

Here is my attempt of replicating the logic with my own data, using the same model.

Without using the training data for forecasts, and doing a 365 step_ahead forecast using only the starting test data point as an input, I get the following forecast for the same data as above:

Is there something I'm missing? I think the bug above is crucial with respect to the entire logic of the fts model.
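
The behaviour described in this issue can be reproduced with a small library-independent sketch of inverse differencing. It assumes the usual definitions: for one-step-ahead forecasts the inverse adds each predicted difference to the last observed value, while for multi-step forecasts only the starting point is observed and the predicted differences are accumulated. The function names and the toy series are made up; this is not the code in the Differential class.

```python
def difference(series):
    """First-order differencing: d(t) = y(t) - y(t-1)."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

def one_step_inverse(last_observed, predicted_diffs):
    """y_hat(t+1) = y(t) + d_hat(t+1), anchored on an observation at every step."""
    return [y + d for y, d in zip(last_observed, predicted_diffs)]

def multi_step_inverse(start, predicted_diffs):
    """Only the starting value is observed; later steps accumulate predictions."""
    forecasts, level = [], start
    for d in predicted_diffs:
        level += d
        forecasts.append(level)
    return forecasts

observed = [100.0, 103.0, 101.0, 106.0, 104.0]   # made-up toy series
zero_diffs = [0.0] * 4      # a model with no skill: always predicts "no change"

one_step = one_step_inverse(observed[:-1], zero_diffs)    # repeats the lagged series
multi_step = multi_step_inverse(observed[0], zero_diffs)  # flat line from the start
```

Even with zero predicted differences, the one-step result is the observed series lagged by one position, which looks convincing on a plot. That is expected of any one-step-ahead forecast on differenced data, so model skill is better judged against a persistence baseline or on multi-step forecasts.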

Error on plotting

I use matplotlib for plotting, but somehow I get an error whenever I import pyFTS and plot just a simple graph (even though I'm not using pyFTS). Below is the error:
UserWarning: findfont: Font family ['sans-serif'] not found. Falling back to DejaVu Sans (prop.get_family(), self.defaultFamily[fontext]))

I don't know what is happening; I have already updated my fonts. Checking the code, pyFTS also imports matplotlib in the following ways

import matplotlib as plt
import matplotlib.pyplot as plt

and some codes import matplotlib as

import matplotlib.pylab as plt

I'm using
matplotlib 2.2.3
pyFTS 1.2.2
python 3.6.3
on Ubuntu

Regards

will it work for multivariate time series prediction both regression and classification

Great code, thanks.
Could you clarify: will it work for multivariate time series prediction, both regression and classification,
1. where all values are continuous, or
2. where the values are a mixture of continuous and categorical?
For example, 2 dimensions have continuous values and 3 dimensions are categorical:

   color   weight  gender  height  age
1  black   56      m       160     34
2  white   77      f       170     54
3  yellow  87      m       167     43
4  white   55      m       198     72
5  white   88      f       176     32

Code documentation with PEP 257 compliance

Document all methods and classes according with PEP 257 Docstring conventions (https://www.python.org/dev/peps/pep-0257/)

Question on forecasting method

I'm new to fuzzy logic and I'd like to know why the predict method requires the test set of the data.
Another question is does the method predict only one-step ahead?

Regards

Question about the predict in test set.

Hi, recently I have been working on fuzzy time series. This code really helps me a lot. Thanks.

But I have a question about prediction on the test data. In your example code,
forecasts = model1.predict(dataset[train_split:train_split+1]), it turns out you are assuming it is okay to use the true data of the previous day. However, I think we can only use the previous prediction output as the next input to the model, so I wrote the code below:
prev_forecasts = dataset[train_split-1:train_split]

for n in range(test_length):
     
    new_forecast = model1.predict(prev_forecasts[n:n+1])
    
    prev_forecasts = np.append(prev_forecasts,new_forecast)

#forecasts = model1.predict(dataset[train_split:train_split+1])
forecasts = prev_forecasts

Unfortunately, the prediction is almost a straight line.

In your implementation for the test data, I think the Naive Forecast will perform the best(since the model has the most recent true data to make prediction.).

And by the way, which folder does the method "predict" (model.predict ) belong to ?

Looking forward to your reply.!
Thank you!
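
The flattening seen when predictions are fed back as inputs can be reproduced with any one-step model. The sketch below is a generic illustration, not pyFTS code: iterate_forecast and toy_model are hypothetical names, and toy_model is a made-up mean-reverting rule standing in for a fitted model. Repeatedly applying the same one-step map to its own output typically settles toward a fixed point, which shows up as a nearly straight line.

```python
def iterate_forecast(one_step, history, steps, order=1):
    """Multi-step forecast by feeding each prediction back as the newest input."""
    window = list(history[-order:])       # last `order` observed values
    path = []
    for _ in range(steps):
        nxt = one_step(window)
        path.append(nxt)
        window = window[1:] + [nxt]       # slide the window over the prediction
    return path

def toy_model(window):
    """Hypothetical one-step model: pulls the last value halfway toward 10."""
    return 0.5 * window[-1] + 5.0

path = iterate_forecast(toy_model, history=[20.0], steps=6)
```

One-step-ahead evaluation (using the true previous value at every step) and iterated forecasting answer different questions, so the two should be scored separately and each compared with a naive baseline under the same protocol.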

Multi-variate high order FTS model

Hi Dr.

Is there any tutorial about high-order multi-variate Fuzzy time series model?

Thanks.

One step forecasting

Hello Petronio,

I am testing the fuzzy time series model to forecast one period ahead of a time series with order 12. I am starting with a basic model: using a 20-grid partitioner, mf = triangle, chen model.

My code is:
ts = datos['y'].dropna().to_numpy()
fs = Grid.GridPartitioner(data = ts, npart = 20)
model = chen.ConventionalFTS(partitioner = fs)
model.fit(ts, order = 15)
forecast = model.predict(ts)

I'm getting a forecast array which length = len(ts), I have checked other posts, and I realized that the result is the one-step ahead forecast for each period, i.e. forecast[0] is the prediction that compares with ts[1] , forecast[1] is the prediction that compares with ts[2]...

My question here is: how I am getting a result in the first 12 values, since my model's order is 12?

These are the first 12 values that I am getting:

[0.3270031616601807,
0.3270031616601807,
0.3270031616601807,
0.3270031616601807,
0.09897023952868034,
0.3270031616601807,
0.3270031616601807,
0.09897023952868034,
-0.28108463069048684,
-0.2810846306904869,
0.09897023952868034,
0.3270031616601807,
...]

Thanks for the help

FCMPartitioner

Thanks for sharing. I'd like to use FCMPartitioner for multivariate time series, but I don't know how to get started with your code as I'm not a programmer. Could you give me a short example? ^^

how to use mvts for prediction

Hi @petroniocandido, I am trying to use mvts on my own data. It has strong seasonality and trend, plus additional information about holidays. I have several questions:

Is the model suitable for series with seasonality and trend? In my understanding, seasonality will be handled by seasonal.TimeGridPartitioner, but what about trend? Is it possible to use a difference transformation inside the module?
I don't understand how to use the generator in the predict method. Can we just pass the test dataframe containing the date column and the holiday flag?

thank you

How the predict function works?

Case 1:
forecasts = model.predict([1020,3840,1920], steps_ahead=9)
Where [1020,3840,1920] is the last 3 data point of time series in "train_uv"
Case 2:
forecasts = model.predict(train_uv, steps_ahead=9)

Both are giving different forecast i.e 9 future data points.

How the predict function is working in these two cases above?

Non datetime generator for MVTS

Hey, I read part 1 and part 2 of your tutorial on Medium but I'm still struggling to understand how the generator concept would work for a multivariate time series whose variable is not a datetime. Do you have any example code for that?

One Step Forecasting PWFTS

Hey Petronio, when I do step-by-step forecasting, re-fitting the model at each step, I get very different results than when I simply call model.predict(all_data). The step predictor is much smoother (almost a linear slope), while the simple model.predict() method works quite well. I am afraid there might be some data leak, because when I convert the regression results to binary classes like this: yhat = [0 if preds[n]>preds[n+1] else 1 for n in range(len(preds)-1)], the accuracy score is 1.0 on both the test and train sets. Would you have a sanity checklist for dealing with FTS?

The model I am using is pwfts with the standard TAIEX data. Is the model supposed to be used this way? This is my first encounter with FTS outside of an academic book.

def step_predictor(train_data,test_data):
    history = [n for n in train_data]
    history = np.array(history)
    preds = []
    preds.append(model.forecast_ahead(history,steps=1))
    for i in tqdm(range(len(test_data))):
        history = np.append(history,test_data[i])
        model.fit(history)
        preds.append(model.forecast_ahead(history,steps=1))
    return preds

*additional info

So as i was playing around with it i can see that the predictions also depend on the size of the array that it's predicting why could that be ?

after 20 min :)
It might be due to the way the smaller datasets get partitioned

Can we predict the data out of the sample？

Dear author, can the fuzzy time series forecasting method predict future data (that is, divide the data into a training set and a test set, where the test set does not participate in building the model)? Many papers I have seen seem to only fit the training set.

Bug in binary search

Hi. Dr.,

I think there is a bug in FuzzySet.binary search,

elif midpoint <= 1:
    return [0]
elif midpoint >= max_len:
    return [max_len]

The first elif should be midpoint < 1: return [0], because the indices of the fuzzy sets start at 0 and end at max_len = len(Fuzzysets) - 1.

`np.int` was a deprecated alias

I run my code that uses PyFTS as an essential library. However, I got this error:

/usr/local/lib/python3.10/dist-packages/pyFTS/partitioners/partitioner.py:238: UserWarning:
set_ticklabels() should only be used with a fixed number of ticks, i.e. after set_ticks() or using a FixedLocator.

Traceback (most recent call last):
File "/root/python/forecast/pyFTSexample/pyFTSExam.ipynb", line 78, in
model.fit(train)
File "/usr/local/lib/python3.10/dist-packages/pyFTS/common/fts.py", line 384, in fit
self.train(mdata, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/pyFTS/models/yu.py", line 62, in train
tmpdata = self.partitioner.fuzzyfy(ndata, method='maximum', mode='sets')
File "/usr/local/lib/python3.10/dist-packages/pyFTS/partitioners/partitioner.py", line 144, in fuzzyfy
mv = self.fuzzyfy(inst, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/pyFTS/partitioners/partitioner.py", line 157, in fuzzyfy
tmp = self[ix].membership(data)
File "/usr/local/lib/python3.10/dist-packages/pyFTS/partitioners/partitioner.py", line 286, in __getitem__
if isinstance(item, (int, np.int, np.int8, np.int16, np.int32, np.int64)):
File "/usr/local/lib/python3.10/dist-packages/numpy/__init__.py", line 324, in __getattr__
raise AttributeError(former_attrs[attr])
AttributeError: module 'numpy' has no attribute 'int'.
np.int was a deprecated alias for the builtin int. To avoid this error in existing code, use int by itself. Doing this will not modify any behavior and is safe. When replacing np.int, you may wish to use e.g. np.int64 or np.int32 to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'inf'?

Perhaps, the line

if isinstance(item, (int, np.int, np.int8, np.int16, np.int32, np.int64)):

should be:

if isinstance(item, (int, np.int8, np.int16, np.int32, np.int64)):

Does anyone have other solution by chance?
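
Another option, sketched below, avoids listing NumPy types at all by testing against the abstract base class numbers.Integral from the standard library. NumPy registers its integer scalar types with that class, so values such as np.int64(3) should pass the same check. The helper name is_integer_index is hypothetical and this is not the code in partitioner.py.

```python
import numbers

def is_integer_index(item):
    """True for Python ints and any type registered as numbers.Integral."""
    return isinstance(item, numbers.Integral)

checks = {
    "int": is_integer_index(3),        # plain Python integer
    "float": is_integer_index(3.0),    # floats are not integral, even if whole
    "str": is_integer_index("3"),      # strings (e.g. set names) are not indices
}
```

Note that bool is a subclass of int in Python, so True and False pass this check just as they pass the original isinstance test.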

Performance of HOFTS on erratic dataset.

I trained a HOFTS with order 3 on erratic data. The dataset is monthly. I predicted on the training data and it looks like it has overfitted.

Here is the code:

from pyFTS.models import hofts, pwfts

fig, ax = plt.subplots(nrows=1, ncols=1, figsize=[15,8])


ax.plot(train_uv[:100], label='Original')
rows = []
# for method in [hofts.HighOrderFTS, hofts.WeightedHighOrderFTS, pwfts.ProbabilisticWeightedFTS]:
for method in [hofts.HighOrderFTS]:
#     for order in [1, 2,3]:
    for order in [3]:
        model = method(partitioner=part, order=order)

        model.shortname += str(order)

        model.fit(train_uv)

        forecasts = model.predict(train_uv)
#         forecasts = model.predict([0,792,492], steps_ahead=142)
        forecast_fuzzy = forecasts
        for k in np.arange(order):
            forecasts.insert(0,None)

        ax.plot(forecasts[:100], label=model.shortname)

        models.append(model.shortname)

#         Util.persist_obj(model, model.shortname)

#         del(model)

handles, labels = ax.get_legend_handles_labels()
lgd = ax.legend(handles, labels, loc=2, bbox_to_anchor=(1, 1))

I want to predict the next 3 months of data, i.e. 12 data points. How can I use the "predict" function to do this?

This is how I tried.

fig, ax = plt.subplots(nrows=1, ncols=1, figsize=[15,8])

ax.plot(test_uv, label='Original')
forecasts = model.predict([1068,2280,4392], steps_ahead=12)
order = 3
for k in np.arange(order):
    forecasts.insert(0,None)
ax.plot(forecasts, label=model.shortname)
handles, labels = ax.get_legend_handles_labels()
lgd = ax.legend(handles, labels, loc=2, bbox_to_anchor=(1, 1))

Why does it remain constant after predicting 3 points into the future?

[1068,2280,4392] is the last three data points of train dataset.

Originally posted by @pintuiitbhi in #6 (comment)

Unit testing

how to use hyperparams

I am struggling to find guidance on how to use the hyperparam module, such as grid search or evolutionary. Can anyone share?

thank you

Fuzzification with Huarng Partitioner not working (with fix for this)

My Huarng partitioner consists of 130 Fuzzy Sets, and when I try to fit a model it says the item has to be between 0 and 10.
That is, the number of partitions within the partitioner is always set to 10 by default in all partitioners.

I wanted to fit a Chen model by using

partitioner = Huarng.HuarngPartitioner(data=train, mf=mf.trimf)
model = chen.ConventionalFTS(partitioner=partitioner)
model.fit(train)

and received the described error.

The issue can be fixed by adding:
self.partitions=npart
after line 39 in Huarng.py before the for loop.

This fixed this issue for me and it adapts the correct number of partitions.

Kind regards,
Th0m4sR

Error when importing chen

Jupyter show the following error when trying to import module 'chen'

ImportError Traceback (most recent call last)
in ()
----> 1 from pyFTS import chen

C:\Users\guilherme.marchezini\AppData\Local\Continuum\Anaconda3\lib\site-packages\pyFTS\chen.py in ()
1 import numpy as np
2 from pyFTS.common import FuzzySet, FLR
----> 3 import fts
4
5

ImportError: No module named 'fts'

error on sliding window benchmarks

partitioners_models is a list but the default value is None, hence the error

_tasks = (len(partitioners_models) * len(orders) * len(partitions) * len(transformations) * len(steps_ahead))
TypeError: object of type 'NoneType' has no len()

error on analytic_tabular_dataframe()

KeyError: '0'

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
in ()
----> 1 dat = bUtil.analytic_tabular_dataframe(df1)

~/miniconda3/envs/time_series/lib/python3.6/site-packages/pyFTS/benchmarks/Util.py in analytic_tabular_dataframe(dataframe)
180 if not df.empty:
181 for col in data_columns:
--> 182 mod = [m, o, s, p, st, ms, df[col].values[0]]
183 ret.append(mod)

forecasting algorithms

Is there anywhere in the documentation more information on what algorithms does the forecasting with HighOrderFTS use?
I'd like to know more details on such things in order to include them in a prospective publication.

Thank you in advance
George