
Comments (4)

MaxBenChrist commented on August 24, 2024

Hi Thomas,

Can you add a complete code snippet for this example?

Best

Max

On 11.11.2016 at 21:42, Tomasz Wrona [email protected] wrote:

Hi,

I tried to run tsfresh on my sample data (2 time series). After calling extract_features I received the following matrix:

1003 feature_1 feature_2 ...
1004 feature_2 feature_3 ...

Then I call select_features like this:

ys = pd.Series([1, 2], index = [1003, 1004], name = 'target')
select_features(features, ys)
But all I receive is an empty DataFrame. What am I doing wrong?



from tsfresh.

iamhatesz commented on August 24, 2024

Sure.
Let's assume that sampleRecord is a DataFrame consisting of data for two time series (10000 rows each) as follows:

          t  signal  signal_filtered  personRecordId
0         0     -25              -31            1003
1         1     -29              -24            1003
2         2     -21              -17            1003
3         3     -30              -12            1003
...
19997  9997     111               20            1004
19998  9998     134               21            1004
19999  9999     137               23            1004
[20000 rows x 4 columns]

I want to use signal_filtered as the value column. personRecordId is my time series identifier and t is the timestamp.

import tsfresh as ts
import pandas as pd

sampleFeatures = ts.extract_features(
    sampleRecord, column_id="personRecordId", column_sort="t",
    column_kind=None, column_value="signal_filtered")
target = pd.Series([1, 2], index=[1003, 1004], name="target")
sampleSelectedFeatures = ts.select_features(sampleFeatures, target)

It returns:

Empty DataFrame
Columns: []
Index: [1003, 1004]


MaxBenChrist commented on August 24, 2024

The problem is that you do not have enough samples: you have only 2. The filtering method is deliberately very restrictive so that it controls the false discovery rate (FDR, the expected proportion of irrelevant features among those selected) for any distribution and dependency structure.

If you do not have enough statistics (that is, enough samples), the feature filtering will treat the features as unimportant, because the observed correlations could have occurred purely by chance. For really small sample sizes, almost no features will be selected.
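To make the sample-size argument concrete, here is a minimal sketch using only the Python standard library. It is not tsfresh's implementation. It assumes an exact permutation test as a stand-in for the per-feature significance tests (tsfresh's actual tests differ) and a hand-written Benjamini-Yekutieli step-up procedure as the FDR control. The function names, the count of 200 features, and the FDR level of 0.05 are hypothetical values chosen for illustration.

```python
from math import factorial


def smallest_permutation_p_value(n_samples):
    """Best-case (smallest) p-value of an exact one-sided permutation test.

    With n_samples distinct target values there are only n_samples!
    orderings, so no p-value can fall below 1 / n_samples!.
    """
    return 1.0 / factorial(n_samples)


def benjamini_yekutieli_select(p_values, fdr_level=0.05):
    """Return the indices rejected by the Benjamini-Yekutieli procedure."""
    m = len(p_values)
    # Correction factor that keeps the bound valid under arbitrary dependence.
    harmonic = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank that meets its threshold.
        if p_values[idx] <= rank * fdr_level / (m * harmonic):
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


n_features = 200  # hypothetical number of extracted features

# Best case for every feature: each one attains the smallest possible p-value.
two_samples = [smallest_permutation_p_value(2)] * n_features
ten_samples = [smallest_permutation_p_value(10)] * n_features

print(len(benjamini_yekutieli_select(two_samples)))  # prints 0
print(len(benjamini_yekutieli_select(ten_samples)))  # prints 200
```

With 2 samples the smallest attainable p-value in this sketch is 0.5, far above any threshold the procedure allows, so the selection is empty no matter how informative a feature really is. With 10 samples the floor drops to 1/10!, and features can pass.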


MaxBenChrist commented on August 24, 2024

Just try to add more samples! :)

