
Learning Time Series Shapelets

Shapelets are discriminative sub-sequences of time series that best predict the target variable. For this reason, shapelet discovery has recently attracted considerable interest within the time-series research community. Currently, shapelets are found by evaluating the prediction qualities of numerous candidates extracted from the series segments. In contrast to the state-of-the-art, this paper proposes a novel perspective in terms of learning shapelets. A new mathematical formalization of the task via a classification objective function is proposed, and a tailored stochastic gradient learning algorithm is applied. The proposed method enables learning near-to-optimal shapelets directly, without the need to try out numerous candidates. Furthermore, our method can learn true top-K shapelets by capturing their interaction. Extensive experimentation demonstrates statistically significant improvement in terms of wins and ranks against 13 baselines over 28 time-series datasets.

A PyTorch implementation of learning shapelets from the paper

Josif Grabocka, Nicolas Schilling, Martin Wistuba, and Lars Schmidt-Thieme. 2014. Learning time-series shapelets. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '14). Association for Computing Machinery, New York, NY, USA, 392–401.

This implementation deviates from the paper in two ways: it offers multiple distance measures (euclidean distance, cross-correlation, and cosine similarity), and it uses a hard-minimum function instead of the soft-minimum of the paper.
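For illustration, here is a minimal sketch (not the repository's code) of the hard-minimum euclidean distance between a univariate series and a single shapelet:

import torch

def hard_min_euclidean_dist(x, s):
    # x: univariate time series of shape (len_ts,)
    # s: a single shapelet of shape (len_shapelet,)
    windows = x.unfold(0, s.shape[0], 1)      # all sliding windows of shapelet length
    dists = ((windows - s) ** 2).mean(dim=1)  # mean squared distance per window
    return dists.min()                        # hard minimum over all windows

The soft-minimum of the paper replaces the final min with a differentiable approximation; the hard minimum used here still propagates gradients through the best-matching window.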

See the paper for a more detailed description.

Shapelet Regularizer

When training with the cross-entropy loss, the learned shapelets might not resemble actual patterns of the dataset. We therefore experimented with an adapted loss function that introduces two regularizers to steer the learned shapelets towards patterns of the data. With the right regularization strengths, we can consistently learn shapelets that closely approximate patterns of the dataset without losing significant accuracy.
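Schematically, the regularized objective adds the two regularizers, weighted by their strengths, to the classification loss. The names below are illustrative; the exact definitions of L_dk and L_ss are given in the notebook linked below:

# Schematic sketch of the regularized objective (illustrative names):
# l1 and l2 weight the regularizers L_dk and L_ss; k is the top-k
# parameter of L_dk (see the 'Parameters' section).
loss = loss_func(y_hat, y) + l1 * loss_dk + l2 * loss_ss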

We performed experiments on 85 datasets of the UCR archive. See this notebook for a description of the loss and a presentation of the results.

This demo further demonstrates the regularized loss on one of the UCR datasets.

A very similar idea was introduced in this paper.

How to use

import torch.nn as nn
import torch.optim as optim
# LearningShapelets is provided by this repository; adjust the import below
# to the module path of your checkout.
from learning_shapelets import LearningShapelets

loss_func = nn.CrossEntropyLoss()
dist_measure = 'euclidean'  # or 'cross-correlation' / 'cosine'
# Learning Shapelets as in the paper 'Learning Time Series Shapelets', Grabocka et al. (2014)
learning_shapelets = LearningShapelets(shapelets_size_and_len=shapelets_size_and_len,
                                       in_channels=n_channels,
                                       num_classes=num_classes,
                                       loss_func=loss_func,
                                       to_cuda=True,
                                       verbose=1,
                                       dist_measure=dist_measure)
# OR Learning Shapelets with regularized loss
learning_shapelets = LearningShapelets(shapelets_size_and_len=shapelets_size_and_len,
                                       in_channels=n_channels,
                                       num_classes=num_classes,
                                       loss_func=loss_func,
                                       to_cuda=True,
                                       verbose=1,
                                       dist_measure=dist_measure,
                                       l1=l1,
                                       l2=l2,
                                       k=k)

# (Optional) Initialize the shapelet weights yourself, e.g. with k-Means
# centroids as in the original paper; otherwise the shapelets are initialized
# randomly. Note: this implementation provides no built-in initialization
# strategy other than random initialization.
learning_shapelets.set_shapelet_weights(weights)

# Initialize an optimizer.
# Set the shapelet weights before creating the optimizer; otherwise the
# optimizer will reference outdated model parameters.
optimizer = optim.Adam(learning_shapelets.model.parameters(),
                       lr=lr,
                       weight_decay=wd)
learning_shapelets.set_optimizer(optimizer)

# Train model.
losses = learning_shapelets.fit(X_train,
                                y_train,
                                epochs=2000,
                                batch_size=256,
                                shuffle=False,
                                drop_last=False)

# Extract the learned shapelets.
shapelets = learning_shapelets.get_shapelets()

# Shapelet transform a dataset.
shapelet_transform = learning_shapelets.transform(X_test)

# Make predictions.
predictions = learning_shapelets.predict(X_test)
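
For a quick sanity check, the predictions can be scored with scikit-learn (assuming predict returns integer class labels and y_test holds the corresponding ground truth):

from sklearn.metrics import accuracy_score

# Hypothetical evaluation: compare predicted labels against the ground truth.
print(f"Test accuracy: {accuracy_score(y_test, predictions):.3f}")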

See the demo for a detailed example.

Parameters

--shapelets_size_and_len    dict(int, int)        The keys are the lengths of the shapelets
                                                  and the values the number of shapelets of
                                                  a given length, e.g. {40: 4, 80: 4} learns
                                                  4 shapelets of length 40 and 4 shapelets of
                                                  length 80.
--in_channels               int                   The number of channels of the time series.      Default: 1
--num_classes               int                   The number of output classes.                   Default: 2
--loss_func                 torch.nn              The loss function.
--dist_measure              'euclidean',          The distance measure used to compute the        Default: 'euclidean'
                            'cross-correlation',  distances between the shapelets and
                            or 'cosine'           the series.
--to_cuda                   bool                  Performs computations on the GPU if true.       Default: True
                                                  Requires PyTorch with GPU support.
--verbose                   int                   Monitors the training loss if set to 1.         Default: 0
--l1                        float                 Strength of the regularizer L_dk.               Default: 0.0
--l2                        float                 Strength of the regularizer L_ss.               Default: 0.0
--k                         int                   Top-k parameter of the regularizer L_dk.        Default: 0
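
For example, a common heuristic (an illustration, not prescribed by this repository) is to derive the shapelet lengths from fractions of the series length:

# Hypothetical setup: 4 shapelets each at 15% and 30% of the series length,
# assuming X_train has shape (n_samples, n_channels, len_ts).
len_ts = X_train.shape[2]
shapelets_size_and_len = {int(0.15 * len_ts): 4, int(0.3 * len_ts): 4}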

Methods

  • fit(X, Y, epochs=1, batch_size=256, shuffle=False, drop_last=False): Learn time-series shapelets. If shuffle is true, the batches are shuffled after each epoch. If drop_last is true, the last batch is ignored if it is smaller than the batch size.
  • transform(X, Y): Perform the shapelet transform.
  • fit_transform(X, Y, epochs=1, batch_size=256, shuffle=False, drop_last=False): Perform fit followed by transform.
  • set_optimizer(optimizer): Set an optimizer for training.
  • set_shapelet_weights(weights): Initialize the shapelets.
  • set_shapelet_weights_of_block(i, weights_block): Initialize the weights of a shapelet block; blocks are sorted in ascending order of shapelet length.
  • update(x, y): Perform a single gradient update step with respect to batch (x, y).
  • update_regularized(x, y): Perform a single gradient update step with respect to batch (x, y) using the regularized loss function.
  • predict(X): Classify X.
  • get_shapelets(): Get the learned shapelets (see the sketch below).
  • get_weights_linear_layer(): Get the weights and biases of the logistic regression layer.
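
As a small usage sketch, the learned shapelets can be inspected visually; the shape of the returned array is an assumption here, so adjust the indexing to the actual return value:

import matplotlib.pyplot as plt

shapelets = learning_shapelets.get_shapelets()
# Assumed return shape: (num_shapelets, in_channels, len_shapelet); shorter
# shapelets may be padded, so strip any padding before plotting.
plt.plot(shapelets[0, 0])
plt.title("First learned shapelet")
plt.show()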

License

Released under the MIT License. The code was developed by Benedikt Bäumle.

Credits

The code was developed with the aid of several open-source libraries, including PyTorch.
