haidark / waveletdeconv

Neural network layer code to implement wavelet deconvolutions

waveletdeconv's Introduction

WaveletDeconv

Neural network layer code written using Keras to implement Wavelet Deconvolutions from the paper:

Khan, Haidar, and Bulent Yener. "Learning filter widths of spectral decompositions with wavelets." Advances in Neural Information Processing Systems. 2018.

Requires Keras with a TensorFlow backend in addition to standard packages such as numpy, matplotlib, scipy, and h5py.

Run testWD.py to verify model saving, model loading, and proper functionality.

The layer performs deconvolutions of 1D signals using wavelets of different scales/widths. For a full description of the wavelet deconvolution method, see our paper.
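
As a rough illustration of what "wavelets of different scales/widths" means, here is a minimal NumPy sketch that builds Mexican hat (Ricker) kernels at several fixed widths and convolves one 1D signal with each of them. The kernel formula follows the `gen_kernel` helper quoted in the issues further down this page. The function names, the test signal, and the width values are assumptions made for illustration, and the widths are fixed here rather than learned as in the layer.

```python
import numpy as np

def ricker_kernel(width, kernel_length):
    """Mexican hat (Ricker) wavelet sampled at kernel_length points, centered at zero."""
    t = np.arange(kernel_length) - (kernel_length - 1.0) / 2.0
    amplitude = 2.0 / (np.sqrt(3.0 * width) * np.pi ** 0.25)
    return amplitude * (1.0 - t**2 / width**2) * np.exp(-t**2 / (2.0 * width**2))

def wavelet_transform(signal, widths, kernel_length):
    """Convolve one 1D signal with one Ricker kernel per width.

    Returns an array of shape (len(signal), len(widths)).
    """
    rows = [np.convolve(signal, ricker_kernel(w, kernel_length), mode="same")
            for w in widths]
    return np.stack(rows, axis=-1)

signal = np.sin(np.linspace(0.0, 20.0, 256))  # hypothetical test signal
widths = [1.0, 2.0, 4.0, 8.0, 16.0]           # hypothetical fixed widths
out = wavelet_transform(signal, widths, kernel_length=101)
print(out.shape)  # (256, 5)
```

Each column of `out` is the response of the signal at one width, which corresponds to the trailing `nb_widths` axis in the layer's output shape.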

Code Example

    # apply a set of 5 wavelet deconv widths to a sequence of 32 vectors with 10 timesteps
    model = Sequential()
    model.add(WaveletDeconvolution(5, kernel_length=200, padding='same', input_shape=(32, 10), data_format='channels_first'))
    # now model.output_shape == (None, 32, 10, 5)
    # add a conv2d on top (Keras 2 takes the kernel size as a tuple)
    model.add(Convolution2D(64, (3, 3), padding='same', data_format='channels_first'))
    # now model.output_shape == (None, 64, 10, 5)

waveletdeconv's Issues

Support for TF2 and tf.keras?

Hi,

I came across this package and found it pretty interesting. I was wondering if you were planning support for TF2, specifically with tf.keras. I was able to get this to work with regular Keras but was unable to with tf.keras.

Code license?

Can you add a license to this repo?

https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/adding-a-license-to-a-repository

My preference would be the MIT license.

Test result on artificial data in Section 5.1 in original paper

Hi,

This is wonderful work! I was trying to reproduce the experiment on artificial data from Section 5.1 of your paper, but I couldn't achieve the result shown in Figure 3, especially the last plot. My naive guess is that the gradients on the learnable filter widths in the first layer are vanishing. Do you have any suggestions for training on this test data?
Based on the architecture description: "We train two networks on examples from each class and compare their performance. The baseline network is a 4 layer CNN with Max-pooling [21] ending with a single unit for classification. The other network replaces the first layer with a WD layer while maintaining the same number of parameters. Both networks are optimized with Adam [20] using a fixed learning rate of 0.001 and a batch size of 4.", I implemented this network:

# -*- coding: utf-8 -*-

import scipy
import scipy.signal
import numpy as np
from matplotlib import pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers, activations, initializers, constraints, regularizers
from tensorflow.keras.models import Sequential, model_from_json, load_model
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Convolution2D, MaxPool2D
from tensorflow.keras.initializers import Constant, RandomUniform, VarianceScaling

# generate dummy data
N = 100
numSamps = 1000
data = np.random.random((N, 1, numSamps)).astype('float32')
labels = np.random.random((N, 1)).astype('float32')

val_data = np.random.random((N, 1, numSamps)).astype('float32')
val_labels = np.random.random((N, 1)).astype('float32')

X = np.linspace(-100, 100+1, numSamps)


for i in range(data.shape[0]):
    pure0 = np.sin(0.5*X)
    pure1 = np.sin(1*X)
    pure2 = np.sin(5*X)
    noise = np.random.normal(0, 1, numSamps)
    sig = np.zeros(X.shape)
    # pick 2 divider points
    a = np.random.randint(N//5, numSamps//2+1)
    b = np.random.randint(a+N//5, 2*numSamps//3+1)
    if i <= data.shape[0]/2:        
        sig[:a] = pure0[:a]
        sig[a:b] = pure1[a:b]
        sig[b:] = pure2[b:]
        label = 0
    else:
        sig[:a] = pure2[:a]
        sig[a:b] = pure1[a:b]
        sig[b:] = pure0[b:]      
        label = 1
    sig = sig + noise
    data[i,:,:] = sig
    labels[i] = label
# generate validation data
for i in range(val_data.shape[0]):
    pure0 = np.sin(0.5*X)
    pure1 = np.sin(1*X)
    pure2 = np.sin(5*X)
    noise = np.random.normal(0, 1, numSamps)
    sig = np.zeros(X.shape)
    # pick 2 divider points
    a = np.random.randint(0, numSamps//2)
    b = np.random.randint(a, numSamps+1)
    if i <= val_data.shape[0]/2:        
        sig[:a] = pure0[:a]
        sig[a:b] = pure1[a:b]
        sig[b:] = pure2[b:]
        label = 0
    else:
        sig[:a] = pure2[:a]
        sig[a:b] = pure1[a:b]
        sig[b:] = pure0[b:]      
        label = 1
    sig = sig + noise
    val_data[i,:,:] = sig
    val_labels[i] = label

print('data_scales = {:.2f}, {:.2f}, {:.2f}'.format(2.*np.pi/0.5, 2.*np.pi/1., 2.*np.pi/5.))

class Pos(constraints.Constraint):
    '''Constrain the weights to be non-negative by zeroing negative entries
    '''
    def __call__(self, p):
        p = p * tf.cast(p > 0., tf.float32)
        return p

class WaveletDeconvolution(layers.Layer):
    '''
    Deconvolutions of 1D signals using wavelets
    When using this layer as the first layer in a model,
    provide the keyword argument `input_shape`  as a
    (tuple of integers, e.g. (10, 128) for sequences
    of 10 vectors with dimension 128).
    
    # Example
    ```python
        # apply a set of 5 wavelet deconv widths to a sequence of 32 vectors with 10 timesteps
        model = Sequential()
        model.add(WaveletDeconvolution(5, padding='same', input_shape=(32, 10), data_format='channels_first'))
        # now model.output_shape == (None, 32, 10, 5)
        # add a new conv2d on top
        model.add(Convolution2D(64, (3, 3), padding='same', data_format='channels_first'))
        # now model.output_shape == (None, 64, 10, 5)
    ```
    # Arguments
        nb_widths: Number of wavelet kernels to use
            (dimensionality of the output).
        kernel_length: The length of the wavelet kernels            
        init: Locked to didactic set of widths ([1, 2, 4, 8, 16, ...]) 
            name of initialization function for the weights of the layer
            (see [initializers](../initializers.md)),
            or alternatively, a function to use for weights initialization.
            This parameter is only relevant if you don't pass a `weights` argument.
        activation: name of activation function to use
            ( or alternatively, an elementwise function.)
            If you don't specify anything, no activation is applied
            (ie. "linear" activation: a(x) = x).
        weights: list of numpy arrays to set as initial weights.
        padding: one of `"valid"` or `"same"` (case-insensitive).
        strides: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution
            along the height and width.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        data_format: A string,
            one of `"channels_last"` or `"channels_first"`.
            The ordering of the dimensions in the inputs.
            `"channels_last"` corresponds to inputs with shape
            `(batch, height, width, channels)` while `"channels_first"`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        use_bias: Boolean, whether the layer uses a bias vector.
        kernel_regularizer: Regularizer function applied to
            the `kernel` weights matrix
        bias_regularizer: Regularizer function applied to the bias vector
        activity_regularizer: Regularizer function applied to
            the output of the layer (its "activation").
        kernel_constraint: Constraint function applied to the kernel matrix
        bias_constraint: Constraint function applied to the bias vector
    
    # Input shape
        if data_format is 'channels_first' then
            3D tensor with shape: `(batch_samples, input_dim, steps)`.
        if data_format is 'channels_last' then
            3D tensor with shape: `(batch_samples, steps, input_dim)`.
        
    # Output shape
        if data_format is 'channels_first' then
            4D tensor with shape: `(batch_samples, input_dim, new_steps, nb_widths)`.
            `steps` value might have changed due to padding.
        if data_format is 'channels_last' then
            4D tensor with shape: `(batch_samples, new_steps, nb_widths, input_dim)`.
            `steps` value might have changed due to padding.
    '''
    
    def __init__(self, nb_widths, kernel_length=100,
                 init='uniform', activation='linear', weights=None,
                 padding='same', strides=1, data_format='channels_last', use_bias=True,
                 kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None,
                 kernel_constraint=None, bias_constraint=None,
                 input_shape=None, **kwargs):

        if padding.lower() not in {'valid', 'same'}:
            raise Exception('Invalid border mode for WaveletDeconvolution:', padding)
        if data_format.lower() not in {'channels_first', 'channels_last'}:
            raise Exception('Invalid data format for WaveletDeconvolution:', data_format)
        self.nb_widths = nb_widths
        self.kernel_length = kernel_length
        self.init = self.didactic #initializers.get(init, data_format='channels_first')
        self.activation = activations.get(activation)
        self.padding = padding
        self.strides = strides

        self.subsample = (strides, 1)

        self.data_format = data_format.lower()

        self.kernel_regularizer = regularizers.get(kernel_regularizer)
        self.bias_regularizer = regularizers.get(bias_regularizer)
        self.activity_regularizer = regularizers.get(activity_regularizer)

        self.kernel_constraint = Pos()
        self.bias_constraint = constraints.get(bias_constraint)

        self.use_bias = use_bias
        self.initial_weights = weights
        super(WaveletDeconvolution, self).__init__(**kwargs)

    def build(self, input_shape):
        # get dimension and length of input
        if self.data_format == 'channels_first':
            self.input_dim = input_shape[1]
            self.input_length = input_shape[2]
        else:
            self.input_dim = input_shape[2]
            self.input_length = input_shape[1]
        # initialize and define wavelet widths
        self.W_shape = (self.nb_widths,)
        # self.W = self.init(self.W_shape, name='{}_W'.format(self.name))
        # self.trainable_weights = [self.W]?
        # Constant(2.**np.arange(self.nb_widths)
        # Constant([1., 5., 12.]
        self.W = self.add_weight(shape = self.W_shape, 
                                 name = 'W',
                                 initializer = Constant([1., 4., 10.]),
                                 constraint = Pos())
        
        super(WaveletDeconvolution, self).build(input_shape)
        
    def call(self, x, mask=None):
        # shape of x is (batches, input_dim, input_len) if 'channels_first'
        # shape of x is (batches, input_len, input_dim) if 'channels_last'
        # we reshape x to channels first for computation
        if self.data_format == 'channels_last':
            x = tf.transpose(x, (0, 2, 1))

        #x = K.expand_dims(x, 2)  # add a dummy dimension for # rows in "image", now shape = (batches, input_dim, input_len, 1)
        
        # build the kernels to convolve each input signal with
        kernel_length = self.kernel_length
        T = (np.arange(0,kernel_length) - (kernel_length-1.0)/2).astype('float32')
        T2 = T**2
        # helper function to generate wavelet kernel for a given width
        # this generates the Mexican hat or Ricker wavelet. Can be replaced with other wavelet functions.
        def gen_kernel(w):
            w2 = w**2
            B = (3 * w)**0.5
            A = (2 / (B * (np.pi**0.25)))
            mod = (1 - (T2)/(w2))
            gauss = tf.exp(-(T2) / (2 * (w2)))
            kern = A * mod * gauss
            kern = tf.reshape(kern, [kernel_length, 1])
            return kern
        wav_kernels = []
        for i in range(self.nb_widths):
            kernel = gen_kernel(self.W[i])
            wav_kernels.append(kernel)
        wav_kernels = tf.stack(wav_kernels, axis=0)
        # kernel, _ = tf.map_fn(fn=gen_kernel, elems=self.W)
        wav_kernels = tf.expand_dims(wav_kernels, 0)
        wav_kernels = tf.transpose(wav_kernels,(0, 2, 3, 1))               

        # reshape input so number of dimensions is first (before batch dim)
        x = tf.transpose(x, (1, 0, 2))
        def gen_conv(x_slice):
            x_slice = tf.expand_dims(x_slice,1) # shape (num_batches, 1, input_length)
            x_slice = tf.expand_dims(x_slice,2) # shape (num_batches, 1, 1, input_length)
            # tf.nn.conv2d expects the padding string in upper case ('SAME' or 'VALID')
            return tf.nn.conv2d(x_slice, wav_kernels, strides=self.subsample, padding=self.padding.upper(), data_format='NCHW')
        outputs = []
        for i in range(self.input_dim):
            output = gen_conv(x[i,:,:])
            outputs.append(output)
        outputs = tf.stack(outputs, axis=0)
        # output, _ = tf.map_fn(fn=gen_conv, elems=x)
        outputs = tf.squeeze(outputs, 3)
        outputs = tf.transpose(outputs, (1, 0, 3, 2))
        if self.data_format == 'channels_last':
            outputs = tf.transpose(outputs,(0, 2, 3, 1))
        return outputs
                
    # def compute_output_shape(self, input_shape):
    #     out_length = conv_utils.conv_output_length(input_shape[2], 
    #                                                self.kernel_length, 
    #                                                self.padding, 
    #                                                self.strides)        
    #     return (input_shape[0], self.input_dim, out_length, self.nb_widths)
    
    def get_config(self):
        config = {'nb_widths': self.nb_widths,
                  'kernel_length': self.kernel_length,
                  'init': self.init.__name__,
                  'activation': self.activation.__name__,
                  'padding': self.padding,
                  'strides': self.strides,
                  'data_format': self.data_format,
                  'kernel_regularizer': self.kernel_regularizer.get_config() if self.kernel_regularizer else None,
                  'bias_regularizer': self.bias_regularizer.get_config() if self.bias_regularizer else None,
                  'activity_regularizer': self.activity_regularizer.get_config() if self.activity_regularizer else None,
                  'kernel_constraint': self.kernel_constraint.get_config() if self.kernel_constraint else None,
                  'bias_constraint': self.bias_constraint.get_config() if self.bias_constraint else None,
                  'use_bias': self.use_bias}
        base_config = super(WaveletDeconvolution, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))   
    
    def didactic(self, shape, name=None):
        x = 2**np.arange(shape).astype('float32')
        return tf.Variable(initial_value=x, name=name)

inp_shape = data.shape[1:]
model = Sequential()
model.add(WaveletDeconvolution(3, kernel_length=500, input_shape=inp_shape, padding='SAME', data_format='channels_first'))
model.add(Activation('tanh')) # (batch, 1, len=1000, 3)
model.add(MaxPool2D((1,2)))

model.add(Convolution2D(3, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPool2D((1,2)))

model.add(Convolution2D(3, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPool2D((1,2)))

model.add(Convolution2D(3, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPool2D((1,2)))
#end convolutional layers
model.add(Flatten())
model.add(Dense(25, kernel_initializer=VarianceScaling(mode='fan_avg', distribution='uniform')))
model.add(Activation('relu'))

model.add(Dense(1, kernel_initializer=VarianceScaling(mode='fan_avg', distribution='uniform')))
model.add(Activation('sigmoid'))

optimizer_0 = tf.keras.optimizers.Adam(learning_rate=10.**-3)
model.compile(optimizer=optimizer_0, loss='binary_crossentropy')

num_epochs = 25
plt.figure(figsize=(6,6))
Widths = np.zeros((num_epochs, 3)).astype('float32')
for i in range(num_epochs):
    hWD = model.fit(data, labels, epochs=1, batch_size=4, validation_data=(val_data, val_labels), verbose=0)

    print('Epoch %3d | train_loss: %.4f | val_loss: %.4f' % (i+1, hWD.history['loss'][0], hWD.history['val_loss'][0]))

    Widths[i,:] = model.layers[0].weights[0].numpy()
    plt.plot(i, hWD.history['loss'][0], 'k.')
    plt.plot(i, hWD.history['val_loss'][0], 'r.')

plt.figure(figsize=(6,6))
for i in range(Widths.shape[1]):
    plt.plot(range(num_epochs), Widths[:,i]) 

plt.show()
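
The two data-generation loops in the script above are nearly identical, so they can be folded into one helper. The sketch below is one way to do that with NumPy only. The names `make_example` and `make_dataset`, the `min_gap` parameter, and the seeds are hypothetical. It also makes three simplifying assumptions: integer divider bounds, an even split between the classes, and the training-set divider ranges reused for the validation set.

```python
import numpy as np

def make_example(label, x, rng, min_gap=20):
    """Return one noisy signal built from three sine segments.

    Class 0 orders the frequencies 0.5, 1, 5; class 1 reverses them.
    """
    n = x.shape[0]
    freqs = (0.5, 1.0, 5.0) if label == 0 else (5.0, 1.0, 0.5)
    # pick 2 divider points, using integer bounds
    a = int(rng.integers(min_gap, n // 2 + 1))
    b = int(rng.integers(a + min_gap, 2 * n // 3 + 1))
    sig = np.empty(n)
    sig[:a] = np.sin(freqs[0] * x[:a])
    sig[a:b] = np.sin(freqs[1] * x[a:b])
    sig[b:] = np.sin(freqs[2] * x[b:])
    return sig + rng.normal(0.0, 1.0, n)

def make_dataset(n_examples, n_samples, seed=0):
    """Return (data, labels) with shapes (n_examples, 1, n_samples) and (n_examples, 1)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-100, 101, n_samples)
    # first half of the examples is class 0, second half is class 1
    labels = (np.arange(n_examples) >= n_examples // 2).astype("float32")
    data = np.stack([make_example(int(lab), x, rng) for lab in labels])
    return data[:, None, :].astype("float32"), labels[:, None]

data, labels = make_dataset(100, 1000, seed=0)
val_data, val_labels = make_dataset(100, 1000, seed=1)
print(data.shape, labels.shape)  # (100, 1, 1000) (100, 1)
```

Seeding the generator makes a run repeatable, which helps when comparing learned widths across training attempts.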

Learnable

The weights of the neural network layer are not learnable but fixed.
I had to add "self.add_weight(...)" to make the network learn the scales.
Is this intentional?
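
For anyone checking whether a width can receive a gradient once it is registered as a trainable weight, here is a small NumPy sanity check. It is a sketch under stated assumptions, not part of the repository: it uses the Mexican hat formula from the `gen_kernel` helper quoted in the previous issue, and the function names `ricker` and `ricker_dwidth` are hypothetical. It compares a hand-derived derivative of the kernel with respect to the width against a central finite difference.

```python
import numpy as np

def ricker(width, t):
    """Mexican hat (Ricker) kernel evaluated at sample offsets t."""
    amplitude = 2.0 / (np.sqrt(3.0 * width) * np.pi ** 0.25)
    return amplitude * (1.0 - t**2 / width**2) * np.exp(-t**2 / (2.0 * width**2))

def ricker_dwidth(width, t):
    """Analytic derivative of ricker() with respect to width (product rule)."""
    amplitude = 2.0 / (np.sqrt(3.0 * width) * np.pi ** 0.25)
    mod = 1.0 - t**2 / width**2
    gauss = np.exp(-t**2 / (2.0 * width**2))
    d_amplitude = -amplitude / (2.0 * width)  # amplitude scales as width**-0.5
    d_mod = 2.0 * t**2 / width**3
    d_gauss = gauss * t**2 / width**3
    return d_amplitude * mod * gauss + amplitude * d_mod * gauss + amplitude * mod * d_gauss

t = np.arange(101) - 50.0  # offsets for a kernel of length 101
w, eps = 4.0, 1e-5         # hypothetical width and finite-difference step
numeric = (ricker(w + eps, t) - ricker(w - eps, t)) / (2.0 * eps)
analytic = ricker_dwidth(w, t)
print(np.max(np.abs(numeric - analytic)))
```

The derivative is nonzero over most of the kernel support, so the kernel itself does not block gradients to the width. Whether the width actually updates in a given model still depends on it being tracked as a trainable variable and on the gradients arriving from later layers.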

Weights for model 'sequential'

/bin/python3.8 /workspaces/WaveletDeconv/testWD.py
2023-06-05 09:55:20.305859: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Testing saving...
Traceback (most recent call last):
  File "/workspaces/WaveletDeconv/testWD.py", line 91, in <module>
    modelWD.save_weights('testWD_model.h5')
  File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 3540, in _assert_weights_created
    raise ValueError(
ValueError: Weights for model 'sequential' have not yet been created. Weights are created when the model is first called on inputs or build() is called with an input_shape.

Some questions about "Learning filter widths of spectral decompositions using wavelets"

Hello!

Thank you very much for the "WaveletDeconv" open source project on GitHub and for the paper "Learning filter widths of spectral decompositions with wavelets", both of which have been a great help to my academic research. While debugging the neural network model built in the "WaveletDeconv" project and reading the description of the model (Figure 1) in the paper, I had some questions that I hope you can answer.

Q1: The following is my understanding of the model diagram in Figure 1. Is it correct? The first layer is the input layer (A), which receives the signal. The second layer is the deconvolution layer (B), which applies deconvolution to the output of the input layer. The third layer is the convolutional layer (C), which applies convolution to the output of the deconvolution layer. The fourth part is the output layer (D), which receives the output of the convolutional layer and produces the final result.

However, from the description in the paper I infer that the deconvolution layer acts as a preprocessing step: it takes the input signal and outputs the transformed signal, and its parameters are optimized with gradients. So I guess the first layer should be the deconvolution layer, which processes the input signal directly. The deconvolution layer is also the first layer in your project code. Is my understanding correct?

Q2: I couldn't find a description of the first layer in the paper or the code, so I don't understand what the inputs and outputs of A1 and A2 in the first layer are. Specifically:

What is the first layer mainly responsible for?
Why is there a small layer (X) between A1 and A2, and what data is stored there?
The deconvolution layer is the second layer in the diagram, so why is it the first layer of the model in the code? The paper also says that this layer receives the signal directly, so shouldn't the deconvolution layer be the first layer in the diagram rather than the second?

Thank you for reading my letter. I sincerely wish you and your family good health and success in your work. I look forward to your reply and to further communication with you, both academically and in daily life. Any advice would be appreciated. Thank you.

(Figure 1)
