
tensorly / torch


TensorLy-Torch: Deep Tensor Learning with TensorLy and PyTorch

Home Page: http://tensorly.org/torch/

License: BSD 3-Clause "New" or "Revised" License

Languages: Python 100.00%
Topics: pytorch, tensor, deep-learning, tensor-learning, deep-neural-networks, factorized-cnns, tensor-regression, tensor-contraction, tensor-networks, tensor-convolution-networks

torch's Introduction


TensorLy

TensorLy is a Python library that aims at making tensor learning simple and accessible. It makes it easy to perform tensor decomposition, tensor learning and tensor algebra. Its backend system lets you seamlessly perform computations with NumPy, PyTorch, JAX, TensorFlow or CuPy, and run methods at scale on CPU or GPU.


Installing TensorLy

The only pre-requisite is to have Python 3 installed. The easiest way is via the Anaconda distribution.

With pip (recommended):

pip install -U tensorly

With conda:

conda install -c tensorly tensorly

Development (from git)
# clone the repository
git clone https://github.com/tensorly/tensorly
cd tensorly
# Install in editable mode with `-e` or, equivalently, `--editable`
pip install -e .

Note: TensorLy depends on NumPy by default. If you want to use other backends, you will need to install these packages separately.

For detailed instructions, please see the documentation.


Quickstart

Creating tensors

Create a small third-order tensor of size 3 x 4 x 2 from a NumPy array and perform simple operations on it:

import tensorly as tl
import numpy as np


tensor = tl.tensor(np.arange(24).reshape((3, 4, 2)), dtype=tl.float64)
unfolded = tl.unfold(tensor, mode=0)
tl.fold(unfolded, mode=0, shape=tensor.shape)
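
For reference, the mode-0 unfolding of this 3 x 4 x 2 tensor is a matrix of shape (3, 8), and folding it back recovers the original tensor exactly:

print(unfolded.shape)                              # (3, 8)
refolded = tl.fold(unfolded, mode=0, shape=tensor.shape)
print(tl.norm(refolded - tensor))                  # 0.0: folding exactly inverts unfolding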

You can also create random tensors:

from tensorly import random

# A random tensor
tensor = random.random_tensor((3, 4, 2))
# A random CP tensor in factorized form
cp_tensor = random.random_cp(shape=(3, 4, 2), rank=2)

You can also create tensors in TT format, Tucker format, etc.; see random tensors in the documentation.

Setting the backend

You can change the backend to perform computation with a different framework. By default, the backend is NumPy, but you can also perform the computation using PyTorch, TensorFlow, JAX or CuPy (you need to install them first). For instance, after setting the backend to PyTorch, all the computation is done by PyTorch, and tensors can be created on GPU:

tl.set_backend('pytorch') # Or 'numpy', 'tensorflow', 'cupy' or 'jax'
tensor = tl.tensor(np.arange(24).reshape((3, 4, 2)), device='cuda:0')
type(tensor) # torch.Tensor
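
You can confirm which backend is currently active at any time:

print(tl.get_backend())  # 'pytorch'
# Calling tl.set_backend('numpy') at any point switches the global default back to NumPy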

Tensor decomposition

Applying tensor decomposition is easy:

from tensorly.decomposition import tucker
# Apply Tucker decomposition
tucker_tensor = tucker(tensor, rank=[2, 2, 2])
# Reconstruct the full tensor from the decomposed form
tl.tucker_to_tensor(tucker_tensor)
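
To sanity-check the decomposition, you can compare the reconstruction with the original tensor (continuing the snippet above; tl.norm computes the l2/Frobenius norm by default):

reconstruction = tl.tucker_to_tensor(tucker_tensor)
# Relative reconstruction error; small values mean rank [2, 2, 2] captures most of the tensor
rel_error = tl.norm(reconstruction - tensor) / tl.norm(tensor)
print(rel_error)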

We have many more decompositions available, be sure to check them out!

Next steps

This is just a very quick introduction to some of the basic features of TensorLy. For more information on getting started, check out the user guide, and for a detailed reference of the functions and their documentation, refer to the API reference.

If you see a bug, open an issue, or better yet, a pull-request!


Contributing code

All contributions are welcome! If you have a cool tensor method you want to add, or if you spot a bug or even a typo in the documentation, please report it, or even better, open a Pull-Request on GitHub.

Before you submit your changes, you should make sure your code adheres to our style-guide. The easiest way to do this is with black:

pip install black
black .

Running the tests

Testing and documentation are an essential part of this package and all functions come with unit tests and documentation.

The tests are run using the pytest package. First install pytest:

pip install pytest

Then, to run the tests, simply run in the terminal:

pytest -v tensorly

Alternatively, you can specify for which backend you wish to run the tests:

TENSORLY_BACKEND='numpy' pytest -v tensorly

Citing

If you use TensorLy in an academic paper, please cite [1]:

@article{tensorly,
  author  = {Jean Kossaifi and Yannis Panagakis and Anima Anandkumar and Maja Pantic},
  title   = {TensorLy: Tensor Learning in Python},
  journal = {Journal of Machine Learning Research},
  year    = {2019},
  volume  = {20},
  number  = {26},
  pages   = {1-6},
  url     = {http://jmlr.org/papers/v20/18-277.html}
}
[1] Jean Kossaifi, Yannis Panagakis, Anima Anandkumar and Maja Pantic. TensorLy: Tensor Learning in Python. Journal of Machine Learning Research (JMLR), 2019, volume 20, number 26.


torch's Issues

Warnings and error using TuckerTRL with multiple GPUs

Hi,

when using TuckerTRL I get this warning when running on either one or multiple GPUs:

Using one GPU:

 /root/env/lib/python3.7/site-packages/torch/nn/modules/container.py:435: UserWarning: Setting attributes on ParameterList is not supported.
  warnings.warn("Setting attributes on ParameterList is not supported.")

Using multiple GPUs

/root/env/lib/python3.7/site-packages/torch/nn/modules/container.py:490: UserWarning: nn.ParameterList is being used with DataParallel but this is not supported. This list will appear empty for the models replicated on each GPU except the original one.
  warnings.warn("nn.ParameterList is being used with DataParallel but this is not "

and then, when training (again with multiple GPUs), I get this error:

/root/env/lib/python3.7/site-packages/torch/nn/modules/container.py:490: UserWarning: nn.ParameterList is being used with DataParallel but this is not supported. This list will appear empty for the models replicated on each GPU except the original one.
  warnings.warn("nn.ParameterList is being used with DataParallel but this is not "
Traceback (most recent call last):
  File "main.py", line 458, in <module>
    main(args)
  File "main.py", line 188, in main
    train(config_file)
  File "main", line 248, in train
    output = model.forward(data_batch)
  File "/root/env/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/root/env/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/root/env/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/root/env/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
AttributeError: Caught AttributeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/root/env/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/root/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "model.py", line 75, in forward
    x2 = self.trl(x1)
  File "/root/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/env/lib/python3.7/site-packages/tltorch/_trl.py", line 162, in forward
    regression_weights = tl.tucker_to_tensor((core, factors))
  File "/root/env/lib/python3.7/site-packages/tensorly/tucker_tensor.py", line 63, in tucker_to_tensor
    return multi_mode_dot(core, factors, skip=skip_factor, transpose=transpose_factors)
  File "/root/env/lib/python3.7/site-packages/tensorly/tenalg/__init__.py", line 79, in dynamically_dispatched_fun
    current_backend = _BACKENDS[_LOCAL_STATE.tenalg_backend]
AttributeError: '_thread._local' object has no attribute 'tenalg_backend'

Tensorized Embedding Layers

Recent work has shown promising results on tensorized embedding layers for memory reduction.

I have a tensorly-based implementation of an embedding layer (lookup table) that supports the CP/Tucker/TTM/TT format. If that's of interest, I would be happy to submit a PR to this repo. This implementation constructs the appropriate rows without forming the full embedding table.
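
For context, later versions of tltorch ship a FactorizedEmbedding layer; a minimal usage sketch (mirroring the constructor call that appears in a later issue), which behaves like torch.nn.Embedding from the caller's side:

import torch
from tltorch.factorized_layers import FactorizedEmbedding

# Factorized lookup table with 32 entries of dimension 16
embedding = FactorizedEmbedding(32, 16)
tokens = torch.randint(0, 32, (2, 3))
vectors = embedding(tokens)   # output shaped like torch.nn.Embedding's: (2, 3, 16)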

Tensorized Matrices vs Factorized Tensors

Hi,

Thanks for creating and maintaining this library. I had a couple of basic questions, would be great if you could answer:

  1. What is the difference between the files in tltorch/factorized_tensors/factorized_tensors.py and tltorch/factorized_tensors/tensorized_matrices.py? A lot of the code is replicated across these files.

  2. Is BlockTT the same as Tensor-train?

  3. I know it's trivial to implement, but does the library similarly have a module for low-rank matrix factorization? (See the sketch after this list.)

Thanks,
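
Regarding question 3: a plain low-rank linear layer is indeed easy to write directly in PyTorch; a minimal sketch, not tied to any tltorch API:

import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """y = x @ (U V)^T + b, with U of shape (out, r) and V of shape (r, in)."""
    def __init__(self, in_features, out_features, rank, bias=True):
        super().__init__()
        self.U = nn.Parameter(torch.randn(out_features, rank) * 0.02)
        self.V = nn.Parameter(torch.randn(rank, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features)) if bias else None

    def forward(self, x):
        # Contract with V first, then U: two small matmuls instead of one dense one
        out = (x @ self.V.t()) @ self.U.t()
        return out if self.bias is None else out + self.bias

layer = LowRankLinear(512, 512, rank=16)
print(layer(torch.randn(4, 512)).shape)   # torch.Size([4, 512])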

Factorized Tensor slower than Neural Network Layer !!!

import tltorch
import torch
from torch.profiler import profile, record_function, ProfilerActivity

data = torch.randn((4, 16), dtype=torch.float32)
linear = torch.nn.Linear(16, 10)

fact_linear = tltorch.FactorizedLinear.from_linear(linear, auto_tensorize=False,
                    in_tensorized_features=(4, 4), out_tensorized_features=(2, 5), rank=0.1, factorization="tucker")

data = data.to("cuda")
linear = linear.to("cuda")
fact_linear = fact_linear.to("cuda")
with profile(activities=[
        ProfilerActivity.CPU, ProfilerActivity.CUDA], record_shapes=True) as prof:
    with record_function("model_inference"):
        linear(data)

print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))

-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
                                                   Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg     Self CUDA   Self CUDA %    CUDA total  CUDA time avg    # of Calls  
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
                                        model_inference        16.99%       1.054ms        99.71%       6.186ms       6.186ms       0.000us         0.00%       4.000us       4.000us             1  
                                           aten::linear         0.29%      18.000us        82.72%       5.132ms       5.132ms       0.000us         0.00%       4.000us       4.000us             1  
                                            aten::addmm        60.54%       3.756ms        81.43%       5.052ms       5.052ms       4.000us       100.00%       4.000us       4.000us             1  
void gemmSN_TN_kernel<float, 128, 16, 2, 4, 4, 4, tr...         0.00%       0.000us         0.00%       0.000us       0.000us       4.000us       100.00%       4.000us       4.000us             1  
                                                aten::t         0.63%      39.000us         1.00%      62.000us      62.000us       0.000us         0.00%       0.000us       0.000us             1  
                                        aten::transpose         0.24%      15.000us         0.37%      23.000us      23.000us       0.000us         0.00%       0.000us       0.000us             1  
                                       aten::as_strided         0.13%       8.000us         0.13%       8.000us       8.000us       0.000us         0.00%       0.000us       0.000us             1  
                                       cudaLaunchKernel        20.89%       1.296ms        20.89%       1.296ms       1.296ms       0.000us         0.00%       0.000us       0.000us             1  
                                  cudaDeviceSynchronize         0.29%      18.000us         0.29%      18.000us      18.000us       0.000us         0.00%       0.000us       0.000us             1  
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
Self CPU time total: 6.204ms
Self CUDA time total: 4.000us

with profile(activities=[
        ProfilerActivity.CPU, ProfilerActivity.CUDA], record_shapes=True) as prof:
    with record_function("model_inference"):
        fact_linear(data)

print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))

-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
                                                   Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg     Self CUDA   Self CUDA %    CUDA total  CUDA time avg    # of Calls  
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
                                        model_inference        15.19%       1.098ms        99.79%       7.215ms       7.215ms       0.000us         0.00%      27.000us      27.000us             1  
                                           aten::matmul         0.40%      29.000us        62.63%       4.528ms       1.132ms       0.000us         0.00%      12.000us       3.000us             4  
                                               aten::mm        48.37%       3.497ms        62.23%       4.499ms       1.125ms      12.000us        44.44%      12.000us       3.000us             4  
                                          aten::reshape         0.91%      66.000us         6.10%     441.000us      44.100us       0.000us         0.00%      10.000us       1.000us            10  
                                            aten::clone         0.55%      40.000us         3.91%     283.000us      94.333us       0.000us         0.00%      10.000us       3.333us             3  
                                            aten::copy_         1.40%     101.000us         2.28%     165.000us      55.000us      10.000us        37.04%      10.000us       3.333us             3  
void at::native::elementwise_kernel<128, 2, at::nati...         0.00%       0.000us         0.00%       0.000us       0.000us      10.000us        37.04%      10.000us       3.333us             3  
void gemmk1_kernel<int, float, 256, 5, false, false,...         0.00%       0.000us         0.00%       0.000us       0.000us       9.000us        33.33%       9.000us       3.000us             3  
                                           aten::linear         0.36%      26.000us         2.23%     161.000us     161.000us       0.000us         0.00%       5.000us       5.000us             1  
                                            aten::addmm         1.00%      72.000us         1.27%      92.000us      92.000us       5.000us        18.52%       5.000us       5.000us             1  
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
Self CPU time total: 7.230ms
Self CUDA time total: 27.000us
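
A quick parameter-count comparison (continuing the snippet above) helps put the numbers in perspective: the dense layer here is tiny, so the factorized forward pass trades a single fused addmm for several small matmuls and reshapes, and per-kernel overhead dominates at this size:

n_dense = sum(p.numel() for p in linear.parameters())       # 16*10 + 10 = 170
n_fact = sum(p.numel() for p in fact_linear.parameters())
print(n_dense, n_fact)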

`tltorch.FactorizedConv.from_conv` tries to allocate ~89 TB of memory for an input of shape (1024, 512, 3, 3, 3)

Minimal Code to reproduce the error:

import torch
import tltorch
test_conv3d = torch.nn.Conv3d(1024, 512, (3,3,3), padding=(3,1,1))
print(tltorch.FactorizedConv.from_conv(test_conv3d, rank='same', factorization='cp'))

Error:

RuntimeError: [enforce fail at alloc_cpu.cpp:83] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 89060441849856 bytes. Error code 12 (Cannot allocate memory)

The error is actually coming from this line in the truncated_svd function of TensorLy. The shape of the matrix passed to the svd function is torch.Size([3, 4718592]). Note that this error is not thrown when torch.svd is used directly. The matrix itself is only ~54 MB, so it's strange that tl.svd tries to allocate ~89 TB for it.
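
The reported byte count matches a dense 4718592 x 4718592 float32 matrix, i.e. the full set of right singular vectors of a 3 x 4718592 matrix, which a non-truncated SVD would materialize:

n = 4718592
print(n * n * 4)            # 89060441849856 bytes, the exact figure in the error (~89 TB)
print(3 * n * 4 / 2**20)    # ~54 MB, the size of the input matrix itself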

It's also possible that I'm making a very stupid mistake; in any case, looking forward to some solution here 🙏 @JeanKossaifi

Support for ONNX export

Thank you for the library.
Have you considered adding support for exporting factorized layers (FactorizedConv) to ONNX?

Documentation update?

Hi! I was trying to use tltorch.factorized_layers.FactorizedLinear, but the parameters listed on the documentation page http://tensorly.org/torch/dev/modules/generated/tltorch.factorized_layers.FactorizedLinear.html#tltorch.factorized_layers.FactorizedLinear differ from the parameters the model actually takes now, and I'm not sure how the old parameters map to the new ones. In particular, I wonder how we can specify the shape of the tensor with the current model parameters in_tensorized_features and out_tensorized_features. It would be great if you could update the documentation to match the current definition of the model.
Thank you very much!
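
For reference while the docs are updated, the tensorized shape is now given through in_tensorized_features and out_tensorized_features (each a tuple whose product is the in/out feature count); a sketch based on the constructor calls that appear elsewhere in these issues:

import torch
import tltorch

# 16 = 4*4 input features tensorized as (4, 4); 10 = 2*5 output features as (2, 5)
layer = tltorch.FactorizedLinear(in_tensorized_features=(4, 4),
                                 out_tensorized_features=(2, 5),
                                 factorization='tucker', rank=0.1)
print(layer(torch.randn(3, 16)).shape)   # torch.Size([3, 10])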

Contiguous Tucker core and factors

This line of the Tucker Tensor reads

return cls(nn.Parameter(core.contiguous), [nn.Parameter(f) for f in factors])

I believe core.contiguous should be core.contiguous() and this is just a small typo.

Also, the factors are not necessarily contiguous, which raised an error for my use case even after correcting the typo above. I think nn.Parameter(f.contiguous()) is a natural replacement, and I don't see any downsides. In sum, replace the line above with

return cls(nn.Parameter(core.contiguous()), [nn.Parameter(f.contiguous()) for f in factors])

I can submit this as a PR, but haven't done a public/tested PR before, so might need help contributing after my initial PR.

FactorizedEmbedding doesn't work with non-contiguous input

I'm using version 0.4.0 of tensorly-torch and am running into an issue with non-contiguous inputs to a FactorizedEmbedding layer raising a runtime error. Here's a minimum working example:

import torch
from tltorch.factorized_layers import FactorizedEmbedding

embedding = FactorizedEmbedding(32, 16)
data = torch.randint(0, 32, (2, 3))
embedding(data.T)

Running this yields:

Traceback (most recent call last):
  File "tltorch_bug_mwe.py", line 6, in <module>
    embedding(data.T)
  File "/Users/jemis/opt/miniconda3/envs/tensorized3.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/jemis/opt/miniconda3/envs/tensorized3.8/lib/python3.8/site-packages/tltorch/factorized_layers/factorized_embedding.py", line 100, in forward
    flatenned_input = input.view(-1)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Replacing input.view(-1) with input.reshape(-1) does indeed resolve this.
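
Until such a fix lands, a user-side workaround is to make the indices contiguous before the lookup (continuing the example above):

embedding(data.T.contiguous())   # a contiguous copy of the transposed indices can be .view()-ed safely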

FactorizedConv

When I try to use FactorizedConv, it reports this warning:

UserWarning: Creating a subclass of FactorizedTensor TensorizedTensor with no name.
warnings.warn(f'Creating a subclass of FactorizedTensor {cls.__name__} with no name.')

Reconstructed in from_conv is not showing reconstructed layers but factorized layers

Hi,

I am trying to use from_conv with 'reconstructed' as the implementation choice, but the layer shows up as a factorized layer, not as a reconstructed layer.

layer = nn.Conv2d(128, 64, 3)
factorized_layer = tltorch.FactorizedConv.from_conv(layer, implementation='reconstructed')

FactorizedConv(
in_channels=128, out_channels=64, kernel_size=(3, 3), rank=372, order=2,
(weight): CPTensor(shape=(64, 128, 3, 3), rank=372)
)

Also, I can access its factors:

factorized_layer.weight.factors

I don't want that; I just want to access the reconstructed layer, so that I can access layer.weight.data.

Thanks

Add more initialization methods

Hello. Currently, the default initialization for the linear layers is a random tensor with std=0.02.
I suggest adding more initialization methods such as Xavier (Glorot) and Kaiming (He).
http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf
https://arxiv.org/pdf/1502.01852.pdf
I have also found that initializing the cores so that the reconstructed matrix has an expected Frobenius norm of 1 is sometimes helpful. Not sure whether it's worth adding, though.
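
For concreteness, these are the standard initializers being proposed; how they would be applied to factorized cores is a design question for the library, so the sketch below only shows the vanilla PyTorch calls on a dense weight:

import torch
import torch.nn as nn

w = torch.empty(84, 120)
nn.init.xavier_uniform_(w)                          # Glorot: variance scaled by fan_in + fan_out
nn.init.kaiming_normal_(w, nonlinearity='relu')     # He: variance scaled by fan_in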

`torch.jit.script` does not work with Tensorized Models

Minimal Code:

import torch
from torch.nn import Module
from tltorch import FactorizedConv

class Test(Module):
    def __init__(self):
        super(Test, self).__init__()
        self.layer = FactorizedConv(3, 4, 3, factorization='tucker', order=3)

def main():
    # Instantiate the model
    model = Test()
    scripted_module = torch.jit.script(model)

if __name__ == "__main__":
    main()

Error:

Traceback (most recent call last):
  File "/workspaces/RepNet-Rex-Solutions/test.py", line 27, in <module>
    main()
  File "/workspaces/RepNet-Rex-Solutions/test.py", line 24, in main
    save_model(model)
  File "/workspaces/RepNet-Rex-Solutions/test.py", line 8, in save_model
    scripted_module = torch.jit.script(model)
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_script.py", line 1324, in script
    return torch.jit._recursive.create_script_module(
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_recursive.py", line 559, in create_script_module
    return create_script_module_impl(nn_module, concrete_type, stubs_fn)
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_recursive.py", line 632, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_script.py", line 639, in _construct
    init_fn(script_module)
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_recursive.py", line 608, in init_fn
    scripted = create_script_module_impl(
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_recursive.py", line 632, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_script.py", line 639, in _construct
    init_fn(script_module)
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_recursive.py", line 608, in init_fn
    scripted = create_script_module_impl(
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_recursive.py", line 572, in create_script_module_impl
    method_stubs = stubs_fn(nn_module)
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_recursive.py", line 899, in infer_methods_to_compile
    stubs.append(make_stub_from_method(nn_module, method))
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_recursive.py", line 87, in make_stub_from_method
    return make_stub(func, method_name)
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/_recursive.py", line 71, in make_stub
    ast = get_jit_def(func, name, self_name="RecursiveScriptModule")
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/frontend.py", line 372, in get_jit_def
    return build_def(
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/frontend.py", line 422, in build_def
    param_list = build_param_list(ctx, py_def.args, self_name, pdt_arg_types)
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/torch/jit/frontend.py", line 448, in build_param_list
    raise NotSupportedError(ctx_range, _vararg_kwarg_err)
torch.jit.frontend.NotSupportedError: Compiled functions can't take variable number of arguments or use keyword-only arguments with defaults:
  File "/usr/local/python/3.10.8/lib/python3.10/site-packages/tltorch/factorized_tensors/core.py", line 259
    def forward(self, indices=None, **kwargs):
                                     ~~~~~~~ <--- HERE
        """To use a tensor factorization within a network, use ``tensor.forward``, or, equivalently, ``tensor()`

The main issue here is torch.jit.script doesn't support variable number of arguments and keyword-only arguments with defaults which are present in the forward function of the factorized/tensorized layers.
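
As a possible stopgap, torch.jit.trace records the operations executed on an example input instead of compiling the Python signature, so it may work where scripting fails (untested against tltorch layers, so treat this as a sketch):

import torch
from tltorch import FactorizedConv

layer = FactorizedConv(3, 4, 3, factorization='tucker', order=3)
example = torch.randn(1, 3, 8, 8, 8)        # one 3-channel 8x8x8 volume for the order-3 convolution
traced = torch.jit.trace(layer, example)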

Operating on the Decomposed Form?

First of all, thank you for creating this wonderful library and making it public for us to experiment with.
This is likely more of an issue with user error on my end than with the code itself.

I'm trying to get a small convolutional network to work with the CIFAR-100 dataset, with the two fully connected layers at the end factorized using blocktt. While the network trains fine and achieves nearly the same accuracy as its non-tensorized counterpart, training is a bit slow, and I am getting the following warning:

UserWarning: BlockTT, shape=[512, 512], tensorized_shape=((8, 8, 8), (32, 4, 4)), rank=[1, 4, 4, 1]) is being reconstructed into a matrix, consider operating on the decomposed form. warnings.warn(f'{self} is being reconstructed into a matrix, consider operating on the decomposed form.')

I don't quite understand this warning message. Am I supposed to decompose the feature tensor first (e.g. into TT or Tucker format) before passing it through the factorized linear layers? I couldn't quite work out from the documentation how to do this inside a torch.nn.Module block.

I have attached my source code here.

class Block(torch.nn.Module):
    def __init__(self):
        super().__init__()
        
        self.conv1 = torch.nn.Conv2d(3, 8, 3, 2, 1)
        self.conv2 = torch.nn.Conv2d(8, 16, 3, 2, 1)
        self.conv3 = torch.nn.Conv2d(16, 32, 3, 2, 1)
        self.fc1 = tltorch.FactorizedLinear((32, 4, 4), (8, 8, 8), factorization='blocktt', rank=(1, 4, 4, 1))
        self.fc2 = tltorch.FactorizedLinear((8, 8, 8), (8, 8, 8), factorization='blocktt', rank=(1, 4, 4, 1))
        self.fc3  = torch.nn.Linear(512, 100)

    def forward(self, inputs):
        outputs = F.relu(self.conv3(F.relu(self.conv2(F.relu(self.conv1(inputs))))))
        outputs = outputs.flatten(-3, -1)
        outputs = F.relu(self.fc1(outputs))
        outputs = F.relu(self.fc2(outputs))
        outputs = self.fc3(outputs)
        return outputs


Tensor Regression Layer

I am trying to use tensor regression layers instead of the fully connected layers in my model. The paper claims I can just replace the fully connected layer, which removes the need to flatten the tensor and keeps the spatial information, increasing the model accuracy.

I did this with a small network:

import torch
import torch.nn as nn
import torch.nn.functional as F
import tltorch

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=(5, 5), stride=(1, 1), padding=2)
        self.pool1 = nn.MaxPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=(5, 5), stride=(1, 1), padding=0)
        self.pool2 = nn.MaxPool2d(kernel_size=2)
        self.linear1 = nn.Linear(16*5*5, 120)
        self.linear2 = tltorch.TRL(120, 84, factorization='tt', rank=100)
        self.classifier = nn.Linear(84, 10)

    def forward(self, x):
        out = self.conv1(x)
        out = self.pool1(out)
        out = self.conv2(out)
        out = self.pool2(out)
        #out = torch.flatten(out,1)
        out = self.linear1(out)
        out = self.linear2(out)
        out = self.classifier(out)
        return out

When training the model, the loss is always NaN. An example showing how to use a tensor regression layer or a tensor contraction layer in state-of-the-art models would be very helpful.

BlockTT does not support CUDA Tensor as indices

When I use FactorizedEmbedding, I found that it does not support a CUDA tensor as input indices, because BlockTT.__getitem__() uses np.unravel_index(index), which cannot take a CUDA tensor as input.
Since PyTorch does not have unravel_index() yet, either moving CUDA indices to the CPU or writing a Torch version of unravel_index() would fix this bug.
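
A device-agnostic unravel_index takes only a few lines of pure PyTorch (a sketch, not the library's code; recent PyTorch releases also provide torch.unravel_index):

import torch

def unravel_index(indices, shape):
    # Convert flat indices into per-dimension coordinates without leaving the indices' device
    coords = []
    for dim in reversed(shape):
        coords.append(indices % dim)
        indices = torch.div(indices, dim, rounding_mode='floor')
    return tuple(reversed(coords))

idx = torch.tensor([0, 5, 11])      # works the same for CUDA tensors
print(unravel_index(idx, (3, 4)))   # (tensor([0, 1, 2]), tensor([0, 1, 3]))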
