classicvalues / nestedtensor

This project forked from pytorch/nestedtensor

[Prototype] Tools for the concurrent manipulation of variably sized Tensors.

License: BSD 3-Clause "New" or "Revised" License

Python 11.41% Shell 0.17% C++ 9.25% Jupyter Notebook 76.44% Cuda 2.74%

The nestedtensor package prototype

If you are here because you ran into a runtime error due to a missing feature or some kind of bug, please open an issue and fill in the appropriate template. If you have general feedback about this prototype, you can use our suggested template or just open a free-form issue. Thank you for contributing to this project!

Tutorials

If you are new to this project, we recommend you take a look at our whirlwind introduction to get started.

Autograd support

Due to missing extensibility features in PyTorch, nestedtensor currently lacks autograd support. We're actively working on this and recognize that it severely limits the applicability of the project. Please run nestedtensor operations within the inference mode context (torch.inference_mode) to prevent any adverse interactions with the autograd system.

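For example:

import torch
import nestedtensor

# Three "sentences" of different lengths, each with 5 features per entry.
sentences = [torch.randn(10, 5), torch.randn(5, 5), torch.randn(9, 5)]
with torch.inference_mode():
    nt = nestedtensor.nested_tensor(sentences)
    nt.sum(1)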

Binaries

Because PyTorch develops quickly, the nestedtensor project is built on top of, and depends on, a fixed, recent PyTorch nightly.

Version | Python | CUDA      | Wheels
0.1.1   | 3.6    | CPU-only  | nestedtensor
0.1.1   | 3.7    | CPU-only  | nestedtensor
0.1.1   | 3.8    | CPU-only  | nestedtensor
0.1.1   | 3.6    | CUDA 10.2 | nestedtensor
0.1.1   | 3.7    | CUDA 10.2 | nestedtensor
0.1.1   | 3.8    | CUDA 10.2 | nestedtensor

When installing a binary, please specify the corresponding torch nightly link archive (the -f argument in the commands below) so that pip automatically pulls in the correct PyTorch nightly.

CPU

pip install https://download.pytorch.org/nestedtensor/whl/nightly/cpu/py3.7/nestedtensor-0.1.1_cpu-cp37-cp37m-linux_x86_64.whl -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html

CUDA 10.2

pip install https://download.pytorch.org/nestedtensor/whl/nightly/cu102/py3.7/nestedtensor-0.1.1_cu102-cp37-cp37m-linux_x86_64.whl -f https://download.pytorch.org/whl/nightly/cu102/torch_nightly.html

Why consider using this? / Dealing with dynamic shapes

In general we batch data for efficiency, but usually batched kernels need, or greatly benefit from, regular, statically-shaped data.

One way of dealing with dynamic shapes, then, is padding and masking. Various projects construct masks that, together with a data Tensor, are used as a representation for lists of dynamically shaped Tensors.

This is inefficient in both memory and compute if the Tensors within the list differ sufficiently in shape, because every entry is padded to the size of the largest one.
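To make the cost of padding and masking concrete, here is a minimal sketch in plain Python (no PyTorch required). The helper names pad_and_mask and wasted_fraction are hypothetical and not part of nestedtensor or PyTorch; lists of floats stand in for one-dimensional Tensors.

```python
from typing import List, Tuple


def pad_and_mask(
    seqs: List[List[float]], pad_value: float = 0.0
) -> Tuple[List[List[float]], List[List[bool]]]:
    """Pad variable-length sequences to a common length and build a validity mask."""
    max_len = max((len(s) for s in seqs), default=0)
    padded = [s + [pad_value] * (max_len - len(s)) for s in seqs]
    # mask[i][j] is True where padded[i][j] holds real data, False where it is padding.
    mask = [[j < len(s) for j in range(max_len)] for s in seqs]
    return padded, mask


def wasted_fraction(mask: List[List[bool]]) -> float:
    """Fraction of stored entries that are padding rather than data."""
    total = sum(len(row) for row in mask)
    real = sum(sum(row) for row in mask)
    return 0.0 if total == 0 else (total - real) / total


# Hypothetical input: three sequences of lengths 10, 5 and 9.
seqs = [[1.0] * 10, [1.0] * 5, [1.0] * 9]
padded, mask = pad_and_mask(seqs)
print(wasted_fraction(mask))  # 6 of 30 stored entries are padding
```

The more the lengths differ, the larger this fraction becomes, and every downstream operation has to carry the mask along with the data.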

You can also trace through a codebase where these masks are used and observe the kind of code this approach often leads to. See, for example, universal_sentence_embedding.

Otherwise we also have one-off operator support in PyTorch that aims to support dynamic shapes via extra arguments such as a padding index. Of course, while these functions are fast and sometimes memory efficient, they don't provide a consistent interface.

Other users simply gave up and started writing for-loops, or discovered that batching didn't help.

We want a single abstraction that is consistent, fast, memory efficient and readable, and the nestedtensor project aims to provide that.

How does nestedtensor help here?

NestedTensors are a generalization of torch Tensors which eases working with data of different shapes and lengths. In a nutshell, Tensors have scalar entries (e.g. floats) and NestedTensors have Tensor entries. However, note that a NestedTensor is still a Tensor. That means it needs to have a single dimension, single dtype, single device and single layout.

Tensor entry constraints:

  • Each Tensor constituent is of the dtype, layout and device of the containing NestedTensor.
  • The dimension of a constituent Tensor must be less than the dimension of the NestedTensor.
  • An empty NestedTensor is of dimension zero.
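The constraints above can be written down as a small validity check. This is only an illustrative sketch in plain Python, not the library's implementation: TensorMeta and check_constituents are hypothetical names, and dtype, device and layout are modeled as plain strings.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TensorMeta:
    """Hypothetical stand-in for the metadata of a Tensor or NestedTensor."""
    dim: int
    dtype: str
    device: str
    layout: str


def check_constituents(nested: TensorMeta, constituents: Sequence[TensorMeta]) -> None:
    """Raise ValueError if any of the listed constraints is violated."""
    if not constituents:
        # An empty NestedTensor is of dimension zero.
        if nested.dim != 0:
            raise ValueError("an empty NestedTensor must have dimension zero")
        return
    for i, t in enumerate(constituents):
        # Each constituent shares dtype, device and layout with its container.
        if (t.dtype, t.device, t.layout) != (nested.dtype, nested.device, nested.layout):
            raise ValueError(f"constituent {i} differs in dtype, device or layout")
        # A constituent's dimension must be less than the container's.
        if t.dim >= nested.dim:
            raise ValueError(f"constituent {i} has dim {t.dim}, not less than {nested.dim}")


# Hypothetical usage: a container of dimension 3 holding three 2-dimensional entries.
nested = TensorMeta(dim=3, dtype="float32", device="cpu", layout="strided")
entries = [TensorMeta(dim=2, dtype="float32", device="cpu", layout="strided")] * 3
check_constituents(nested, entries)  # passes silently
```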

Prototype classification

The nestedtensor package is a prototype intended for early-stage feedback and testing. It is on the road to a beta classification, but there is no definitive timeline yet. See PyTorch feature classification for what prototype, beta and stable mean.

Dependencies

  • pytorch (installed from nestedtensor/third_party/pytorch submodule)
  • torchvision (needed for examples and tests)
  • ipython (needed for examples)
  • notebook (needed for examples)

Contribution

The project is under active development. If you have a suggestion or have found a bug, please file an issue!
