
Comments (6)

nathanielsimard commented on May 15, 2024

I thought about the problem and came up with an even better potential solution!

  1. Each operation would receive owned input tensors, which would allow it to reuse the underlying data storage or buffer for better performance. However, operations would also need to handle shared data structures.
  2. Tensor would no longer implement Clone, and would instead expose a share method that creates a shared reference to the tensor data.
pub trait TensorOps<B> {
    ...
    fn share<const D: usize>(tensor: &mut B::TensorPrimitive<D>) -> B::TensorPrimitive<D>;
}

Backends can implement this with a simple clone if they want, but they can also change the datastore to a shared one.

struct MyTensorPrimitive {
   ...
   storage: MyTensorStorage,
}

enum MyTensorStorage {
    Owned(Storage),
    Shared(Arc<Storage>),
}

The share function implementation modifies the inner storage by moving it into an Arc, which makes it immutable. This gives backends more flexibility to reuse existing buffers without increasing the number of functions they need to implement.
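A minimal sketch of what such a share implementation might look like, assuming a backend whose raw buffer is a plain `Vec<f32>`. The `Storage` definition and the free-standing `share` function are illustrative assumptions; they are not burn's actual API, and the generic `B::TensorPrimitive<D>` signature is left out for brevity.

```rust
use std::sync::Arc;

/// Hypothetical raw buffer; a real backend would wrap its own storage type.
#[derive(Debug, Clone, PartialEq)]
pub struct Storage(pub Vec<f32>);

/// Storage is either uniquely owned (safe to mutate in place) or shared.
#[derive(Debug)]
pub enum MyTensorStorage {
    Owned(Storage),
    Shared(Arc<Storage>),
}

#[derive(Debug)]
pub struct MyTensorPrimitive {
    pub storage: MyTensorStorage,
}

/// Switches `tensor` to shared storage (if it is not shared already) and
/// returns a second handle to the same buffer. No element data is copied.
pub fn share(tensor: &mut MyTensorPrimitive) -> MyTensorPrimitive {
    // Take the storage out temporarily so an owned buffer can be moved
    // into an Arc. `Vec::new()` does not allocate, so the placeholder is cheap.
    let placeholder = MyTensorStorage::Owned(Storage(Vec::new()));
    let shared = match std::mem::replace(&mut tensor.storage, placeholder) {
        MyTensorStorage::Owned(storage) => Arc::new(storage),
        MyTensorStorage::Shared(arc) => arc,
    };
    tensor.storage = MyTensorStorage::Shared(Arc::clone(&shared));
    MyTensorPrimitive {
        storage: MyTensorStorage::Shared(shared),
    }
}
```

After the call, both handles hold the `Shared` variant and point at the same allocation, so an operation that receives either one knows it must not write into the buffer.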

I don't see any drawback to this solution. It does not increase the size of the API in the Backend trait or the Tensor struct, does not require graph analysis for performance improvement, and even allows for partial mutability in the API (the left-hand side Tensor may be shared, but the right-hand side may not, allowing for even more optimization opportunities). It also provides room for better documentation, as we can add custom documentation to the share method but not the Clone trait.
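To illustrate the partial mutability point, here is a hedged sketch of an element-wise addition that takes both inputs by value and reuses whichever buffer is uniquely owned. The `Storage` enum and the `add` function are assumptions made for this example and are simplified to a flat `Vec<f32>`; they do not reflect burn's real types.

```rust
use std::sync::Arc;

/// Hypothetical storage: uniquely owned or shared behind an Arc.
pub enum Storage {
    Owned(Vec<f32>),
    Shared(Arc<Vec<f32>>),
}

impl Storage {
    /// Read-only view of the elements, whichever variant is held.
    pub fn as_slice(&self) -> &[f32] {
        match self {
            Storage::Owned(v) => v.as_slice(),
            Storage::Shared(a) => a.as_slice(),
        }
    }
}

/// Adds `src` into `dst` element by element.
fn add_into(dst: &mut [f32], src: &[f32]) {
    for (x, y) in dst.iter_mut().zip(src) {
        *x += *y;
    }
}

/// Element-wise addition that takes ownership of both inputs.
/// If either input is uniquely owned, its buffer is overwritten and
/// returned; a new buffer is allocated only when both inputs are shared.
pub fn add(lhs: Storage, rhs: Storage) -> Storage {
    assert_eq!(lhs.as_slice().len(), rhs.as_slice().len(), "length mismatch");
    match (lhs, rhs) {
        (Storage::Owned(mut buf), rhs) => {
            add_into(&mut buf, rhs.as_slice());
            Storage::Owned(buf)
        }
        // Addition is commutative, so the right-hand buffer can be reused too.
        (lhs, Storage::Owned(mut buf)) => {
            add_into(&mut buf, lhs.as_slice());
            Storage::Owned(buf)
        }
        (lhs, rhs) => Storage::Owned(
            lhs.as_slice()
                .iter()
                .zip(rhs.as_slice())
                .map(|(a, b)| a + b)
                .collect(),
        ),
    }
}
```

The caller signals that a tensor is still needed simply by sharing it before passing it in; the operation needs no separate mutable variant. A fuller version could also try `Arc::try_unwrap` to reclaim a shared buffer whose other handles have already been dropped.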

from burn.

nathanielsimard commented on May 15, 2024

Thanks for the proposal. I'd like to highlight the pros and cons of having mutable operations.

Pros:

  • Potentially increase performance, particularly during inference rather than training. This is because tensors often need to be reused in the backward pass during training, which requires an immutable API or frequent cloning.

Cons:

  • Increase the size of the backend API
  • May increase the userland Tensor API
    • Degrade the developer experience by forcing users to choose between the mutable and immutable versions of an operation.

I have two potential solutions in mind:

  1. Allow backends to implement mutable operations (with default implementations provided). However, I would not include these mutable operations in the userland Tensor API. Instead, a lazy decorator backend could analyze the computational graph and use these mutable operations internally. One potential issue with this approach is that the decorator backend would need to handle dynamic partial graphs, which may make it difficult to know for certain that a tensor will never be used again.

  2. Another way to allow mutating tensors in the backend is to change the API so that each operation takes ownership of its input tensors. Each backend could then handle the reusability of tensor data in its clone implementation of the tensor primitive. This solution is simpler, but it's not clear how we could provide more information to backends to help them know when to share storage and when to reuse and modify it.
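As a rough illustration of the first option, the sketch below shows a mutable operation with a default implementation that falls back to the immutable one, so a backend only overrides it when it can do better. The trait name `Ops`, the method `add_inplace`, and both backend types are hypothetical and much simpler than burn's Backend trait; the lazy decorator and its graph analysis are not shown.

```rust
/// Hypothetical, simplified operations trait.
pub trait Ops {
    type Tensor;

    /// Immutable addition: every backend must provide this.
    fn add(lhs: &Self::Tensor, rhs: &Self::Tensor) -> Self::Tensor;

    /// Mutable addition with a default that falls back to `add`,
    /// so implementing it is optional.
    fn add_inplace(lhs: &mut Self::Tensor, rhs: &Self::Tensor) {
        *lhs = Self::add(lhs, rhs);
    }
}

/// Backend that relies on the default: a new buffer per operation.
pub struct Naive;

impl Ops for Naive {
    type Tensor = Vec<f32>;

    fn add(lhs: &Vec<f32>, rhs: &Vec<f32>) -> Vec<f32> {
        lhs.iter().zip(rhs).map(|(a, b)| a + b).collect()
    }
}

/// Backend that overrides the mutable variant to write into `lhs` directly.
pub struct InPlace;

impl Ops for InPlace {
    type Tensor = Vec<f32>;

    fn add(lhs: &Vec<f32>, rhs: &Vec<f32>) -> Vec<f32> {
        lhs.iter().zip(rhs).map(|(a, b)| a + b).collect()
    }

    fn add_inplace(lhs: &mut Vec<f32>, rhs: &Vec<f32>) {
        for (a, b) in lhs.iter_mut().zip(rhs) {
            *a += *b;
        }
    }
}
```

Both backends produce the same values; only `InPlace` keeps the original allocation. Under this option, deciding when the mutable variant is safe to call would be the decorator's job and would stay out of the userland API.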

Maybe both solutions could be combined in a way that simplifies the decorator backend's analysis of graphs, using explicit clone calls to provide lifetime information.


antimora commented on May 15, 2024

+1 on improving inference performance.


antimora commented on May 15, 2024

I came across the clone_from method, which could be memory efficient: https://doc.rust-lang.org/nightly/core/clone/trait.Clone.html#method.clone_from
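For context, a small sketch of why clone_from can save memory, using the standard library's `Vec<f32>` as a stand-in for a tensor buffer. The helper name `clone_from_keeps_allocation` is made up for this example; whether a backend's tensor type would benefit depends on how it implements `Clone::clone_from`.

```rust
/// Copies `src` into `dst` with `Clone::clone_from` and reports whether
/// `dst` still points at the heap allocation it had before the copy.
pub fn clone_from_keeps_allocation(dst: &mut Vec<f32>, src: &Vec<f32>) -> bool {
    let before = dst.as_ptr();
    // Unlike `*dst = src.clone()`, this lets `dst` reuse its existing
    // buffer when its capacity is already large enough.
    dst.clone_from(src);
    dst.as_ptr() == before
}
```

With a destination that already has capacity for the source's elements, the copy overwrites the existing buffer instead of allocating a new one and dropping the old one.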


antimora commented on May 15, 2024

@nathanielsimard You worked on this. Is this ticket complete?


nathanielsimard commented on May 15, 2024

Yes, it's completed.

