DataTensors are too slow. We need to rethink them.
I would like them to be used as modules (see the sketch after this list):
- start with contiguous()
- when a type cast is required, use nn.Copy()
- when a reshape is required, use nn.Reshape()
- when a transpose is required, use nn.Transpose()
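For instance, a 'bhwc' double tensor can be flattened to a 'bf' float tensor by composing these primitives. A minimal sketch, assuming hypothetical sizes and an nn.Sequential composition (this is not the actual DataTensor code):

```lua
require 'nn'

-- Hypothetical sizes: batch=8, height=4, width=4, channels=3.
local b, h, w, c = 8, 4, 4, 3
local input = torch.DoubleTensor(b, h, w, c):uniform()

-- Compose the primitives: cast double -> float, then flatten bhwc -> bf.
-- A 'bhwc' -> 'chwb' conversion would instead use nn.Transpose({1,4}).
local transform = nn.Sequential()
transform:add(nn.Copy('torch.DoubleTensor', 'torch.FloatTensor'))
transform:add(nn.Reshape(h*w*c, true)) -- true = treat first dim as batch

-- Start with contiguous(), as listed above.
local output = transform:forward(input:contiguous())
print(output:size()) -- 8 x 48, now a torch.FloatTensor
```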
A model requests a view and a tensor type.
A DataTensor has forward(view, [input|type]) and backward(view, [gradInput|type]) methods.
The view is a string specifying the view of the Space: 'bf', 'bhwc', 'chwb', 'bhf', etc., where each letter names an axis (b = batch, f = feature, c = channel, h = height, w = width).
When a gradInput/input tensor is provided, it is stored in the cache under the key (view, type).
When backward/forward is called without a gradInput/input tensor, a tensor of the requested view and type is returned.
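Put together, the intended call pattern might look like the following; the DataTensor constructor and the tensor names are assumptions, not the actual API:

```lua
-- Hypothetical usage of the API described above.
local dt = DataTensor()                              -- hypothetical constructor
dt:forward('bhwc', images)                           -- tensor provided: cached under ('bhwc', its type)
local flat = dt:forward('bf', 'torch.FloatTensor')   -- no tensor: converted, then served from cache
dt:backward('bf', gradFlat)                          -- gradient provided in the 'bf' view
local grad = dt:backward('bhwc', 'torch.DoubleTensor')
```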
Internally, a tensor must be efficiently converted between views.
The root view is the one provided when backward/forward is called with a tensor. That tensor is automatically made contiguous (doing this with parallel or the like can yield a speedup), stored at self.input/self.gradInput, and inserted into the tensor_cache under its (view, type) key.
When backward/forward is called without a tensor (i.e. with a type), we look up the (view, type) pair in the tensor_cache. If found, the cached tensor is returned. Otherwise, we look up the pair in the module_cache; if found, we forward/backward through that module. Otherwise, we build a module that transforms the root view into the requested view (and memoize it in the module_cache).
After each doneBatch, the tensor_cache must be emptied; the sketch below covers both the lookup order and this per-batch flush.
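A minimal sketch of forward under these rules, assuming a plain Lua class with _tensor_cache/_module_cache tables; buildTransform is a hypothetical stand-in for the view-conversion builder (here it only handles type casts):

```lua
require 'nn'

local DataTensor = {}
DataTensor.__index = DataTensor

function DataTensor.new()
   return setmetatable({_tensor_cache = {}, _module_cache = {}}, DataTensor)
end

-- Hypothetical builder; a real one would compose nn.Copy/nn.Reshape/nn.Transpose.
function DataTensor:buildTransform(fromView, toView, toType)
   assert(fromView == toView, 'view conversions not sketched here')
   return nn.Copy(torch.type(self._input), toType)
end

function DataTensor:forward(view, inputOrType)
   if torch.type(inputOrType) ~= 'string' then
      -- Called with a tensor: it defines the root view for this batch.
      self._rootView = view
      self._input = inputOrType:contiguous()
      self._tensor_cache = {[view..'/'..torch.type(self._input)] = self._input}
      return self._input
   end
   local key = view..'/'..inputOrType
   local tensor = self._tensor_cache[key]
   if tensor then return tensor end                 -- 1. cached tensor
   local module = self._module_cache[key]
   if not module then
      module = self:buildTransform(self._rootView, view, inputOrType)
      self._module_cache[key] = module              -- 2. build and memoize module
   end
   tensor = module:forward(self._input)             -- 3. convert root -> requested
   self._tensor_cache[key] = tensor
   return tensor
end

function DataTensor:doneBatch()
   self._tensor_cache = {}   -- cached tensors are stale after each batch
end
```

backward would presumably mirror forward, with gradInput in place of input.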
Like a module, the DataTensor is constructed once as the output of a Model or DataSet.