neurodatawithoutborders / matnwb
A Matlab interface for reading and writing NWB files
License: BSD 2-Clause "Simplified" License
It looks like the way to read data is to call DataStub.load, e.g. file.acquisition.get('lfp').data.load, which reads the entire matrix. Is it possible to read only a section of the data, as with h5read?
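For comparison, partial reads are cheap at the HDF5 level; a minimal h5py sketch (the file name and dataset path are made up for the example):

```python
# Sketch: hyperslab (partial) reads with h5py. Only the requested rows
# are fetched from disk; the rest of the dataset is never loaded.
import h5py
import numpy as np

# build a small example file so the snippet is self-contained
with h5py.File('example.nwb', 'w') as f:
    f.create_dataset('/acquisition/lfp/data',
                     data=np.arange(100).reshape(10, 10))

with h5py.File('example.nwb', 'r') as f:
    dset = f['/acquisition/lfp/data']
    section = dset[2:5, :]  # reads only rows 2-4
print(section.shape)  # (3, 10)
```

MATLAB's h5read offers the same capability through its start/count arguments, which is presumably what a sectioned DataStub.load would wrap.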
File generated with Python:
from pynwb import NWBFile, NWBHDF5IO
from datetime import datetime

nwbfile = NWBFile('source', ' ', ' ',
                  datetime.now(), datetime.now(),
                  institution='University of California, San Francisco',
                  lab='Chang Lab')
nwbfile.add_trial({'start': 0.0, 'end': 1.0})
with NWBHDF5IO('test_trials.nwb', 'w') as io:
    io.write(nwbfile)
generates a file you can download here
nwb = nwbRead('test_trials.nwb')
gives the following error:
Error using types.util.checkConstraint (line 15)
Property `tablecolumn.end` should be one of type(s) { 'TableColumn' }.
Error in types.util.checkSet>@(nm,val)types.util.checkConstraint(pname,nm,namedprops,constraints,val) (line 10)
@(nm, val)types.util.checkConstraint(pname, nm, namedprops, constraints, val));
Error in types.untyped.Set/validateAll (line 51)
obj.fcn(mk, obj.map(mk));
Error in types.util.checkSet (line 11)
val.validateAll();
Error in types.core.DynamicTable/validate_tablecolumn (line 61)
types.util.checkSet('tablecolumn', struct(), constrained, val);
Error in types.core.DynamicTable/set.tablecolumn (line 46)
obj.tablecolumn = obj.validate_tablecolumn(val);
Error in types.core.DynamicTable (line 39)
obj.tablecolumn = types.util.parseConstrained('types.core.NWBDataInterface', 'types.core.TableColumn', varargin{:});
Error in io.parseGroup (line 68)
parsed = eval([typename '(kwargs{:})']);
Error in io.parseGroup (line 26)
subg = io.parseGroup(filename, g_info);
Error in nwbRead (line 20)
nwb = io.parseGroup(filename, info);
Without the line nwbfile.add_trial({'start': 0.0, 'end': 1.0}) in the Python script, the MATLAB loading works as expected.
We need to make sure shape/dimension names are in the right order when you query them from MATLAB.
Now that we have the tests (99%) working, are there good options for setting up continuous integration? It would be great if we could have these tests run automatically on PRs so we can see if the merge would break any tests. Here's a MATLAB guide on setting it up using Jenkins. Is there a way of doing this for free?
What's going on??
It'd be really helpful to those like me trying to read and understand how the package works to have (MATLAB-style) documentation for each class and function. Particularly given the common use of abbreviations for variable and function names, and the lack of comments, it's quite hard to follow the logic!
It is much less verbose to create structs in MATLAB than StructMaps, for example:
>> s.datasets.d1 = [1 2 3];
vs.
>> s = util.StructMap('datasets', util.StructMap('d1', [1 2 3]));
But if you pass the former into the Group constructor, it will accept it and fail later when you do a subsref or subsasgn:
>> s.datasets.d1 = [1 2 3];
>> g = types.untyped.Group(s);
>> g.d1
Reference to non-existent field 'map'.
Error in types.untyped.Group/findsubprop (line 124)
if ~isempty(obj.(pn)) && isKey(obj.(pn).map, nm)
Error in types.untyped.Group/subsref (line 46)
pn = findsubprop(obj, mainsub);
A good practice might be, rather than attempting to account for struct vs StructMap throughout the code, convert all structs to StructMaps when they are received from the user. Then the rest of the code can assume everything is a StructMap.
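That boundary-conversion pattern can be sketched in Python terms (StructMap here is a minimal stand-in, not the real util.StructMap):

```python
# Sketch of "convert at the boundary": recursively turn plain dicts
# (the analogue of MATLAB structs) into the internal map type once,
# so downstream code only ever sees one representation.
class StructMap:
    """Minimal stand-in for util.StructMap."""
    def __init__(self, **fields):
        self.map = {k: to_structmap(v) for k, v in fields.items()}

def to_structmap(value):
    # accept either representation from the user; normalize dicts
    if isinstance(value, dict):
        return StructMap(**value)
    return value  # already a StructMap or a plain value

s = to_structmap({'datasets': {'d1': [1, 2, 3]}})
print(s.map['datasets'].map['d1'])  # [1, 2, 3]
```

After the boundary call, every code path can assume the StructMap invariant instead of branching on struct vs. StructMap throughout.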
generateCore('schema/core/nwb.namespace.yaml')
error:
The class file.Dataset has no Constant property or Static method named 'procdims'.
Error in file.procdims (line 15)
[subsz, subnm] = file.Dataset.procdims(dimopt, shapeopt);
Error in file.Dataset (line 74)
[obj.shape, obj.dimnames] = file.procdims(dims, shape);
Error in file.Group (line 105)
ds = file.Dataset(datasetiter.next());
Error in file.fillClass>processClass (line 120)
class = file.Group(node);
Error in file.fillClass (line 6)
[processed, classprops, inherited] = processClass(name, namespace, pregen);
Error in file.writeNamespace (line 13)
fwrite(fid, file.fillClass(k, namespace, pregenerated), 'char');
Error in generateCore (line 30)
file.writeNamespace(namespaceMap('core'));
The schema has a new table called Units which cannot be read by matnwb.
nwbRead('test_units.nwb')
Error using types.util.checkDtype (line 58)
Property `id` must be a types.core.ElementIdentifiers.
Error in types.core.DynamicTable/validate_id (line 56)
val = types.util.checkDtype('id', 'types.core.ElementIdentifiers', val);
Error in types.core.DynamicTable/set.id (line 42)
obj.id = obj.validate_id(val);
Error in types.core.DynamicTable (line 37)
obj.id = p.Results.id;
Error in io.parseGroup (line 68)
parsed = eval([typename '(kwargs{:})']);
Error in io.parseGroup (line 26)
subg = io.parseGroup(filename, g_info);
Error in nwbRead (line 20)
nwb = io.parseGroup(filename, info);
The generated (written by writeClass) class constructor currently ignores unknown key-value pairs. It should probably error (or at least warn) to alert the user:
>> ts = types.TimeSeries('unknown_key', 2); % unknown_key is simply ignored
It is very easy to misspell a key and think you set the value when you did not:
>> ts = types.TimeSeries('decription', 2); % meant to set description but there is a typo
Filtered test:
>> nwbtest('ProcedureName', 'testWriteClassConstructorThrowsOnUnknownArgument')
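For illustration, the requested strictness can be sketched outside MATLAB; this Python stand-in (make_timeseries and KNOWN_KEYS are invented names, not matnwb's API) rejects unknown keys instead of silently ignoring them:

```python
# Sketch: a constructor that fails loudly on unknown keyword arguments.
# KNOWN_KEYS is illustrative, not the real schema-derived key set.
KNOWN_KEYS = {'data', 'description', 'unit'}

def make_timeseries(**kwargs):
    unknown = set(kwargs) - KNOWN_KEYS
    if unknown:
        raise TypeError('unknown argument(s): %s' % ', '.join(sorted(unknown)))
    return kwargs

make_timeseries(description='ok')        # accepted
try:
    make_timeseries(decription='typo!')  # misspelled key now fails loudly
except TypeError as e:
    print(e)  # unknown argument(s): decription
```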
Is there any way to use matnwb to tell HDF5 to apply compression to a dataset?
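For reference, compression is a dataset-creation property at the HDF5 level, so whatever matnwb exposes would ultimately reduce to this; a minimal h5py sketch (file name is illustrative):

```python
# Sketch: requesting gzip compression at dataset-creation time with h5py.
# h5py enables chunked storage automatically when compression is requested.
import h5py
import numpy as np

with h5py.File('compressed_example.h5', 'w') as f:
    f.create_dataset('data', data=np.zeros((1000, 8)),
                     compression='gzip', compression_opts=4)

with h5py.File('compressed_example.h5', 'r') as f:
    comp = f['data'].compression
print(comp)  # gzip
```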
Breaking it down, the following will be needed for implementation:
DataStub
work for this?

The writeClass and parseClass functions are long and contain a lot of functionality, which makes targeted unit testing more difficult. There are private functions in these files that make the code more readable, but these functions cannot be called directly for testing. We might want to consider breaking these functions apart into smaller function files (writeDataset, writeAttribute, writeLink, etc.)
The main example being the trials group under NWBFile:
- doc: 'Data about experimental trials'
  name: trials
  neurodata_type_inc: DynamicTable
  quantity: '?'
  datasets:
  - doc: The start time of each trial
    attributes:
    - name: description
      value: the start time of each trial
      dtype: text
      doc: Value is 'the start time of each trial'
    name: start
    neurodata_type_inc: TableColumn
    dtype: float
  - doc: The end time of each trial
    attributes:
    - name: description
      value: the end time of each trial
      dtype: text
      doc: Value is 'the end time of each trial'
    name: end
    neurodata_type_inc: TableColumn
    dtype: float
Expected behavior should be that NWBFile checks for the class along with any additions to the class itself. At the moment, only the class validation occurs. This isn't necessarily breaking, but validation is weaker, and it allows users to break from the schema because matnwb never properly checks for these context-dependent values on export.
In the code
dev = types.core.Device( ...
    'source', ns_path);
file.general_devices.set('dev1', dev);
eg = types.core.ElectrodeGroup('source', ns_path, ...
    'description', 'a test ElectrodeGroup', ...
    'location', 'unknown', ...
    'device', types.untyped.SoftLink('/general/devices/dev1'));
file.general_extracellular_ephys.set('elec1', eg);
I don't like that you have to know the path of dev1 in order to link it. I think that makes it hard for new users that might not know the hierarchical structure of nwb-schema. How would you feel about having file.general_devices.set optionally output the SoftLink object if an output is requested?
So the new syntax would be
dev = types.core.Device( ...
    'source', ns_path);
dev_link = file.general_devices.set('dev1', dev);
eg = types.core.ElectrodeGroup('source', ns_path, ...
    'description', 'a test ElectrodeGroup', ...
    'location', 'unknown', ...
    'device', dev_link);
file.general_extracellular_ephys.set('elec1', eg);
I made this for myself. Any interest in adding it to this project, or should I make this type of stuff in a separate repo? I think this would work nicely as a method of TimeSeries (e.g. LFP.load), but I'm not sure the best way to do that since it's generated code.
function data = loadTimeSeriesData(timeseries, interval, downsample_factor, electrode)
%LOADTIMESERIESDATA loads data within a time interval from a timeseries
%
%   DATA = loadTimeSeriesData(TIMESERIES, INTERVAL, DOWNSAMPLE_FACTOR, ELECTRODE)
%   TIMESERIES: matnwb TimeSeries object
%   INTERVAL: [start end] in seconds
%   DOWNSAMPLE_FACTOR: default = 1
%   ELECTRODE: default = [] (all electrodes). Takes a 1-indexed integer
%   (NOT AN ARRAY).
%
%   Works whether timestamps or starting_time & rate are stored. Assumes
%   timestamps are sorted in ascending order.

if ~exist('interval', 'var')
    interval = [0 Inf];
end
if ~exist('downsample_factor', 'var') || isempty(downsample_factor)
    downsample_factor = 1;
end
if ~exist('electrode', 'var')
    electrode = [];
end

dims = timeseries.data.dims;

if interval(1)
    if isempty(timeseries.starting_time)
        start_ind = fastsearch(timeseries.timestamps, interval(1), 1);
    else
        fs = timeseries.starting_time_rate;
        t0 = timeseries.starting_time;
        if interval(1) < t0
            error('interval bounds outside of time range');
        end
        % convert the time offset to a 1-based integer sample index
        start_ind = floor((interval(1) - t0) * fs) + 1;
    end
else
    start_ind = 1;
end

if isfinite(interval(2))
    if isempty(timeseries.starting_time)
        end_ind = fastsearch(timeseries.timestamps, interval(2), -1);
    else
        fs = timeseries.starting_time_rate;
        t0 = timeseries.starting_time;
        if interval(2) > (t0 + dims(1) / fs) % duration is nsamples/rate
            error('interval bounds outside of time range');
        end
        end_ind = round((interval(2) - t0) * fs);
    end
else
    end_ind = Inf;
end

start = ones(1, length(dims));
start(end) = start_ind;
count = fliplr(dims);
count(end) = floor((end_ind - start_ind) / downsample_factor);
if ~isempty(electrode)
    start(end-1) = electrode;
    count(end-1) = 1;
end

if downsample_factor == 1
    data = timeseries.data.load(start, count)';
else
    stride = ones(1, length(dims));
    stride(end) = downsample_factor;
    data = timeseries.data.load(start, count, stride)';
end
Why is cellstr used for text dtypes? It seems like a string would be more natural.
For example, why do I have to do this:
>> ts = types.TimeSeries;
>> ts.source = {'hello'}
Instead of this:
>> ts.source = 'hello'
For OS compatibility's sake.
Attempting to get the value of a field in a StructMap array produces an error:
Error in util.StructMap/subsref (line 37)
o = obj.map(subv);
Failing test:
>> nwbtest('ProcedureName', 'testStructMapArrayDotSubsref')
I'm trying to access linked data, e.g. lfp.electrodes.data, which gives:
RegionView with properties:
path: '/general/extracellular_ephys/electrodes'
view: [1×1 types.untyped.ObjectView]
region: {[1 338]}
type: 'H5T_STD_REF_DSETREG'
reftype: 'H5R_DATASET_REGION'
but I don't see any methods to actually follow that link to the data. Is there a convenient way to do this?
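For what it's worth, the underlying H5R machinery can be driven directly; a hedged h5py sketch of creating and then following a region reference (file and dataset names are illustrative, not from a real NWB file):

```python
# Sketch: dereferencing an HDF5 region reference with h5py.
# f[ref] yields the dataset the reference points at;
# indexing that dataset with the same ref applies the region selection.
import h5py
import numpy as np

with h5py.File('regionref_example.h5', 'w') as f:
    dset = f.create_dataset('electrodes', data=np.arange(10))
    ref = dset.regionref[1:4]  # a region covering elements 1..3
    target = f[ref]            # dereference: the dataset behind the ref
    data = target[ref]         # apply the region selection to get values
print(data)  # [1 2 3]
```

A convenience method on matnwb's RegionView would presumably do the same two steps: resolve the path, then read only the referenced region.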
generateCore('schema/core/nwb.namespace.yaml');
Error using save
Cannot create 'core.mat' because 'namespaces' does not exist.
Error in generateCore (line 23)
save(fullfile('namespaces','core.mat'), '-struct', 'cs');
You can fix this by creating a namespaces directory.
Assigning IDs to warnings makes it possible to suppress them without suppressing all warnings. Assigning IDs to errors makes it easier to distinguish between errors thrown by the code. Not having error IDs, I cannot easily test, for instance, if a particular action causes a particular error.
Some of the warnings and errors in the code already have something that looks like an ID, but it's in the message:
error('Group:subsasgn: please specify whether this numeric value is in ''attributes'' or ''datasets''')
These should be pulled out of the message and assigned to the message ID:
error('Group:subsasgn', 'Please specify whether this numeric value is in ''attributes'' or ''datasets''')
Or a warning/error ID naming strategy should be defined for the project.
Datasets are transposed after being written to disk by nwbExport and read back by nwbRead:
Actual Value:
0 1 2 3 4 5 6 7 8 9
Expected Value:
0
1
2
3
4
5
6
7
8
9
I believe this is because nwbExport transposes the matrices but nwbRead does not transpose them back.
Failing test:
>> nwbtest('Name', 'tests.system.TimeSeriesIOTest/testRoundTrip')
The spec says the Attribute.value key should be assigned a constant value for an attribute (see http://schema-language.readthedocs.io/en/latest/specification_language_description.html#value ). However, it is mapped as only a default value for a property that can be changed.
Failing test:
>> nwbtest('ProcedureName', 'testWriteAttributeWithValue')
This doesn't break IO, but it will allow table columns that do not match the schema.
All Groups created share the same StructMaps for property values. This causes a lot of strange behavior, for example:
>> g1 = types.untyped.Group();
>> g1.g2 = types.untyped.Group();
>> g3 = types.untyped.Group()
g3 =
Group with properties:
attributes: {}
datasets: {}
links: {}
groups: {'g2'}
classes: {}
See: https://www.mathworks.com/help/matlab/matlab_oop/properties-containing-objects.html. Properties assigned object default values should be assigned in the constructor (instead of in the property block).
Failing tests:
>> nwbtest('ProcedureName', 'testGroupConstructorDoesNotCarryState')
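The underlying pitfall, a default object created once and shared by every instance, has a well-known Python analogue, which may make the fix clearer:

```python
# Python analogue of the shared-default pitfall: the default list is
# built once at definition time and shared by all instances.
class Group:
    def __init__(self, groups=[]):  # BUG: one list shared by every Group
        self.groups = groups

g1 = Group()
g1.groups.append('g2')
g3 = Group()
print(g3.groups)  # ['g2'] -- state leaked across instances

# Fix: create the default inside the constructor, which is exactly what
# the MathWorks doc recommends for MATLAB handle-object properties.
class FixedGroup:
    def __init__(self, groups=None):
        self.groups = [] if groups is None else groups
```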
When a dataset has a text dtype and a multi-dimensional shape with a null dimension, the validation function will error when the generated class is instantiated.
For example, ElectrodeGroup has a channel_coordinates dataset with the following spec:
name: channel_coordinates
shape:
- null
- 3
Trying to instantiate the generated class:
>> types.ElectrodeGroup
Error using types.ElectrodeGroup/validate_channel_coordinates (line 147)
ElectrodeGroup.channel_coordinates: val must have shape [~,3]
Error in types.ElectrodeGroup/set.channel_coordinates (line 62)
obj.channel_coordinates = validate_channel_coordinates(obj, val);
Error in types.ElectrodeGroup (line 54)
obj.(field) = p.Results.(field);
Failing test:
>> nwbtest('ProcedureName', 'testWriteDatasetWithNullShape')
PyNWB only writes datasets to HDF5 if they have been set. MatNWB writes all datasets to disk even if they have not been assigned/set by the user or a default value. I'm not sure if this discrepancy will cause issues in the future but it is inefficient to create these empty datasets and it means you have a bunch of empty datasets crowding up the HDF5 file (not great when viewed in HDFView for instance).
Tab completion is an important part of discovery (for me at least). However, with all the subsref and properties trickiness, tab completion no longer works when you're navigating deeply within a hierarchy:
>> f = nwbfile;
>> ts = types.TimeSeries;
>> f.acquisition.timeseries = types.untyped.Group;
>> f.acquisition.timeseries.test_timeseries = ts;
>> f.acquisition.timeseries. % Tab completion says "No Completions Found" here
It would be great if this worked. It may not be possible but maybe someone can investigate it.
There's a new feature in pynwb that caches the specs used to generate the file within the file itself. Here's an example:
from pynwb import NWBFile, NWBHDF5IO
from datetime import datetime
nwbfile = NWBFile('source', ' ', ' ',
                  datetime.now(), datetime.now(),
                  institution='University of California, San Francisco',
                  lab='Chang Lab')

with NWBHDF5IO('test_cache.nwb', 'w') as io:
    io.write(nwbfile, cache_spec=True)
Here is the produced file.
If only the core is used, it caches the text of core.namespace.yaml. If extensions are used, it also stores those. This allows users to automatically load data even if the data is defined by an external spec, and allows the extensions to be passed with the data without worrying about versions of extensions, etc.
Would it be possible to read these specs and automatically generate the defined classes?
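Assuming the cached spec lands under a /specifications group (an assumption about pynwb's layout, not verified here), reading it back is a plain HDF5 walk; a sketch using a stand-in file:

```python
# Sketch: recover cached spec text from an NWB file. The file and the
# 'specifications/...' layout are stand-ins built here for illustration.
import h5py

with h5py.File('test_cache_demo.nwb', 'w') as f:
    f.create_dataset('specifications/core/2.0/namespace',
                     data='namespaces: ...')  # stand-in for the YAML text

with h5py.File('test_cache_demo.nwb', 'r') as f:
    specs = []
    f['specifications'].visititems(
        lambda name, obj: specs.append(name)
        if isinstance(obj, h5py.Dataset) else None)
print(specs)  # ['core/2.0/namespace']
```

Each such dataset holds a namespace/spec document, which is exactly what generateCore/generateExtension consume, so feeding them to the generator directly seems feasible.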
Here's an example with an extension:
from pynwb.spec import NWBDatasetSpec, NWBNamespaceBuilder, NWBGroupSpec

name = 'test'
ns_path = name + '.namespace.yaml'
ext_source = name + '.extensions.yaml'

test_obj = NWBGroupSpec(
    neurodata_type_def='TestObj',
    doc='test object',
    datasets=[NWBDatasetSpec(name='filler_data', doc='data doc', dtype='int',
                             shape=(None, 1))],
    neurodata_type_inc='NWBDataInterface')

ns_builder = NWBNamespaceBuilder(name + ' extensions', name)
ns_builder.add_spec(ext_source, test_obj)
ns_builder.export(ns_path)
from datetime import datetime
from pynwb import load_namespaces, register_class, NWBFile, NWBHDF5IO
from pynwb.core import NWBDataInterface
from pynwb.form.utils import docval, popargs
from collections import Iterable

# load custom classes
load_namespaces(ns_path)

@register_class('TestObj', 'test')
class TestObj(NWBDataInterface):
    __nwbfields__ = ('filler_data',)

    @docval({'name': 'name', 'type': str, 'doc': 'name'},
            {'name': 'source', 'type': str, 'doc': 'source?'},
            {'name': 'filler_data', 'type': Iterable, 'doc': 'data'})
    def __init__(self, **kwargs):
        name, source, filler_data = popargs('name', 'source', 'filler_data',
                                            kwargs)
        super(TestObj, self).__init__(name, source, **kwargs)
        self.filler_data = filler_data

nwbfile = NWBFile(source='0',
                  session_description='0',
                  identifier='0',
                  session_start_time=datetime(1900, 1, 1))
nwbfile.add_acquisition(TestObj(name='name', source='source', filler_data=[1]))

with NWBHDF5IO('test_extension.nwb', 'w') as io:
    io.write(nwbfile, cache_spec=True)
Here is the generated file.
The schema files have currently been copied and pasted into the MatNWB repository. A better approach may be to use a submodule that points to the schema repo. I can think of some pros/cons for both approaches, so this change is not a definite win but something to think about. Submodule pros: avoids copy/paste mistakes, makes the source of the schema files clear (as well as git rev). Cons: confusing for new users cloning the repo (need to use --recursive flag to get submodules)
Hi matnwb experts,
sorry if this is a naive question.
I tried importing a NWB dataset to matlab for the first time. I ran the command "generateCore" and received an error (pasted below). I also tried to run it using the nwb.namespace.yaml from pynwb.
Am I missing a Schema function?
I ran it on Matlab R2017b and using matnwb code updated 6/6/18 from github.
Thank you,
Stephan
generateCore('/pathto/matnwb/schema/core/nwb.namespace.yaml');
Warning: Invalid file or directory '/pathto/pynwb/src/pynwb/jar/schema.jar'.
In javaclasspath>local_validate_dynamic_path (line 271)
In javaclasspath>local_javapath (line 187)
In javaclasspath (line 124)
In javaaddpath (line 71)
In util.generateSchema (line 2)
In generateCore (line 22)
Undefined function or variable 'Schema'.
Error in yaml.getSourceInfo (line 2)
schema = Schema();
Error in util.generateSchema (line 5)
schema = yaml.getSourceInfo(localpath, filenames{:});
Error in generateCore (line 22)
cs = util.generateSchema(core);
Attempting to explicitly assign a value to a "true" property of a group throws an error:
>> g1 = types.untyped.Group();
>> g2 = types.untyped.Group();
>> g1.groups.g2 = g2
Error using containers.Map/subsref
The specified key is not present in this container.
Error in util.StructMap/subsasgn (line 59)
tempobj = obj.map(subv);
Error in types.untyped.Group/subsasgn (line 77)
obj = builtin('subsasgn', obj, s, r);
I believe this is important because if a Group cannot distinguish between an attribute and a dataset, you need a way to explicitly tell it which to use.
>> g1.a1 = [1 2 3]
Error using types.untyped.Group/subsasgn (line 69)
Group:subsasgn: please specify whether this numeric value is in 'attributes' or 'datasets'
Failing test:
>> nwbtest('ProcedureName', 'testGroupDotSubsasgnGroup')
Would it be possible to make it so the variable nwbfile.nwb_version is auto-populated?
I am having trouble reading the UnitTimes datatype. Specifically, when I try to import data with UnitTimes, I get the following error:
>> nwb=nwbRead('/Users/bendichter/Desktop/Buzsaki/SenzaiBuzsaki2017/test.nwb');
Error using hdf5lib2
Incorrect number of uint8 values passed in reference parameter.
Error in H5R.get_name (line 34)
name = H5ML.hdf5lib2('H5Rget_name', loc_id, ref_type, ref, useUtf8);
Error in io.parseReference (line 7)
target = H5R.get_name(did, reftype, data);
Error in io.parseDataset (line 24)
data = io.parseReference(did, tid, H5D.read(did));
Error in io.parseGroup (line 14)
ds = io.parseDataset(filename, ds_info, fp);
Error in io.parseGroup (line 26)
subg = io.parseGroup(filename, g_info);
Error in io.parseGroup (line 26)
subg = io.parseGroup(filename, g_info);
Error in io.parseGroup (line 26)
subg = io.parseGroup(filename, g_info);
Error in nwbRead (line 20)
nwb = io.parseGroup(filename, info);
I have tracked this down to an error building the spike_times_index component of UnitTimes, which is a list of range references. This is data written using pynwb. Could this be due to the switch of dimensions mentioned in the README? I'll try to replicate the error using a file written in matnwb as well.
When a dataset is inherited with dtype "any" and the inheriting type redefines it as something more specific, like "text", the generated class errors on instantiation: the default value is assigned per the dtype of the superclass dataset, but the validation function is based on the dtype of the dataset in the subclass.
For example, AnnotationSeries inherits from TimeSeries. TimeSeries.data has a dtype of "any". AnnotationSeries.data has a dtype of "text". The generated types.AnnotationSeries fails during construction:
>> types.AnnotationSeries
Error using types.AnnotationSeries/validate_data (line 64)
AnnotationSeries: data must be a cell string
Error in types.TimeSeries/set.data (line 99)
obj.data = validate_data(obj, val);
Error in types.TimeSeries (line 71)
obj.(field) = p.Results.(field);
Error in types.AnnotationSeries (line 32)
obj = obj@types.TimeSeries(varargin{:});
Failing test:
>> nwbtest('ProcedureName', 'testSmokeInstantiateCore')
Since names are used as field names of objects, matnwb does not allow them to have spaces. pynwb does allow spaces. Maybe a compromise would be to automatically replace all spaces with underscores in matnwb.
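The proposed compromise could be sketched as follows (a Python stand-in; to_field_name is an invented helper, not matnwb code):

```python
# Sketch: map NWB object names to valid field names by replacing
# spaces with underscores, erroring when no valid name can be formed.
def to_field_name(name):
    field = name.replace(' ', '_')
    if not field.replace('_', '').isalnum():
        raise ValueError('cannot make a valid field name from %r' % name)
    return field

print(to_field_name('my unit times'))  # my_unit_times
```

The reverse mapping (underscores back to spaces) would be ambiguous, so the original names would still need to be stored somewhere for round-tripping with pynwb.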
>> generateExtension('/Users/bendichter/dev/to_nwb/to_nwb/extensions/ecog/ecog.namespace.yaml');
Out of memory. The likely cause is an infinite recursion
within the program.
files are here: https://github.com/bendichter/to_nwb/tree/master/to_nwb/extensions/ecog
file = nwbfile( ...
'source', 'sdf', ...
'session_description', 'a test NWB File', ...
'identifier', 'sdf', ...
'session_start_time', datestr(now, 'yyyy-mm-dd HH:MM:SS'), ...
'file_create_date', datestr(now, 'yyyy-mm-dd HH:MM:SS'));
ts = types.core.TimeSeries('source', 'source',...
'starting_time',0,...
'starting_time_rate',3,...
'data',[1,2,3],...
'data_units','V?');
file.acquisition.set('test', ts);
nwb_path = 'test.nwb';
nwbExport(file, nwb_path)
nwb_read = nwbRead(nwb_path);
Undefined function 'keys' for input arguments of type 'types.untyped.DataStub'.
Error in io.parseGroup>elide (line 77)
ekeys = keys(set);
Error in io.parseGroup (line 58)
props(etp) = elide(etp, props);
Error in io.parseGroup (line 26)
subg = io.parseGroup(filename, g_info);
Error in io.parseGroup (line 26)
subg = io.parseGroup(filename, g_info);
Error in nwbRead (line 20)
nwb = io.parseGroup(filename, info);
Error in TDT2NWB (line 146)
nwb_read = nwbRead(nwb_path);
Interestingly, this is not an issue for derived classes like ElectricalSeries. I may be entering something wrong here, but it would be better if errors occurred on write rather than on read.
I updated the syntax on the python tests and now most of the errors are minor data type equality errors, e.g. int32 -> int64.
Verification failed in tests.system.TimeSeriesIOTest/testInFromPyNWB.
----------------
Test Diagnostic:
----------------
Values for property 'data' are not equal
---------------------
Framework Diagnostic:
---------------------
verifyEqual failed.
--> Classes do not match.
Actual Class:
int64
Expected Class:
int32
Actual Value:
10×1 int64 column vector
100
110
120
130
140
150
160
170
180
190
Expected Value:
10×1 int32 column vector
100
110
120
130
140
150
160
170
180
190
I would expect NWBFile.nwb_version to be assigned by MatNWB and not the user. Right now this is left blank unless the user specifies it. PyNWB assigns this version.
Failing test:
>> nwbtest('Name', 'tests.system.NWBFileIOTest/testInFromPyNWB')
The following pull requests on the nwb-schema (and PyNWB repo respectively) added new neurodata_types to the schema to improve support for trial data.
NeurodataWithoutBorders/pynwb#536
NeurodataWithoutBorders/nwb-schema#173
In particular, the DynamicTable and TableColumn types are new. DynamicTable is essentially a column-based table consisting of an arbitrary number of TableColumns. In this way users can add arbitrary metadata columns for trial data without having to write extensions. It also adds a new group, trials, for storing trial data.
There are no new features in terms of the schema language, i.e., these are changes to only the schema itself. It would be great if we could catch MatNWB up to the schema, as support for trial data was one of the central needs identified during the hackathons.
@nclack @ln-vidrio
ElectricalSeries.data is defined by the spec as:
name: data
shape:
- - null
- - null
  - null
Attempting to assign a 2d matrix to data on the generated class causes an error:
>> es = types.ElectricalSeries;
>> es.data = ones(2)
Error using types.ElectricalSeries/validate_data (line 65)
ElectricalSeries.data: val must have [1,2] dimensions
Error in types.TimeSeries/set.data (line 99)
obj.data = validate_data(obj, val);
Failing test:
>> nwbtest('ProcedureName', 'testWriteDatasetWithMultipleNullShapes')
I am trying to load a file with a large LFP data block. The shape is saved as 50461375x80 and the type is int16. When I try to run nwbRead I run into several problems. The error I receive is:
Error using double
Requested 4036910000x1 (30.1GB) array exceeds maximum array
size preference. Creation of arrays greater than this limit
may take a long time and cause MATLAB to become unresponsive.
See array size limit or preference panel for more
information.
Error in types.util.checkDtype (line 40)
val = eval([type '(val)']);
Error in types.core.ElectricalSeries/validate_data (line 33)
val = types.util.checkDtype('data', 'double', val);
Error in types.core.TimeSeries/set.data (line 96)
obj.data = obj.validate_data(val);
Error in types.core.TimeSeries (line 72)
obj.data = p.Results.data;
Error in types.core.ElectricalSeries (line 17)
obj = obj@types.core.TimeSeries(varargin{:});
Error in io.parseGroup (line 68)
parsed = eval([typename '(kwargs{:})']);
Error in io.parseGroup (line 26)
subg = io.parseGroup(filename, g_info);
Error in io.parseGroup (line 26)
subg = io.parseGroup(filename, g_info);
Error in io.parseGroup (line 26)
subg = io.parseGroup(filename, g_info);
Error in nwbRead (line 20)
nwb = io.parseGroup(filename, info);
The data is int16, and it appears that it is being cast as a double. Also, the requested array is 4036910000x1, but it should be 50461375x80. With a little keyboard-mode exploration I found g_info.Datasets(1).Datatype.Class: 'H5T_INTEGER' and .Type: 'H5T_STD_I16LE', which looks right to me, so matnwb appears to be reading this information correctly, but maybe not applying it. If the strategy is to import as a double and then recast to the desired type, I think that's going to continue to cause RAM issues and should probably be refactored. Also, g_info.Datasets(1).Dataspace.Size: [80 50461375], which I think should be transposed. NWB is pretty particular about the time dimension always being first.
The file I am trying to import is quite large (6 GB), and I think you might be able to investigate these issues without it, but if you'd like it, let me know the best way to share it with you.
Datasets that have multiple possible shapes including a vector shape cannot be assigned vector values:
>> mc = testpack.TestClass();
>> mc.testDataset = 1; % testDataset should accept a vector shape but it won't allow it
Error using testpack.TestClass/validate_testDataset (line 47)
TestClass.testDataset: val must have shape [2,3]
Error in testpack.TestClass/set.testDataset (line 26)
obj.testDataset = validate_testDataset(obj, val);
This issue may be related to the use of ndims in the validation function. ndims will always return a value greater than or equal to 2, per https://www.mathworks.com/help/matlab/ref/ndims.html. The validation function assumes ndims will return 1 on a scalar value.
Failing test:
>> nwbtest('ProcedureName', 'testWriteDatasetWithMultipleShapes')
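One possible fix, sketched in Python (the real validator is MATLAB; shape_matches is an invented helper): collapse a MATLAB vector's obligatory two dimensions before comparing against a 1-D spec shape:

```python
# Sketch: shape check that tolerates MATLAB's ndims >= 2 for vectors.
# `spec` entries of None mean "any length", mirroring null in the schema.
def shape_matches(actual, spec):
    if len(spec) == 1 and len(actual) == 2 and 1 in actual:
        actual = (max(actual),)  # collapse 1xN / Nx1 to a 1-D shape
    return len(actual) == len(spec) and all(
        s is None or s == a for s, a in zip(spec, actual))

print(shape_matches((1, 1), (None,)))  # True: scalar/vector accepted
print(shape_matches((2, 3), (2, 3)))   # True: exact match
print(shape_matches((4, 3), (None,)))  # False: genuinely 2-D
```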
The Group constructor currently ignores unknown fields in the given struct. It should probably error (or at least warn) to alert the user:
>> s.unknownField = [1 2 3];
>> g = types.untyped.Group(s)
g =
Group with properties:
attributes: {}
datasets: {}
links: {}
groups: {}
classes: {}
Filtered test:
>> nwbtest('ProcedureName', 'testGroupConstructorThrowsOnUnknownArgument')
Many of the tests writing a file in MatNWB and trying to open it in PyNWB fail. Most appear to fail because MatNWB writes empty datasets when the datasets have not been assigned/set by the user or a default value. PyNWB does not expect these empty datasets. Not sure if that’s considered a PyNWB or MatNWB issue.
Failing tests:
>> nwbtest('Name', 'tests.system.*/*PyNWB')
I believe this is an issue with pynwb not following spec strictly enough: NeurodataWithoutBorders/pynwb#592. Creating an issue here for record-keeping
Some of the attribute and dataset dtypes listed in the schema spec do not appear to be handled properly. For instance, I would expect ascii, utf, utf8, utf-8 to be parsed as string like text but they are left as is. Others are handled in unexpected ways (float is parsed as double instead of single).
Failing tests:
>> nwbtest('ProcedureName', 'testParseAttributeWithDtype')
>> nwbtest('ProcedureName', 'testParseDatasetWithDtype')