bids-standard / bids-2-devel

Discussions and suggestions of backwards incompatible changes to BIDS

Home Page: https://bids.neuroimaging.io/

License: Creative Commons Attribution 4.0 International

bids-2-devel's Introduction

bids-2-devel

This repository contains discussions and suggestions of changes to BIDS that could not be implemented in BIDS 1.x due to backwards compatibility.

There is now a BIDS 2.0 project which plans to address some of those issues and implement a BIDS 2.0 candidate. The overarching agenda for BIDS 2.0, which guides the selection of issues, is presented in issue 57. Work to address those issues is targeting PR #1775 against bids-specification. See the PR's description for more information.

If you have a suggestion, please go ahead and:

Review BIDS 2.0 project
Search the issues to see whether your suggestion has been discussed before or
Open a new issue otherwise
Vote for issues via 👍 and 👎 on the issues. Dashboard of issues sorted by thumb-up

If you would like to join the effort and address any of the issues, follow up on the issue and prepare a PR against the bids-2.0 branch of bids-specification, following the procedures outlined in PR #1775.

bids-2-devel's Issues

Addition of scan length in the JSON sidecar

https://groups.google.com/d/msg/bids-discussion/lIH_JZsQ55o/VN-YWbqgBgAJ

I know there's been a lot of discussion about various acquisition timing parameters and how to store them in json sidecars, but one thing that I don't think has come up (though I might have missed it) is how to represent the total length of each scan. It would be nice to define some field like RunLength (expressed in seconds) that lives alongside RepetitionTime. Or alternatively, NumberOfVolumes, which is equally good assuming a constant TR. Personally I would push for making this information mandatory, but at the very least it should be part of the (optional) controlled vocabulary.

I recognize that a field like this is rarely strictly necessary, as the scan duration can usually be computed from the RepetitionTime and information in the image header. But it's quite impractical to require access to potentially very large image files just to access a key piece of information. And unlike some of the other fields in the header, this one is often necessary in contexts that don't involve processing or analysis of the actual images. For example, one should really be able to construct design matrices for all runs in a BIDS project from just events.tsv files, RepetitionTime, and RunLength. It doesn't make sense to me that, e.g., computing collinearity diagnostics for an experimental design should require downloading hundreds of GBs worth of images.
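A minimal sketch of how such a field could be consumed, assuming a sidecar that carries RepetitionTime together with the proposed RunLength or NumberOfVolumes. Neither proposed field is part of BIDS 1.x, and the helper names below are hypothetical.

```python
import json


def run_length_seconds(sidecar: dict) -> float:
    """Return the scan duration in seconds from sidecar metadata alone."""
    # Prefer an explicit RunLength (proposed field, expressed in seconds).
    if "RunLength" in sidecar:
        return float(sidecar["RunLength"])
    # Otherwise assume a constant TR and use the proposed NumberOfVolumes.
    return float(sidecar["RepetitionTime"]) * int(sidecar["NumberOfVolumes"])


def frame_times(sidecar: dict) -> list:
    """Onset of each volume in seconds, assuming a constant TR."""
    tr = float(sidecar["RepetitionTime"])
    n_volumes = round(run_length_seconds(sidecar) / tr)
    return [i * tr for i in range(n_volumes)]


# Hypothetical sidecar content; no image file is opened at any point.
sidecar = json.loads('{"RepetitionTime": 2.0, "NumberOfVolumes": 150}')
print(run_length_seconds(sidecar))  # 300.0
print(frame_times(sidecar)[:3])     # [0.0, 2.0, 4.0]
```

With the frame times in hand, a design matrix can be built from events.tsv without touching the images.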

Original authors: @tyarkoni

Enhancing / harmonizing dataset description

opening this issue to discuss if we can harmonize the dataset_description schema with the dandiset schema, which amongst several things, allows for all kinds of linked resources and better descriptors of contributors. it would still be bids 2.0.

example: https://api.dandiarchive.org/api/dandisets/000055/versions/draft/

these are all entered through a metadata editor ui, so people who don't know json can fill in things.

also see https://api.dandiarchive.org/api/dandisets/000008/versions/draft/ which has instances (see the relatedResource section) of the dataset being extended by data in two other archives.

the schema is also not just json compliant (and has a json schema attached) but also jsonld.

schemas are released here: https://github.com/dandi/schema/tree/master/releases

Extended BIDS for animals

Could we have sub- changed to the species e.g. in http://datashare.is.ed.ac.uk/handle/10283/2122 we used rat instead of sub (rat- mice- cat- dog- there is more and more stuff out there now)

when using mutant we could either have it in the participant description or right away in the name, e.g. rat-SD-Bdnftm1sage_seq-SE_task_flashing-light_bold.nii.gz

Original authors: Unknown

Exchange "-" with "_" and vice versa

Stems from bids-standard/pybids#18.

Currently a filename in BIDS would look like: sub-03_ses-audio_run-08_bold.nii.gz. The delimiters are such that _ separates categories and - separates category from its value.

When debugging it is often desirable to scroll fast and systematically through chunks of text at a time instead of hitting the left arrow key and having to scroll through each letter. On my mac I use option+arrow_key.

The issue that I want to raise is that, the way Mac keyboards parse text, the cursor jumps through _ and stops at the -. For that reason I'd like to propose a change to the standard.

from: sub-03_ses-audio_run-08_bold.nii.gz
to: sub_03-ses_audio-run_08-bold.nii

This makes scrolling through filenames and paths more attuned to the way keyboards parse text when scrolling, and therefore makes it easier to select and copy-paste chunks that belong together.
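The proposed renaming is mechanical. A rough sketch, assuming the stem of the filename contains no dots and that the extension should be left as it is (the function name is hypothetical):

```python
def swap_delimiters(filename: str) -> str:
    """Exchange '-' and '_' in the part of the filename before the extension."""
    # Split at the first dot so that ".nii.gz" is not modified.
    stem, dot, extension = filename.partition(".")
    swapped = stem.translate(str.maketrans("-_", "_-"))
    return swapped + dot + extension


print(swap_delimiters("sub-03_ses-audio_run-08_bold.nii.gz"))
# sub_03-ses_audio-run_08-bold.nii.gz
```

Applying the function twice returns the original name, so existing datasets could in principle be converted in either direction.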

Original authors: @andrebeu

Make key/value pair: _run-XX suffix mandatory

Currently the key/value pair: _run-01, _run-02 suffix can be optionally omitted.

It would be good if we made this mandatory: it would be better for long-term data collection and would make it easier to add to datasets at a later point. For instance, you might start a study thinking you will only have 1 run and choose to omit the run suffix. Later on, your study design could change so that you have multiple runs. In this situation you would have to go back and rename all of your files to include the suffix. Moreover, if this dataset has already been imported into a database (e.g. LORIS), it will be even harder to manage this.

Original authors: Unknown

Change handling of field mapping information

Relates to #39 here, bids-standard/bids-specification#622 in BIDS 1.0, maybe others also. Had hoped to discuss at OHBM22 but didn't get the chance.

I really dislike having information regarding how an inhomogeneity field should be estimated ("B0FieldSource") and the images to which that estimate should be applied ("IntendedFor") encoded alongside image data sidecar information. The contents of those sidecars provide information about the image data themselves, how they were acquired, acquisition parameters, etc., whereas these fields relate specifically to how those data should be processed. We are currently bastardising the sidecar information that is specific to one image data file with content that relates to other image data files and the desired utilisation of such.

I would vastly prefer a solution where information relating to how a dataset should be processed is independent of the raw data themselves. The image sidecar JSONs that come out of eg. dcm2niix should remain untouched. Instead there should be a separate JSON file that is specifically intended to provide information about the intended utilisation of specific images.

In the specific context of B0 inhomogeneity field map estimates, one may theoretically have any number of field map estimates, which may come from different sources. Eg. If acquiring reversed phase encoding spin-echoes, and a subject moves a lot during the session, you may want to derive multiple field maps throughout the session, and for each fMRI run use only the most temporally proximal estimate. So that might look like (focusing more on concept than field names):

$ cat processing.json
{
    "B0FieldMaps": {
        "early": {
            "Sources": [
                "fmap/sub-###_dir-ap_run-1_epi.nii.gz",
                "fmap/sub-###_dir-pa_run-1_epi.nii.gz"
            ],
            "Sinks": [
                "func/sub-###_task-nback_bold.nii.gz",
                "func/sub-###_task-olr_bold.nii.gz"
            ]
        },
        "late": {
            "Sources": [
                "fmap/sub-###_dir-ap_run-2_epi.nii.gz",
                "fmap/sub-###_dir-ap_run-3_epi.nii.gz",
                "fmap/sub-###_dir-pa_run-2_epi.nii.gz",
                "fmap/sub-###_dir-pa_run-3_epi.nii.gz"
            ],
            "Sinks": [
                "func/sub-###_task-rest_bold.nii.gz"
            ]
        }
    }
}
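A tool could consume such a file roughly as follows. This is only a sketch: it assumes "B0FieldMaps" maps an estimate name to its "Sources" and "Sinks", and the function name is made up for illustration.

```python
# Same content as the processing.json sketch, trimmed to one source pair each.
processing = {
    "B0FieldMaps": {
        "early": {
            "Sources": [
                "fmap/sub-###_dir-ap_run-1_epi.nii.gz",
                "fmap/sub-###_dir-pa_run-1_epi.nii.gz",
            ],
            "Sinks": [
                "func/sub-###_task-nback_bold.nii.gz",
                "func/sub-###_task-olr_bold.nii.gz",
            ],
        },
        "late": {
            "Sources": [
                "fmap/sub-###_dir-ap_run-2_epi.nii.gz",
                "fmap/sub-###_dir-pa_run-2_epi.nii.gz",
            ],
            "Sinks": ["func/sub-###_task-rest_bold.nii.gz"],
        },
    }
}


def fieldmap_sources_for(processing: dict, image: str) -> dict:
    """Return {estimate name: source images} for every estimate applied to `image`."""
    return {
        name: entry["Sources"]
        for name, entry in processing["B0FieldMaps"].items()
        if image in entry["Sinks"]
    }


print(list(fieldmap_sources_for(processing, "func/sub-###_task-rest_bold.nii.gz")))
# ['late']
```

The image sidecars are never read or modified here; everything about intended use lives in the one processing file.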

Theoretically, this concept of defining a text file that prescribes the intended image processing and inter-dependencies between data may not necessarily be constrained to just B0 inhomogeneity field mapping; hence the "B0FieldMaps" key at the root level. Curious to know if anyone with a wider breadth of experience can come up with more examples of such.

TL;DR: Separate processing instructions from raw data.

Phenotypes

As a general (and easy) suggestion, it would be great to start collating a repository of standard phenotypic measure .json files for the commonly used clinical/behavioural questionnaires and scales (although many are licensed and so objections may be encountered). Also, it might be worth adding some standardised clinical units (mmHg, bpm, etc.).

Original authors: Unknown

Replace _scans.tsv with _recordings.tsv

Right now the _scans.tsv file keeps track of which acquisitions exist within a particular session. The word “scans” is a little domain-specific, especially as people begin to write BIDS formats for things like electrophysiology. Why not call this recordings, as this is more domain-general and still reflects the same overall idea?

Original authors: Unknown

Use Augmented Backus–Naur form instead of definition templates

See example from Wikipedia:

https://en.wikipedia.org/wiki/Augmented_Backus–Naur_form#Example

Original authors: Unknown

MEEG+iEEG --> Make it a requirement that channels in bids sidecar and raw file are in order

came up in mne-tools/mne-bids#697 (comment)

We currently have the following in the spec:

EEG:

To avoid confusion, the channels SHOULD be listed in the order they appear in the EEG data file.

source

iEEG:

Channels SHOULD appear in the table in the same order they do in the iEEG data file.

source

MEG:

To avoid confusion, the channels SHOULD be listed in the order they appear in the MEG data file.

source

our proposal (cc @hoechenberger @agramfort ) is to turn this "SHOULD" into a "MUST". This would be in line with the BIDS principle to place most burden on the dataset curation stage, as that's a one-time-burden. All steps that follow after this will have an easier time. For example, tools can rely on the sidecar files, and don't have to double check with the raw files.

Potential issue(s):

We currently cannot validate a MUST, because we lack JS readers for ephys formats
Breaking backwards compatibility for all currently existing datasets that did not follow the RECOMMENDATION ("SHOULD") that we now want to turn into a REQUIREMENT ("MUST")
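For illustration, the check a validator or curation tool would need to run could look like the sketch below. It assumes the channel names of the raw file have already been obtained from whatever reader is in use; the function names are hypothetical.

```python
import csv
import io


def channel_names_from_tsv(tsv_text: str) -> list:
    """Read the 'name' column of a channels.tsv, preserving row order."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row["name"] for row in reader]


def first_order_mismatch(sidecar_names: list, raw_names: list):
    """Return the first index at which the two orders differ, or None if identical."""
    for index, (in_sidecar, in_raw) in enumerate(zip(sidecar_names, raw_names)):
        if in_sidecar != in_raw:
            return index
    if len(sidecar_names) != len(raw_names):
        # One list is a strict prefix of the other.
        return min(len(sidecar_names), len(raw_names))
    return None


tsv_text = "name\ttype\tunits\nFp1\tEEG\tuV\nFp2\tEEG\tuV\n"
print(first_order_mismatch(channel_names_from_tsv(tsv_text), ["Fp1", "Fp2"]))  # None
```

With a "MUST", downstream tools could skip this comparison and trust the sidecar alone.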

Somewhat parallel issue(s):

bids-standard/bids-specification#667

Remove inheritance rule

https://groups.google.com/forum/#!topic/bids-discussion/TjxOKEB1DD4

Right now I would say that BIDS is "experiment" centric, where experiment is some collection of data from many participants from a single site that is somehow defined by the experiment (or other justification) that was used to collect the data (openfmri calls these datasets, but that word can be applied at many scales so I am avoiding it). As a consequence, much of the data is "bunched" together at the experiment level, for example participant.tsv and the inheritance principle allowing scan parameters at the highest level. In the future, I would expect that more and more users of shared data will be aggregating data across many different datasets, which is a very common mode for FCP/INDI data. When they do this, it is unlikely that users will want to download all of the data for every experiment, but rather cherry pick the data that they want from the many experiments. I think that this would be much easier to accomplish if we changed the focus of BIDS from the experiment bundle to the lowest denominator - the participant/session level.

This would entail requiring that sidecar JSON and task description files be stored adjacent to the nifti file that they complement and get rid of inheritance. Also, this would involve moving the participant pheno, demo, and assessment data into the session folder. The result of this would be that ALL of the information that is necessary to analyze the dataset is available in the sub/session folder. This enables a more flexible file structure, where users can access only the data that they want.

Inheritance is the weakest part of BIDS and opens the door to a host of problems. I understand that it has the potential of considerably reducing the number of files, but comes at a large cost in complexity and inflexibility. If we do decide to maintain inheritance in the future, it would be much better to explicitly list the inheritance structure in the JSON files, rather than relying on the file structure. This will also make it much more efficient to find meta data (less time probing file systems for files that may or may not exist).
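To make the cost concrete, here is a simplified sketch of what a reader has to do under inheritance. It assumes the applicable sidecars have already been located and loaded, ordered from the dataset root down to the file sitting next to the image; locating them is the file-system probing mentioned above and is not shown. The function name is hypothetical.

```python
def resolve_metadata(levels: list) -> dict:
    """Merge sidecar dictionaries; more specific levels override more general ones."""
    merged = {}
    for sidecar in levels:  # dataset root first, most specific sidecar last
        merged.update(sidecar)
    return merged


root_level = {"RepetitionTime": 2.0, "TaskName": "rest"}
next_to_image = {"RepetitionTime": 1.5}
print(resolve_metadata([root_level, next_to_image]))
# {'RepetitionTime': 1.5, 'TaskName': 'rest'}
```

Without inheritance, the last dictionary alone would have to be complete, and a user who cherry-picks one session folder would not need the root-level file at all.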

Another reason to remove inheritance is that, while it may be clear how to inherit values in a JSON file, inheriting values in other meta-data file types may be more complex and ambiguous. As raised in bids-standard/bids-specification#337, a single named TSV column requires other columns to be interpreted (i.e. onset and duration, likely event_type) and so certain rows of other columns will also have to be inherited.

Perhaps a practical way forward is to limit the meta-data file types that can participate in inheritance.

Original authors: @ccraddock

"IntendedFor" field in electrophysiology sidecars should bear a different name

Hello,

the JSON sidecar file of electrophysiological recordings has the optional parameter IntendedFor. Its value is "intended" (no pun … intended! 🙈) to point to structural images, e.g. MRI or CT scans. However, the parameter name is as ambiguous as can be, and IMHO sparks incorrect associations, at least for me: if my electrophysiological recording is "intended for" a certain structural scan, then this implies that the "main" bit of the data is in the structural scan, not in the electrophysiological data. I don't think this sort of hierarchical implication makes much sense here.

Considering that MEG data comes with the optional field AssociatedEmptyRoom, I was wondering if IntendedFor couldn't be renamed to something like, AssociatedAnatomy or AssociatedAnatomicalScan (or the plural forms thereof)?

Use IETF RFC 2119 keywords to indicate requirement levels

https://tools.ietf.org/html/rfc2119

This would improve the precision of the specification.

Original authors: Unknown

Add magnetization transfer

Also find a way to organize data for MT-related protocols (e.g. mt0+mt1 to compute MTR, mt0+mt1+t1w to compute MTsat, etc.). B1 mapping should also be accounted for in case researchers acquire it (but then, B1 map could be used for other purposes than MT protocol, so maybe not put it in the mt folder).

Original authors: @jcohenadad

Consider replacing data dictionaries with Table Schema

https://frictionlessdata.io/specs/table-schema/

Original authors: Unknown

Move and rename IntendedFor field from fieldmap to functional image

Under the current BIDS Specification, the field defining which fieldmaps are associated with each EPI image is stored in the fieldmap metadata. Accessing this information requires some kind of backwards logic: searching through all fieldmap metadata to find which ones correspond to a given file. My solution is to move the field to the EPI image metadata, in a new field like AssociatedFieldmaps or AssociatedDistortionmaps, so that the information as to which fieldmap is associated with each image is easily accessible from that EPI image's metadata.
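The backwards logic amounts to inverting the IntendedFor mapping. A sketch, assuming the fieldmap sidecars have already been loaded into a dictionary keyed by fieldmap path (the paths and the function name are illustrative only):

```python
from collections import defaultdict


def fieldmaps_by_target(fieldmap_metadata: dict) -> dict:
    """Invert IntendedFor: map each target image to the fieldmaps that point at it."""
    index = defaultdict(list)
    for fmap_path, metadata in fieldmap_metadata.items():
        intended = metadata.get("IntendedFor", [])
        # IntendedFor may be a single string or a list of strings.
        if isinstance(intended, str):
            intended = [intended]
        for target in intended:
            index[target].append(fmap_path)
    return dict(index)


fieldmap_metadata = {
    "fmap/sub-01_phasediff.json": {"IntendedFor": "func/sub-01_task-nback_bold.nii.gz"},
    "fmap/sub-01_dir-ap_epi.json": {
        "IntendedFor": [
            "func/sub-01_task-nback_bold.nii.gz",
            "func/sub-01_task-rest_bold.nii.gz",
        ]
    },
}
print(fieldmaps_by_target(fieldmap_metadata)["func/sub-01_task-rest_bold.nii.gz"])
# ['fmap/sub-01_dir-ap_epi.json']
```

Every fieldmap sidecar has to be read before a single image can be answered for, which is the overhead the proposed field would remove.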

Original authors: @akimbler

Revert to one directory per scan

The BIDS 1.x directory structure places scans within anat, func, dwi folders. Within each folder one can have multiple different scans. I propose to go back to the one scan per folder structure so as to keep all files for one scan together. This is especially useful for the derivatives folder, where having multiple scans of the same type residing in one folder will quickly result in a potential mixing of derivatives. With properly named sequences on your scanner console it will then become trivial to place dicoms and subsequent niftis in new folders by stripping the folder name from the scan name.

BIDS 1.x

sub-control01
- anat
  - sub-control01_T1w.nii.gz
  - sub-control01_T1w.json
  - sub-control01_T2w.nii.gz
  - sub-control01_T2w.json
- func
  - sub-control01_task-nback_bold.nii.gz
  - sub-control01_task-nback_bold.json
  - sub-control01_task-nback_events.tsv
  - sub-control01_task-nback_physio.tsv.gz
  - sub-control01_task-nback_physio.json
  - sub-control01_task-nback_sbref.nii.gz

Proposal for BIDS 2.x

sub-control01
- T1w
  - sub-control01_T1w.nii.gz
  - sub-control01_T1w.json
- T2w
  - sub-control01_T2w.nii.gz
  - sub-control01_T2w.json
- Nback
  - sub-control01_task-nback_bold.nii.gz
  - sub-control01_task-nback_bold.json
  - sub-control01_task-nback_events.tsv
  - sub-control01_task-nback_physio.tsv.gz
  - sub-control01_task-nback_physio.json
  - sub-control01_task-nback_sbref.nii.gz
- MID
  - sub-control01_task-mid_bold.nii.gz
  - sub-control01_task-mid_bold.json
  - sub-control01_task-mid_events.tsv
  - sub-control01_task-mid_physio.tsv.gz
  - sub-control01_task-mid_physio.json
  - sub-control01_task-mid_sbref.nii.gz

Original authors: Unknown

Replacing "task" with "trial"

https://groups.google.com/forum/#!topic/bids-discussion/mNdO81h28eg

I am looking to implement the BIDS specification for the small rodent fMRI pipelines which a number of colleagues and I use ( https://github.com/IBT-FMI/SAMRI ).

Generally, the specification seems ok for animal data as well, with one big exception: It is highly uncommon for rodents to perform tasks during fMRI. So the task nomenclature would be highly curious and potentially misleading. Would it be possible to update the term to something more generic and species-agnostic in a future release? e.g. "trial". We are currently using "trial" in our pipeline to denote what BIDS is referring to as "task".

The name switch to "trial" would cause no ambiguity other than with the proposed "trial_type" column for the events files:

Another optional column - “trial_type“ - represents primary categorisation of each trial to identify them as instances of the experimental conditions.

If this is a concern, I would propose to update the column name to "condition_type" - which actually makes more sense, and is more accurately descriptive (as seen even in the sentence above, quoted from BIDS-1.0.1).

Original authors: @TheChymera

Drop "ChannelCount" fields in ephys data

Some channel types have counts in the .eeg JSON files and others do not. Newly declared channel types GSR, RESP, and TEMP do not have counts

GSRChannelCount
RESPChannelCount
TEMPChannelCount

While EOG, EMG, ECG do. Worse, when the fields above are defined, the BIDS validator fails. This is made even worse by the fact that EOGChannelCount, EMGChannelCount, and ECGChannelCount are mandatory fields. The error from the validator is not informative when trying to add these fields (failed because of badly formatted .eeg file). I have tried to look at the JS code but it is too convoluted. I beg whoever is able to change this code to return meaningful errors.

Overall, I would remove this redundant information of channel count (except for EEG and non-EEG channel counts). One can simply look at the channel file to see what type of channels are there. This is a source of potential error and inconsistency. Big choice obviously.

Deprecate phase and magnitude suffixes in favor of part and echo entities

If bids-standard/bids-specification#424 is merged, then there will be an entity that delineates magnitude/phase in the specification. There already is one that delineates multi-echo data (echo). These two entities can replace the existing func suffix “phase” and the existing fmap suffixes “phase1”, “phase2”, “magnitude1”, and “magnitude2”. I’m not sure how best to label “phasediff” with the current proposal, but I’m sure we can figure out a way.

Original title: Deprecate phase, phase1, phase2, phasediff, magnitude1, magnitude2 suffixes in favor of part and echo entities
Original authors: @tsalo

Allowing hyphens in filename key values

https://groups.google.com/forum/#!topic/bids-discussion/pG7TnsZNfyA

This could be a means of expressing multiple values for a single key

Original authors: @bthirion

Software Ecosystem Plan

I would like to see as part of the v2.0 standard, a concrete plan of action for creating a robust software ecosystem around BIDS. I believe that standards should come with as complete as possible tooling to support that standard. While it's true that many apps have been developed to be BIDS-compatible, I think that the current approach is not communally focused and instead requires going back and forth between major packages and asking them to implement changes for BIDS, rather than implementing complete solutions. I would also propose that a full-fledged 2.0 would minimally consist of the following software solutions (some of which already exist):

1. A web-based validator tool.
2. A command-line-based validator tool.
3. A conversion tool which guides the user through the creation of a full BIDS-compliant dataset, or trivially adds to an existing one. (Note: dcm2niix and heudiconv exist to facilitate this, but still require that the user follow the specification and validate. I propose that the program follow the specification and validate; this is a reason for requiring (2)).
4. A tool which queries metadata via the command line to enhance scriptability.

If there exists some plan already and I've missed it, please feel free to close this issue.

Keep the filenames short

https://groups.google.com/forum/#!topic/bids-discussion/IdWe0R8OFyw

ALL parameters should be included in the JSON file, regardless of their inclusion in the filename or path, and that values in the JSON file be lists, so that they can easily accommodate 4D files when necessary. For example, flip angle would be a list that could be more flip angles.
The length of the filename should be restricted to be no longer than 32 characters. We can discuss the actual length, but given that we can expect a long path to precede the filename, and need to rely on the entire file path, it should not be too long. I believe that the current max length for filenames in AFNI is 255 characters and anything beyond that gets truncated, but even if AFNI increases this value to 1024, or 102040, there will always be a max, and we need to enforce some length restriction in the spec. I also challenge the human readability of a filename that is this long.
That we do not add any new key value pairs to the filename other than the ones in the current standard.
That we allow the "-" character to occur inside values. We can then state that for a key value pair, the text before the first hyphen is the key and everything after is the value. This would accommodate Bertrand's concerns regarding subject identifiers, as well as allow the ACQ field to be overloaded to contain more key value pairs for software and users that wish to overload the ACQ tag to include more information, such as MT-ON, Echo-1, or otherwise. Software that does not support this overloading would then just interpret the entire ACQ value as a string that points them to the correct .json file. As Chris points out, this may break some pre-existing implementations, but since it wasn't clearly defined before, we need to pin it down.
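The "first hyphen" rule is easy to state in code. A rough sketch, assuming underscores still separate key value pairs and that the last underscore-delimited chunk is the suffix; the example filename and the function name are made up for illustration:

```python
def parse_entities(filename: str):
    """Split a filename into ({key: value}, suffix) using the first-hyphen rule."""
    stem = filename.split(".", 1)[0]      # drop the extension
    *chunks, suffix = stem.split("_")     # the last chunk is the suffix
    entities = {}
    for chunk in chunks:
        # Everything before the first hyphen is the key, the rest is the value.
        key, value = chunk.split("-", 1)
        entities[key] = value
    return entities, suffix


print(parse_entities("sub-01_acq-MT-ON_T1w.nii.gz"))
# ({'sub': '01', 'acq': 'MT-ON'}, 'T1w')
```

Software that does not understand the overloaded value still receives "MT-ON" as one opaque string, which matches the fallback behaviour described in the last point.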

Original authors: @ccraddock

Drop sidecar JSON files and store all metadata in the header

This could be done with a NIfTI header extension. JSON could still be used internally. This would also remove the need for the hierarchical rule (which is quite confusing). The downside is that it’s harder to view and edit metadata (without specialized tools).

Original authors: Unknown

Allow expressing parameters in different units

https://groups.google.com/forum/#!topic/bids-discussion/TjxOKEB1DD4

replace numeric values in the JSON files with either a tuple or dict that includes at least the value and units and potentially the datatype, and maybe even an ontology reference. For example:

{
    "RepetitionTime": {
         "value": "645",
         "units": "msec",
         "datatype": "int16"
    }
}
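A reader would then normalise each parameter before use. A minimal sketch, assuming a small hand-written table of unit labels (the spellings in the table and the function name are assumptions, not an agreed vocabulary):

```python
# How many of each unit make up one second (assumed spellings).
UNITS_PER_SECOND = {"s": 1, "sec": 1, "ms": 1000, "msec": 1000, "usec": 1000000}


def to_seconds(parameter: dict) -> float:
    """Convert a {"value": ..., "units": ...} parameter to seconds."""
    return float(parameter["value"]) / UNITS_PER_SECOND[parameter["units"]]


repetition_time = {"value": "645", "units": "msec", "datatype": "int16"}
print(to_seconds(repetition_time))  # 0.645
```

An unknown unit raises a KeyError here, which is probably the behaviour a validator would want as well.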

Original authors: @ccraddock

File format for raw data

ISMRMRD looks very interesting: https://ismrmrd.github.io/

Example usage: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5349250/

Original authors: Unknown

N/A vs NaN for missing values

So this may seem like a minor gripe, but the specification is currently very clear that “Missing and non-applicable values MUST be coded as n/a” [overt caps in the original]. I am currently battling to BIDS-align large multimodal datasets across many different types of pipelines/sources (not just imaging), and for several reasons this edict has proven to be particularly unhelpful. I would suggest relaxing (or changing) the rule in 2.0 (I personally have given up and will simply add a caveat to my readme and .json files).
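For what it's worth, a relaxed rule would cost readers very little. A sketch using only the standard library, where the set of accepted missing-value markers is a parameter (the function name and the example table are hypothetical):

```python
import csv
import io


def read_tsv(text: str, missing=("n/a",)) -> list:
    """Read a TSV into a list of dicts, mapping missing-value markers to None."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {key: (None if value in missing else value) for key, value in row.items()}
        for row in reader
    ]


text = "participant_id\tage\nsub-01\tn/a\nsub-02\tNaN\n"
strict = read_tsv(text)                           # current rule: only "n/a"
relaxed = read_tsv(text, missing=("n/a", "NaN"))  # a relaxed reading
print(strict[1]["age"], relaxed[1]["age"])
# NaN None
```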

Original authors: @CPLambert

Allow ``IntendedFor`` and make ``task-`` optional for ``_sbref.ext``

Some fMRI studies will not acquire one SBRef for every multiband BOLD scan (this is probably true for DWI too, and thus bids-standard/bids-specification#239).

Currently, the task-Name entity needs to be defined for SBRefs. Making it optional would allow the following:

func/
  sub-3010_sbref.nii.gz
  sub-3010_task-discountFix_run-1_bold.nii.gz
  sub-3010_task-manipulationTask_run-1_bold.nii.gz
  sub-3010_task-motorSelectiveStop_run-1_bold.nii.gz
  sub-3010_task-rest_run-1_bold.nii.gz
  sub-3010_task-rest_run-1_sbref.nii.gz
  sub-3010_task-stopSignal_run-1_bold.nii.gz

Where, by the inheritance principle, sub-3010_sbref.nii.gz is intended for all tasks but rest because a particular sbref exists for that one.
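The lookup this implies can be sketched as a two-step fallback. This is a simplification of the inheritance principle that only tries the exact match and then the most generic name; the function name is hypothetical.

```python
def sbref_for(bold_name: str, available: set):
    """Pick the SBRef for a BOLD run: exact match first, then the generic one."""
    # 1. An SBRef sharing all entities with the BOLD file wins.
    candidate = bold_name.replace("_bold.", "_sbref.")
    if candidate in available:
        return candidate
    # 2. Otherwise fall back to an SBRef that only carries sub- (and ses-).
    stem, extension = bold_name.split(".", 1)
    kept = [chunk for chunk in stem.split("_") if chunk.startswith(("sub-", "ses-"))]
    generic = "_".join(kept + ["sbref"]) + "." + extension
    return generic if generic in available else None


available = {"sub-3010_sbref.nii.gz", "sub-3010_task-rest_run-1_sbref.nii.gz"}
print(sbref_for("sub-3010_task-rest_run-1_bold.nii.gz", available))
# sub-3010_task-rest_run-1_sbref.nii.gz
print(sbref_for("sub-3010_task-stopSignal_run-1_bold.nii.gz", available))
# sub-3010_sbref.nii.gz
```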

Additionally, I propose to allow defining the IntendedFor metadata for SBRefs, so the same behavior can be implemented without missing the provenance about what task was acquired closest to the sbref, e.g.:

func/
  sub-3010_task-discountFix_run-1_bold.nii.gz
  sub-3010_task-discountFix_run-1_sbref.json
  sub-3010_task-discountFix_run-1_sbref.nii.gz
  sub-3010_task-manipulationTask_run-1_bold.nii.gz
  sub-3010_task-motorSelectiveStop_run-1_bold.nii.gz
  sub-3010_task-rest_run-1_bold.nii.gz
  sub-3010_task-rest_run-1_sbref.nii.gz
  sub-3010_task-stopSignal_run-1_bold.nii.gz

where sub-3010_task-discountFix_run-1_sbref.json has an IntendedFor field:

{
  "IntendedFor": [
    "sub-3010_task-discountFix_run-1_bold.nii.gz",
    "sub-3010_task-manipulationTask_run-1_bold.nii.gz",
    "sub-3010_task-motorSelectiveStop_run-1_bold.nii.gz",
    "sub-3010_task-stopSignal_run-1_bold.nii.gz"
  ]
}

Remove RepetitionTime metadata field

This information is already stored in the NIfTI header (as the current field description seems to acknowledge). Apparently this field isn't intended to override the header either (which would have been a bad idea anyway), so it's basically useless data duplication, with all the potential inconsistency issues that brings. I would recommend entirely removing this field in order to minimize ambiguity.

Original authors: Unknown

Make capitalization of suffixes consistent

See: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/bids-discussion/yOYaLNTh-_A/rPLd3JpsAgAJ

I see 3 possible routes

1. not care about capitalization (ie accept bold and BOLD)
2. not capitalize and keep the standard consistency
3. have a mix, no specific rationale with "old" suffixes and "NEW" ones

I must say 2 and 3 seem at first sight preferable to 1 - but I would probably go for 2 if there is no strong rationale for 3 (which there may be)

Original authors: @jbpoline

Enforce TE ordering for multi-echo fMRI data

Currently, individual echos in a multi-echo fMRI sequence can occur in an unspecified order, with the “shortest echo” i.e., the echo with the shortest TE, listed as either the first or last echo in the sequence.

This proposal would standardize echo naming such that *_echo-1* is always the shortest echo and all following echos are indexed in sequential order. This would enable parsing multi-echo fMRI data from the file names, without relying on associated metadata. It would also standardize echo ordering across datasets, preventing the confusion of co-existing “little-endian” and “big-endian” echo naming schemes.
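A validator could enforce the rule with a check along these lines. The sketch assumes the EchoTime of every echo has been collected from the sidecars into a dictionary keyed by echo index; the function name is hypothetical.

```python
def echoes_sorted_by_te(echo_times: dict) -> bool:
    """True if EchoTime strictly increases with the echo index."""
    indices = sorted(echo_times)
    times = [echo_times[index] for index in indices]
    return all(earlier < later for earlier, later in zip(times, times[1:]))


print(echoes_sorted_by_te({1: 0.012, 2: 0.028, 3: 0.044}))  # True
print(echoes_sorted_by_te({1: 0.044, 2: 0.028, 3: 0.012}))  # False
```

Once the rule holds, tools can treat echo-1 as the shortest echo from the filename alone.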

Relevant discussion: https://groups.google.com/forum/#!topic/bids-discussion/Utv29uL0Bdc

Original authors: @emdupre

Harmonizing sequence/contrast names

https://groups.google.com/d/msgid/bids-discussion/BCC1A813-0207-4D16-B095-4311B50F090D%40gmail.com?utm_medium=email&utm_source=footer

Rationale: In trying to apply the BIDS format to our MR database, we noticed that the current modality labeling schemes of BIDS 1.0 raise some confusion for some of our scans. In particular, a consensus for many anatomical MR outputs was missing or not consistent across different sequences. Looking into the BIDS mailing list, it turns out that we were not the only ones having these issues. In our opinion, most of the confusion stems from the intermixed use of sequence-, contrast- and signal-labels.

To prevent confusion while still being able to fit a format to all possible scan types, we propose the following changes for BIDS 2.0 and where possible for backwards compatibility with BIDS 1.0:

All text in lowercase.
Punctuation marks:
1. underscore (_) : before each prefix
2. dash (-): after a prefix, to add labels, but not numbers!
‘parent’ labels
1. _out: referring to scanner output: in terms of volumes (regardless of how these are generated).
2. _seq: referring to the MR sequence that generated the volumes (with the possibility to add modifier labels)
3. _gen: referring to the result image that is generated/calculated with the output volumes
To prevent confusion we suggest discarding labelling of the specific ‘signal’ as modality (e.g. T1w, T2w, T2star). Often, scans are not purely t1w or t2w. Also, there might be multiple different output volumes for a specific sequence, resulting in messy and unclear labeling.
- format:
  - sub01_… _out-(out-label1)-(out-label2)_seq-(seq-label)-(seq-modifier)gen-(gen-label) … nii.gz

Label specs

format:
- sub01_… _out-{out-label1}--{out-label}_seq-{seq-label}-{seq-modifier}gen-{gen-label} … nii.gz

_out

output files, directly after reconstruction of K-space, referring to the volume-source.
_out-labels:
- inv# output volume for #th inversion time
- echo# output volume for #th echo
- fa# output volume for the #th flip angle
- coil# output volume of coil element #
- dyns# output volume(s) # of fMRI series
- phs phase output volume
- mag magnitude output volume

_seq

sequence, arbitrary as these are brand specific.
_seq-labels: [-- not exhaustive, add/edit where needed--]
- 3dffe
- flash
- mp2rage
- memp2rage
- mprage
- memprage
- epi
- tse
- flair
- pasl
- SPGR
- mef
Optionally, one can add sequence modifiers to specify the sequence.
_seq-modifiers:
- bold
- dwi
- swi
- t1w
- t2w
- t2starw
- b0
- b1
- mt0
- mt1

_gen

generated / calculated images
gen-labels:
- uni unified image of first and second inversion MP2RAGE volumes
- t1map
- t2map
- t2starmap
- qsmap quantitative susceptibility map
- mtr
- mtsat

Examples

MP2RAGEME

/anat/
- sub-01_ses-1_out-inv1-mag_seq-memp2rage.nii.gz
- sub-01_ses-1_out-inv1-phs_seq-memp2rage.nii.gz
- sub-01_ses-1_out-inv2-echo1_mag_seq-memp2rage.nii.gz
- sub-01_ses-1_out-inv2-echo1_phs_seq-memp2rage.nii.gz
- sub-01_ses-1_out-inv2-echo2_mag_seq-memp2rage.nii.gz
- sub-01_ses-1_out-inv2-echo2_phs_seq-memp2rage.nii.gz
- sub-01_ses-1_seq-memp2rage_gen-uni.nii.gz [note that _out is discarded in this _gen image]

functional EPI

/func/
- sub-01_ses-1_out-dyns100_seq-epi-bold.nii.gz

diffusion EPI

/dwi/
- sub-01_ses-1_out-dyns64_seq-epi-dwi.nii.gz
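
The examples above can be reproduced with a small helper. This is only a sketch of the proposed naming scheme: the function name `build_name` and its parameters are hypothetical, and it assumes that `sub` and `ses` come first, that `_out` labels are joined with dashes, and that `_out` is dropped when empty.

```python
def build_name(sub, ses, seq, out_labels=(), seq_modifier=None, gen=None,
               ext=".nii.gz"):
    """Compose a filename following the proposed _out/_seq/_gen scheme.

    Hypothetical helper for illustration; not part of any BIDS tool.
    """
    parts = [f"sub-{sub}", f"ses-{ses}"]
    if out_labels:
        # several out-labels (e.g. inv1 + mag) are joined with dashes
        parts.append("out-" + "-".join(out_labels))
    seq_part = f"seq-{seq}"
    if seq_modifier:
        # optional modifier such as bold or dwi follows the sequence label
        seq_part += f"-{seq_modifier}"
    parts.append(seq_part)
    if gen:
        parts.append(f"gen-{gen}")
    return "_".join(parts) + ext


print(build_name("01", "1", "memp2rage", out_labels=("inv1", "mag")))
print(build_name("01", "1", "memp2rage", gen="uni"))
print(build_name("01", "1", "epi", out_labels=("dyns100",), seq_modifier="bold"))
```

The three calls correspond to the first MP2RAGEME example, the `_gen-uni` example and the functional EPI example.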

Original authors: Martijn Mulder

One directory per modality

It would be logical for the directory structure of the subject-level derivatives to reflect the structure of the raw data. Therefore, we may need to think about the structure of the raw data with a bit more attention to general aspects of the processing. Joining modalities that may need completely different analyses under the same folder (anat, func and dwi) may end up generating problems. Not only can derivatives be overwritten, but having many different modalities in the same directory can also be rather confusing (not to mention modalities like DfMRI, if someone manages to make it real, or future modalities we have not even heard about).

The way to go may be one folder per modality, rather than one per scan.

All T1ws in the T1w directory.
All DWIs in the dMRI directory.
All task fMRIs in the tfMRI directory.
All resting fMRIs in the rfMRI directory.
All T2ws in the T2w directory.
All ASL in the ASL directory.
All FLAIR in the FLAIR directory.
All SWIs in the SWI directory.
Etc.

In the end, we would end up with 10 to 12 directories at most, but probably better organised.

Original authors: Unknown

Alternative location for raw data

Raw data are currently stored under the root folder of the dataset.

The root folder gets cluttered by sub-<label> subfolders and hides important files such as README and derived data in derivatives.
This also creates an imbalance with source data that are stored in sourcedata and derived data in derivatives.
Perhaps raw data could be stored in an (optional?) rawdata subfolder.

Original authors: Unknown

Welcome! And migration from Google Doc to this repo

Welcome to this new repository everybody!

Please see this message on how to work with the suggestions in this repo: #1 (comment)

For context, please see the messages in the BrainHack Mattermost starting approximately with this message by @tsalo.

Next steps:

migrate all existing suggestions and discussions
render the BIDS 2.0 google doc "view only"
label the new issues

We need to migrate all suggestions / comments and preferably also discussions from the BIDS 2.0 google doc to this repository.

Who wants to help?
How should we proceed?
What's important to keep in mind?

Allow tabs in .bvec/.bval

dcm2niix and dicm2nii historically output .bvec/.bval that are tab separated. BIDS 1.0.0 specifies that those files should be space separated. A compromise might involve specifying those files as whitespace separated.
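
For readers, accepting any whitespace is cheap. As a minimal sketch (the function name `parse_gradient_table` is hypothetical), Python's `str.split()` without arguments already treats spaces and tabs alike:

```python
def parse_gradient_table(text):
    """Parse .bval/.bvec content separated by spaces, tabs, or a mix."""
    rows = []
    for line in text.splitlines():
        # split() with no argument splits on any run of whitespace
        fields = line.split()
        if fields:
            rows.append([float(field) for field in fields])
    return rows


space_separated = "0 1000 1000\n"
tab_separated = "0\t1000\t1000\n"
print(parse_gradient_table(space_separated) == parse_gradient_table(tab_separated))
```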

Original authors: Unknown

Treat associated modalities as independent

I am not completely set on this, but I wanted to float the idea to stir up some discussion. Basically, my idea is to move "associated data", like events and physio, to the beh data type/folder in all cases- not just when there's no corresponding imaging data.

I see two benefits:

A simplification of the rules. You don't need to know that events only go in beh if they don't have an associated functional scan- you just organize them the same way regardless.
Implicit support of multimodal acquisitions. If you acquired both fMRI and EEG at the same time, you don't need to worry about where to place these associated data files.

The main cost is in interpretability. The filenames would either have to match up in terms of entities or there would need to be some metadata field with a unique identifier (perhaps in _scans.tsv? see bids-standard/bids-specification#529).

Add a compulsory header to every JSON file

https://groups.google.com/forum/#!topic/bids-discussion/TjxOKEB1DD4

Add a 'header' block to every JSON file that specifies the type of parameter file that it is and a spec version. This makes it a bit easier to verify that the file is what you think it is. This is particularly useful because the files have very few mandatory keys.
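
To make the idea concrete, here is a rough sketch of what such a check could look like. The key names `Header`, `FileType` and `SpecVersion` are assumptions made up for this example; the suggestion does not define them.

```python
import json


def check_header(sidecar_text, expected_type):
    """Return True if the sidecar declares the expected file type.

    The "Header", "FileType" and "SpecVersion" keys are hypothetical.
    """
    data = json.loads(sidecar_text)
    header = data.get("Header")
    if not isinstance(header, dict):
        return False  # no header block at all
    return header.get("FileType") == expected_type and "SpecVersion" in header


example = '{"Header": {"FileType": "events", "SpecVersion": "2.0"}}'
print(check_header(example, "events"))
```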

Original authors: @ccraddock

Replace ce entity with one that applies to MRI and PET data equally well

Currently, MRI uses ce (contrast enhancing agent) to indicate exogenous chemicals used to enhance contrasts, while the PET BEP, which uses tracers, plans to use the acq entity because ce doesn't reflect common terminology for PET users.

I propose that we define a new entity, perhaps called tracer, that would work for both technologies and would replace ce.

The addition of a tracer entity was proposed by @mnoergaard in bids-standard/bids-specification#633 (comment). In the hopes of not cluttering up the specification with a new entity in BIDS 1.X, I think it would be best to replace the existing ce in 2.0.

Support for multiple 3D files in addition to 4D files for bold

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/bids-discussion/btpHFtu9xWo/dProjNAmAwAJ

It seems the discussion converged to use a _part-xxx key/value pair in the filename for data stored in multiple files. Could the same principle be used to allow 3D NIfTI files as well as 4D NIfTI files? This was already mentioned in another thread:

https://groups.google.com/forum/#!msg/bids-discussion/zFkdHeLcuMc/eDORAobIAAAJ

but the suggestion was lost in the discussion.

Original authors: @gllmflndn

Allowed characters in labels/keys

https://docs.google.com/document/d/1HFUkAEE-pB-angVcYe6pf_-fVf4sCpOHKesUvfb8Grc/edit?disco=AAAAB1zsEbg

Some labels that are desired to be used consist of concatenated abbreviations, and would be more readable with a separator character (e.g., "32k-fs-LR"). This restriction seems excessive.

One possibility is to actually allow dashes in values, while keeping them forbidden in keys. This can still be parsed consistently by splitting everything on underscores, and then splitting each of those on the first dash.
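
The parsing rule described above can be sketched in a few lines. The function name `parse_entities` and the `space` key in the example are illustrative assumptions; chunks without a dash are treated as the suffix.

```python
def parse_entities(stem):
    """Split on underscores, then split each chunk on its first dash."""
    entities = {}
    suffix = None
    for chunk in stem.split("_"):
        # partition() splits on the first dash only, so any further
        # dashes stay inside the value
        key, sep, value = chunk.partition("-")
        if sep:
            entities[key] = value
        else:
            suffix = chunk  # no dash: treated as the suffix in this sketch
    return entities, suffix


print(parse_entities("sub-01_space-32k-fs-LR_T1w"))
```

Because keys never contain a dash, the first dash in each chunk is always the key/value boundary, which is why the split stays unambiguous.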

If this is unpalatable, perhaps another cross-platform-allowed filename character can be added to the set of allowed characters, such as '=' or '+'. These are not as easy to read as a dash, but at least gives people an option for cases that are otherwise difficult to resolve.

Original authors: Unknown

Change RepetitionTime definition to the same one as DICOM field 0018,0080

The current definition of RepetitionTime in BIDS is not the same as the one in DICOM, which might cause confusion. The new definition would be:

The period of time in msec between the beginning of a pulse sequence and the beginning of the succeeding (essentially identical) pulse sequence.

RepetitionTime would also not be mandatory for _bold files, and its previous role would be taken over by a new mandatory TemporalResolution field (or similar, TBD) defined as:

The time in seconds between the beginning of an acquisition of one volume and the beginning of acquisition of the volume following it. Please note that this definition includes time between scans (when no data has been acquired) in case of sparse acquisition schemes. This value needs to be consistent with the ‘pixdim[4]’ field (after accounting for units stored in ‘xyzt_units’ field) in the NIfTI header. This field is mutually exclusive with VolumeTiming.
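
The consistency requirement with the NIfTI header could be checked along these lines. This is a sketch only: the function names and the tolerance are assumptions, and the header values are passed in as plain numbers rather than read from a file. The temporal unit codes (8 = seconds, 16 = milliseconds, 24 = microseconds, stored in bits 3 to 5 of `xyzt_units`) come from the NIfTI-1 standard.

```python
# NIfTI-1 temporal unit codes, mapped to a scale factor in seconds
NIFTI_TIME_UNITS_TO_SECONDS = {8: 1.0, 16: 1e-3, 24: 1e-6}


def pixdim4_in_seconds(pixdim4, xyzt_units):
    """Convert pixdim[4] to seconds using the temporal bits of xyzt_units."""
    time_code = xyzt_units & 0x38  # keep bits 3-5 only
    if time_code not in NIFTI_TIME_UNITS_TO_SECONDS:
        raise ValueError(f"unknown or non-temporal unit code: {time_code}")
    return pixdim4 * NIFTI_TIME_UNITS_TO_SECONDS[time_code]


def is_consistent(temporal_resolution_s, pixdim4, xyzt_units, tol=1e-3):
    """Compare a TemporalResolution value (seconds) with the header.

    The tolerance of 1 ms is an arbitrary choice for this sketch.
    """
    header_value = pixdim4_in_seconds(pixdim4, xyzt_units)
    return abs(temporal_resolution_s - header_value) <= tol


# xyzt_units = 10 means millimetres (2) plus seconds (8)
print(is_consistent(2.0, 2.0, 10))
# xyzt_units = 18 means millimetres (2) plus milliseconds (16)
print(is_consistent(2.0, 2000.0, 18))
```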

Relevant discussion:

https://groups.google.com/d/msg/bids-discussion/MLUqmcD1XSY/_lMkr00yAwAJ

https://groups.google.com/d/msg/bids-discussion/wtolT5qPjy0/k2avH_1mCAAJ

Original authors: @mharms

Change FLASH for non-proprietary name

FLASH is specific to Siemens. GE calls it SPGR, Philips FFE. I would use an acronym that is not vendor-specific, e.g. SGRE (for spoiled gradient echo), or simply GRE (and the information about the spoiler could be added in the descriptive fields).

Original authors: @jcohenadad

Restructuring organization for participant level grouping

In BIDS thus far, the notion of source data and derived data is a little contrived/vague. For example, a multi-echo T1-weighted recon that comes out of the scanner from a MEMPRAGE sequence is considered source data, while the FA image that comes out is not considered source data.

As scanners and other instruments get more advanced and start generating what we traditionally call derivatives (think GPU based processing on the scanner), this will lead to questions of where data goes.

To simplify consideration, the possibility I would like the BIDS community to consider is to separate data not by source vs derivatives, but by participant vs ~~aggregate~~ non-individual. As examples:

Participant

source dicoms
freesurfer recon
fmriprep output
meg windows around individual stimuli
average ERP response
...

~~Aggregate~~ Non-individual

Templates
group statistical maps
(partial) correlations
...

This makes it, in my opinion, simpler to consider with regard to both metadata and with respect to provenance.

Would love to hear thoughts on this potential reframing.

MINOR: Replace "units" with "unit" in channels.tsv

All other column names are specified in singular (e.g., type, description, …) and it is not logical to have a unitS column.

Original authors: @sappelhoff

Move task information from data jsons to events jsons

The following metadata fields describe tasks, and are required by the specification to be placed in the data file's associated metadata file instead of the events file's json:

TaskName
Instructions
TaskDescription
CogAtlasID
CogPOID

I think that it would be preferable to have these fields in the _events.json file instead, starting at BIDS 2.0.
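
A migration could be as simple as splitting the existing sidecar into two dictionaries. The sketch below assumes the sidecars are already loaded as Python dicts; the function name `split_task_metadata` is hypothetical.

```python
# Task-describing fields listed above
TASK_FIELDS = ("TaskName", "Instructions", "TaskDescription",
               "CogAtlasID", "CogPOID")


def split_task_metadata(data_sidecar):
    """Return (remaining data metadata, task metadata for _events.json)."""
    task = {k: v for k, v in data_sidecar.items() if k in TASK_FIELDS}
    rest = {k: v for k, v in data_sidecar.items() if k not in TASK_FIELDS}
    return rest, task


bold_sidecar = {"RepetitionTime": 2.0, "TaskName": "rest"}
print(split_task_metadata(bold_sidecar))
```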

This stems from bids-standard/bids-specification#573 (comment).

Homogenize "Subject" Nomenclature

We are currently using the terms “participant” and “subject” interchangeably. I propose we make the names homogeneous. In order to afford better extendability with preclinical research, the term “subject” would be significantly more apt.
GitHub Issue: bids-standard/bids-specification#384

This change would include renaming the participants.tsv file to subjects.tsv, and the participant_id column in said file to subject.
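
For existing datasets, the column rename is mechanical. Here is a minimal sketch that works on the file content as text; the function name is hypothetical, and renaming the file itself to subjects.tsv is left to the caller.

```python
def convert_participants_tsv(text):
    """Rename the participant_id column to subject in TSV content."""
    lines = text.splitlines()
    if not lines:
        return text  # empty file: nothing to rename
    header = lines[0].split("\t")
    header = ["subject" if name == "participant_id" else name
              for name in header]
    return "\n".join(["\t".join(header)] + lines[1:]) + "\n"


old = "participant_id\tage\nsub-01\t34\n"
print(convert_participants_tsv(old))
```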

Original authors: @TheChymera

Multi-site/center studies

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/bids-discussion/SwH-1KRnBU0/oCx0ynEpBAAJ

Our group would also be interested in finding a way to incorporate a way of distinguishing between sites for the datasets we work with. I personally would prefer to use the key 'site-' rather than 'centre-', as 'centre' is not a shared spelling between British and American English.

Maybe overall, all the participants from a single site could also live within a directory for that site. So the directory structure might be amended to look like:

/site-<site_label>/sub-<participant_label>/[ses-<session_label>/]
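
As a sketch of how tools might build paths under this layout (the function name `subject_dir` and the labels in the example are made up for illustration):

```python
from pathlib import PurePosixPath


def subject_dir(site, sub, ses=None):
    """Build site-<site_label>/sub-<participant_label>/[ses-<session_label>]."""
    path = PurePosixPath(f"site-{site}") / f"sub-{sub}"
    if ses is not None:
        # the session level is optional, as in the template above
        path = path / f"ses-{ses}"
    return path


print(subject_dir("nyu", "01", "1"))
print(subject_dir("nyu", "01"))
```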

Original authors: @jpellman

Store date/time information in the JSON sidecar instead of _scans.tsv file

See rordenlab/dcm2niix#93

Original authors: @chrisgorgo

Consider replacing TSV files with a fileformat that supports built in metadata

Contestants:

ECSV - https://github.com/astropy/astropy-APEs/blob/master/APE6.rst
CSVY - http://csvy.org/
See also - https://twitter.com/hadleywickham/status/946836547097251840

Original authors: Unknown
