
Scripts for Managing fMRI Data


Goals

This document describes a suite of scripts written to take an SPM batch file created for one subject and generate identical batch files for many subjects. These batch files are then submitted to the cluster via bsub or sbatch, eliminating the need for the GUI and speeding up the process of analyzing (or reanalyzing) many subjects at once. To use the scripts, create your desired batch (preprocessing or level 1) in the SPM interface and use the ‘save as script’ option when saving the batch. This creates a file called batchname_job.m, which will serve as your template for the scripts below.

Prerequisites

System requirements: All scripts require that you be on the cluster, and were tested using Matlab 2012a, SPM version 8.4667, and Python version 2.7.3. Use with older versions of Matlab is unsupported and is known to fail with Matlab 2007. To access the scripts, you should be using the default environment as specified on the FAQ site (http://cbs.fas.harvard.edu/science/core-facilities/neuroimaging/information-investigators/faq). Please also see the FAQ for detailed information on how to select your Matlab and SPM versions.

Directory structure: The scripts rely on the following directory structure which will automatically be created for you via the first script, getSubjectSPM.py.

|-mystudydir   
|---120418_mysubject
|------RAW
|---------dicom files...
|------analysisdir
|---------paradigms
|------------run001,run002,...
|---------------cond1.txt, cond2.txt...
|---------spm files from first level analysis...
|------batch
|---------spm batches...
|------preproc
|---------converted dicoms, preprocessed data...
|------output_files
|---------files containing error messages from scripts

Summary

The following scripts have been written and should be included in your default path if you are using the one listed on the FAQ site.

| Script Name | Details |
|---|---|
| getSubjectSPM.py | Pulls data using cbsget and creates a directory structure |
| genPreprocBatches.py | Generates preprocessing batches from a template batch and executes them |
| genL1Batches.py | Generates level 1 batches from a template and executes them |

getSubjectSPM.py

getSubjectSPM.py is a script that pulls data from the CBSCentral repository (or a directory), converts it to SPM format (img or nii), and creates a directory structure that will allow the use of the other scripts described here.

It takes the following inputs in any order. For help, type getSubjectSPM.py -h:

getSubjectSPM.py -s SUBJECTID -b BOLDRUNS -t STRUCTRUNS -m FIELDMAP -p DESTINATIONPATH -d DICOMPATH -n USENIFTI

| Flag | Stands for | Description |
|---|---|---|
| -h | help | shows usage message and exits |
| -s | subjectid | the name of the subject |
| -b | boldruns | a list of run numbers for the BOLD scans, separated by spaces |
| -t | structrun | the run number for the structural scan |
| -m | fieldmap runs | the run numbers for the fieldmap; if present, needs two runs |
| -p | destination path | the FULL path to the location to place the output, no relative paths (i.e. no ../ or ~) |
| -d | dicom path | the path to a single directory containing all of the unzipped dicoms (as opposed to using CBSCentral) |
| -n | use nifti | if present, will use the single-file NIFTI format (nii) instead of img/hdr |
| --analysis-dir | analysis directory | a list of directories for future analyses; default is analysis. Useful if you collected multiple experiments in one session |
| --vmem | virtual memory | amount of memory requested; required if using sbatch, default is 1024MB |
| --time | time | amount of time the script will need; required if using sbatch, default is 2 hours |
| --run-with | what to execute the script with | choices are bsub, sbatch, or dry; dry just outputs the call to getsubject.m. Default is sbatch |

For example:

~getSubjectSPM.py -s 120418_spmtest -b 6 10 11 -t 3 -m 6 7 -p /ncf/mylab/myspace/myexp/ -n --analysis-dir analysis_univariate analysis_mvpa --run-with bsub~

For getting DICOMs locally as opposed to from CBSCentral (make sure not to have a final / after the path to the dicoms):

~getSubjectSPM.py -s 120418_spmtest -b 6 10 11 -t 3 -m 6 7 -p /ncf/mylab/myspace/myexp/ -d /ncf/mylab/myspace/myexp/dicoms/subj1 

This creates the following directory tree:

|-myexp   
|----120418_spmtest
|-------RAW
|-------analysis
|----------paradigms
|-------------run001,run002,...
|-------batch
|-------preproc
|-------output_files

Within the RAW directory is a tarball (subjectid.tar.gz) containing the DICOMs in compressed format. The preproc directory will contain the SPM-converted files, either .nii or .img/.hdr.

The files have also been renamed. Because they are already in the subject directory, the subjectid has been stripped from their names, and they are renamed as follows:

| File name | Description |
|---|---|
| f-run001-006.img | Image 6 of the first BOLD run |
| s-struct.img | The structural image for the subject |
| s-fieldmap-mag-01.img | The magnitude of the fieldmap (if provided) |
| s-fieldmap_phase.img | The phase of the fieldmap |
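The convention in the table above can be expressed as a small helper (illustrative only; spm_filename is a hypothetical function, not part of the scripts):

```python
def spm_filename(scan_type, run=None, image=None, ext="img"):
    """Build a filename following the renaming convention above.

    scan_type: "bold", "struct", "fieldmap-mag", or "fieldmap-phase".
    run/image: 1-based run and image numbers (BOLD and fieldmap only).
    Illustrative helper, not part of the actual scripts.
    """
    if scan_type == "bold":
        return "f-run%03d-%03d.%s" % (run, image, ext)
    if scan_type == "struct":
        return "s-struct.%s" % ext
    if scan_type == "fieldmap-mag":
        return "s-fieldmap-mag-%02d.%s" % (image, ext)
    if scan_type == "fieldmap-phase":
        return "s-fieldmap_phase.%s" % ext
    raise ValueError("unknown scan type: %s" % scan_type)
```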

Errors

If there is a problem with the script, the output will go to the screen (standard out) for debugging. The most likely issues are not having a config file for cbsget (see FAQ), having the wrong numbers for your BOLD runs, or a directory for the subject you are trying to unpack already existing.

genPreprocBatches.py

The goal of this script is to take a batch file created to perform preprocessing on a single subject and use it to analyze many subjects. This is done by saving your batch via the ‘save as script’ command in SPM, which creates a batchname_job.m file that will serve as your template batch. This batch will be applied to all of the subjects provided, which can include the original subject used to create the template. This script has been tested with fieldmap, slice time correction, motion correction, indirect spatial normalization, and smoothing. If you use any additional steps, you should check that the generated batches are correct by comparing them to the original.

genPreprocBatches.py -t TEMPLATE -p PATH -s SUBJECT1 SUBJECT2
or
genPreprocBatches.py -t TEMPLATE -p PATH -f SUBJECTFILE

| Flag | Stands for | Description |
|---|---|---|
| -h | help | provides usage message and then exits |
| -t | template batch | the full path to, and name of, the template batch created in the SPM GUI via a “save batch as script” command; ends in _job.m |
| -p | path | the path to the directory that contains all of your subjects |
| -s | subjid | a subjid to create and execute the batch on; can be a list separated by spaces |
| -f | subject file | a file containing your subjectids, one per line; can be used instead of the -s flag |

For example:

~genPreprocBatches -t /ncf/mylab/myspace/myexp/subject1/batch/preproc_job.m -p /ncf/mylab/myspace/myexp/ -s subject1 subject2~ 

This will create a batch file for each subject provided and save it in subjid/batch. It will then bsub the created batch. You can check that your submitted jobs are running via the bjobs command (see the FAQ for instructions).

Errors

If there is a problem with converting the template batch for a subject, the error messages will be placed in the study directory (mystudy above), with the name errors_preproc followed by the date and time (to the minute).

For example: errors_preproc2012_07_06_10h_23m

The output from running the batch (which comes via the bsub output) will be stored in subjid/output_files, with the name output_preproc followed by the date and time (to the minute). This is where errors thrown by Matlab or SPM will show up.

For example: output_preproc2012_06_20_11h_41m
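The timestamp suffix follows a fixed pattern (year_month_day_HHh_MMm); a sketch of building such a name (log_name is a hypothetical helper, not the scripts' actual code):

```python
import datetime

def log_name(prefix, when=None):
    """Build a log filename like output_preproc2012_06_20_11h_41m,
    i.e. a prefix plus the date and time to the minute.
    Illustrative helper only."""
    when = when or datetime.datetime.now()
    return when.strftime(prefix + "%Y_%m_%d_%Hh_%Mm")
```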

genL1Batches.py

The goal of this script is to take a batch file created to perform first level analysis on a single subject and use it to analyze many subjects. This is done by saving your batch via the ‘save as script’ command in SPM. This creates a batchname_job.m file, which will serve as your template batch. This batch will be applied to all of the subjects provided, which can include the original subject that was used to create the template. To run this script, you need to have your paradigm files constructed.

Creating batch

There are a few quirks about how you create your level 1 batch:

  1. Don't use the ‘Replicate Subject/Session’ option in fMRI model specification.

  2. The names you use for your conditions must match the names of the text files containing your stimulus onset values (see below), so don't put spaces in the names.

  3. When you make your contrasts in the Contrast Manager, you can use either the T- and F-contrasts option or the T-contrast (cond/sess based) option. Do not use the Replicate option. The cond/sess method is preferred, as it is harder to make errors; however, you will still need to build your F-contrast by hand.

Running script

genL1Batches -t TEMPLATE -p PATH -s SUBJECT1 SUBJECT2
or
genL1Batches -t TEMPLATE -p PATH -f SUBJECTFILE

| Flag | Stands for | Description |
|---|---|---|
| -h | help | provides usage message and then exits |
| -t | template batch | the full path to, and name of, the template batch created in the SPM GUI via a “save batch as script” command; ends in _job.m |
| -p | path | the path to the directory that contains all of your subjects |
| -s | subjid | a subjid to create and execute the batch on; can be a list separated by spaces |
| -f | subject file | a file containing your subjectids, one per line; can be used instead of -s |

For example:

~genL1Batches -t /ncf/mylab/myspace/myexp/subject1/batch/L1_job.m -p /ncf/mylab/myspace/myexp/ -s subject1 subject2~ 

This will create a batch file for each subject provided and save it in subjid/batch. It will then bsub the created batch. You can check that your submitted jobs are running via the bjobs command (see the FAQ for instructions).

Stimulus onset files

Within the analysis directory is a paradigms directory, with a directory for each run (run001, run002, ...). For the first level analysis, each condition should have its own onset text file, with each row being a single onset time. The name of the file should be the name given to the condition within the SPM batch, followed by the .txt extension (e.g. cond1.txt). Therefore, if you have 3 runs, you will end up with three text files for cond1: all called cond1.txt, but placed in the run directories run001, run002, and run003. If your stimulus is presented 4 times per run, then each of those files will have 4 rows, with each row giving the time in seconds (or TRs, depending on what you specify in your batch) when your stimulus was presented. These can be made in Matlab or any text editor.
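These onset files can also be generated programmatically; a sketch with made-up example onsets (write_onsets is a hypothetical helper, not part of the scripts):

```python
import os

def write_onsets(paradigms_dir, onsets):
    """Write one <condition>.txt per condition per run, one onset per row.

    onsets maps run number -> {condition name -> list of onset times in
    seconds (or TRs, matching the batch)}. The data below is example only.
    """
    for run, conds in onsets.items():
        run_dir = os.path.join(paradigms_dir, "run%03d" % run)
        os.makedirs(run_dir, exist_ok=True)
        for cond, times in conds.items():
            with open(os.path.join(run_dir, cond + ".txt"), "w") as f:
                f.write("\n".join(str(t) for t in times) + "\n")

# hypothetical paradigm: cond1 presented 4 times in each of 2 runs
write_onsets("analysis/paradigms",
             {1: {"cond1": [0, 24, 48, 72]},
              2: {"cond1": [6, 30, 54, 78]}})
```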

Errors

If there is a problem with converting the template batch for a subject, the error messages will be placed in the study directory (mystudy above), with the name errors_L1 followed by the date and time (to the minute).

For example: errors_L12012_07_06_10h_23m

The output from running the batch (which comes via the bsub output) will be stored in subjid/output_files, with the name output_L1 followed by the date and time (to the minute). This is where errors thrown by Matlab or SPM will show up.

For example: output_L12012_06_20_11h_41m

genArtFiles.py

The goal of this script is to set up the files and parameters for rerunning your level 1 analysis with ART. Currently, the global_mean type is hard-coded to 1 (standard), and motion_file_type is set to 0 (an SPM .txt file).

genArtFiles -p PATH -s SUBJECT1 SUBJECT2 -gt GLOBALTHRESHOLD -mt MOTIONTHRESHOLD -g DIFFGLOBAL -m DIFFMOTION -n NORMS
or
genArtFiles -p PATH -f SUBJECTFILE -gt GLOBALTHRESHOLD -mt MOTIONTHRESHOLD -g DIFFGLOBAL -m DIFFMOTION -n NORMS

| Flag | Stands for | Description |
|---|---|---|
| -h | help | provides usage message and then exits |
| -p | path | the path to the directory that contains all of your subjects |
| -s | subjid | a subjid to create and execute the batch on; can be a list separated by spaces |
| -f | subject file | a file containing your subjectids, one per line; can be used instead of -s |
| -gt | global mean threshold | threshold for excluding outliers, in standard deviations from the mean |
| -mt | motion threshold | threshold for excluding outliers, in mm of movement |
| -g | global diff | 1=yes, 0=no; whether to ‘Use Differences’ for the global mean threshold |
| -m | motion diff | 1=yes, 0=no; use movement differences rather than absolute movement from the first time point |
| -n | use norms | 1=combine all movement directions (linear and angular), 0=no |

For example:

~genArtFiles -p /ncf/mylab/myspace/myexp/ -s subject1 subject2 -gt 2 -mt .5 -g 0 -m 0 -n 1~ 

This will create a new directory called art_analysis, at the same level as the original analysis directory. It will contain several files needed for Art, or created by Art: art_config001.cfg, art_exec001.m, art_mask.hdr/img, art_mask_temporalfile.mat, and the SPM stats file. It will also create new regression files for regressing out outliers, with or without motion (art_regression_outliers_swrf-run001-001.mat or art_regression_outliers_and_movement_swrf-run001-001.mat). There will be one of each for every run.

Errors

If there is a problem creating the files for Art, the error messages will be placed in the study directory (mystudy above), with the name errors_ART followed by the date and time (to the minute).

For example: errors_ART_2012_07_06_10h_23m

The output from running the batch (which comes via the bsub output) will be stored in subjid/output_files, with the name output_ART followed by the date and time (to the minute). This is where errors thrown by Matlab or SPM will show up.

For example: output_ART2012_06_20_11h_41m

genL1ArtBatches.py

The goal of this script is to take a batch file created to perform first level analysis using ART outlier exclusion on a single subject and use it to analyze many subjects. The usage and output are the same as genL1Batches, except that the output goes into the art_analysis directory.

Running script

genL1ArtBatches -t TEMPLATE -p PATH -s SUBJECT1 SUBJECT2
or
genL1ArtBatches -t TEMPLATE -p PATH -f SUBJECTFILE

| Flag | Stands for | Description |
|---|---|---|
| -h | help | provides usage message and then exits |
| -t | template batch | the full path to, and name of, the template batch created in the SPM GUI via a “save batch as script” command; ends in _job.m |
| -p | path | the path to the directory that contains all of your subjects |
| -s | subjid | a subjid to create and execute the batch on; can be a list separated by spaces |
| -f | subject file | a file containing your subjectids, one per line; can be used instead of -s |

Errors

If there is a problem with converting the template batch for a subject, the error messages will be placed in the study directory (mystudy above), with the name errors_L1ART followed by the date and time (to the minute).

For example: errors_L1ART2012_07_06_10h_23m

The output from running the batch (which comes via the bsub output) will be stored in subjid/output_files, with the name output_L1ART followed by the date and time (to the minute). This is where errors thrown by Matlab or SPM will show up.

For example: output_L1ART2012_06_20_11h_41m

Acknowledgments

These scripts were written by Alex Storer, Caitlin Carey and Stephanie McMains with additional assistance from David Dodell-Feder.


Issues

get subject file format

I would like to add a flag to specify whether you want nifti or img files; this is an option in the SPM dicom-converting function spm_dicom_convert.

Currently the command is spm_dicom_convert(hdrs, 'all'), and I think it changes to

spm_dicom_convert(hdrs, 'all', 'img')

or spm_dicom_convert(hdrs, 'all', 'nii')

file for subjects in all scripts

All scripts but getsubject should take a file for subjects. I tried testing this with genPreprocBatches and got an error message that it couldn't separate on newlines, even though the file was newline-separated.

Verify Length of Art Regressors

We would expect there to be six motion parameters, and however many outliers came from ART. In our test, it looked like we got 7.

genL1ArtBatches

I wanted this to take a flag for whether you want to use the ART regressors with or without motion. Currently this is hard-coded as with motion. The place to use the flag would be lines 56/67/76 of genL1PostArt.m, where it sets the regressor files.

Script to run batches

Is it in parallel or in serial? It should take the output from genPreprocBatches and run the batches.

genL1Art

Add help comments; verify that it takes a file.

paradigm directory

We would like getsubject to make a directory called paradigms inside analysis for people to store their paradigms in.

genArtFiles -b should be -p

We have a flag for setting the basedir in genArtFiles. This appears to be the same flag as in genPreprocBatches and genL1Batches, but there it is called -p for path. I think we should keep this consistent.

BSUB Brainstorm

Here is where I will post all of my ideas about how to submit the jobs to the cluster:

  1. When making the batch scripts for each subject, we should also make bsub files for each. The format follows:

     #!/bin/sh
     #BSUB -q ncf
     #BSUB -J [SUBJECTID]
     #BSUB -o [SUBJECTID].out
     matlab -nodisplay -r "run('[BATCHNAME]');"

  2. After we make all of the batch/bsub scripts and determine which could not be made and why (e.g. wrong number of runs), we should send off the bsubs to the cluster.

  3. A small Matlab script (or multiple scripts) should be made that takes the batch, loads it, and runs it in SPM/ART. This is what will get bsub-ed (above I just called it "run").

  4. A script operating at the same level as the one that creates the batches (e.g. launched locally/concurrently) waits for the .out files for all subjects to be returned (progress bar?).

  5. If we could figure out a way to have Matlab print errors to stderr, then we could also check whether any subjects crashed. I am not sure how to do this, and looking on stackoverflow, it appears it may be impossible. In that case, we might actually need to write out our own error files as text files, which is added hassle but not terrible (just a try-catch).

  6. We should probably then clean up all of the .out and .err files and put them into a nice report text file, because no one likes getting 100 .out files.
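The bsub format in item 1 could be rendered per subject with a small helper (make_bsub_script is hypothetical, illustrative only):

```python
def make_bsub_script(subject_id, batch_name, queue="ncf"):
    """Render a per-subject bsub submission script following the
    format sketched in item 1 above. Illustrative only."""
    return "\n".join([
        "#!/bin/sh",
        "#BSUB -q %s" % queue,           # queue to submit to
        "#BSUB -J %s" % subject_id,      # job name
        "#BSUB -o %s.out" % subject_id,  # stdout log
        "matlab -nodisplay -r \"run('%s');\"" % batch_name,
    ]) + "\n"
```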

THE END.

getsubject help

Make the help for getsubject more complete, so it includes how things should be typed in and what to do if you don't want a flag.

cbsget erroring out

When cbsget errors out, it should remove the directories it created; otherwise it can't be rerun.

getsubject error messages

getsubject should throw a message when it fails to make all the files. In this case, it failed to read the second BOLD run (20) because the dicoms were gzipped.

Error message in Matlab:
In spm_dicom_headers>readdicomfile at 64
In spm_dicom_headers at 27
In getsubject at 134
Warning: "110808_12v32vis_SubjA_32ch_series_020_file_080.dcm.gz" is not a DICOM
file.

python getSubjectSPM.py -p /users/mcmains/mri01_users/mcmains/spm_scripting/test_1 -s test -b 13 20 -t 5 -d /users/mcmains/mri01_users/mcmains/coil_compare/110808_12v32vis_SubjA_32ch/RAW/spm_raw
