lfpanalysis's People

Contributors: aliefink, christinamaher, fuqixiu, seqasim, shawnrhoads

lfpanalysis's Issues

Missing custom ROI label excel after pip install

Unable to determine custom ROI labels using analysis_utils.select_rois_picks for bipolar re-referenced channels, because the custom ROI Excel file (YBA_ROI_labelled.xlsx) is missing from the package data after a pip install.
Recommendations:

  • Update setup.py to include include_package_data=True (see the sketch after this list)
  • Save YBA_ROI_labelled data as .csv instead of .xlsx
  • Add optional argument to analysis_utils.select_rois_picks to allow individualized custom ROI labels
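
A sketch of the first recommendation, assuming the spreadsheet lives under LFPAnalysis/data/ (the path and patterns here are hypothetical):

from setuptools import setup, find_packages

setup(
    name='LFPAnalysis',
    packages=find_packages(),
    include_package_data=True,  # ship non-code files declared in MANIFEST.in
    package_data={'LFPAnalysis': ['data/*.xlsx', 'data/*.csv']},  # or list the files explicitly
)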

issues with importing local environment.yml

Hey Salman, I was debugging the local import of your environment with Shawn. I think you need to re-export your environment on your Mac using the command below, so it doesn't carry extra build information with the packages.

Windows:
conda env export --no-builds | findstr -v "prefix" > environment.yml
Mac:
conda env export --no-builds | grep -v "prefix" > environment.yml

In addition, I created a temporary testing folder to test env installation.

environment.yml set up module issues

I followed the environment installation instructions on my new desktop computer. Installation and upgrade both worked well. I was able to change the kernel and interpreter in VS Code to lfp_env. However, a number of modules were not available for import (including FOOOF). Are we expected to manually add the modules to the conda lfp_env after installation? (I added scipy and seaborn manually in Anaconda Navigator.)

Thanks for any help.


speed up pre-processing code

Currently, it takes ~3 minutes to load and re-reference neural data for one subject. This is too slow. Presumably most of the wasted time comes from the iterative process for white-matter re-referencing. There should be a more efficient (read: vectorized) way to do this and save time.
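
A minimal sketch of what a vectorized pass could look like, assuming each channel's white-matter reference has already been selected (data and wm_ref_ix are hypothetical names, not the repo's actual variables):

import numpy as np

def vectorized_wm_reref(data, wm_ref_ix):
    # data: (n_channels, n_times) array; wm_ref_ix: (n_channels,) array giving
    # the index of each channel's white-matter reference channel.
    # Fancy indexing gathers every reference trace in one shot, replacing the
    # per-channel Python loop with a single array subtraction.
    return data - data[wm_ref_ix, :]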

Repeated downsampling

The current approach downsamples once at the beginning of preprocessing and again in the make_epochs function. Consider removing the second downsampling step.

lfp.preprocess_utils.wm_ref() misses oob/wm elecs that were not manually examined

The function seems to skip oob and wm electrodes that do not have any label in their associated manual examination column. Removing the auto detect step from the for loop that creates wm/oob indices seems to resolve this issue.

import numpy as np

# Collect white-matter (wm) and out-of-brain (oob) indices from both the
# manual examination column and the automatic localization, then combine.
wm_elec_ix_manual = []
wm_elec_ix_auto = []
oob_elec_ix_manual = []
oob_elec_ix_auto = []

# Manual labels (the column name varies across recon sheets; see the related issue below)
if 'Manual Examination' in elec_data.keys():
    wm_elec_ix_manual = wm_elec_ix_manual + [ind for ind, data in elec_data['Manual Examination'].str.lower().items() if data=='wm' and elec_data['label'].str.lower()[ind] not in bad_channels]
    oob_elec_ix_manual = [ind for ind, data in elec_data['Manual Examination'].str.lower().items() if data=='oob']
elif 'ManualExamination' in elec_data.keys():
    wm_elec_ix_manual = wm_elec_ix_manual + [ind for ind, data in elec_data['ManualExamination'].str.lower().items() if data=='wm' and elec_data['label'].str.lower()[ind] not in bad_channels]
    oob_elec_ix_manual = [ind for ind, data in elec_data['ManualExamination'].str.lower().items() if data=='oob']

# Automatic labels from the gray/white matter localization column
wm_elec_ix_auto = wm_elec_ix_auto + [ind for ind, data in elec_data['gm'].str.lower().items() if data=='white' and elec_data['label'].str.lower()[ind] not in bad_channels]
oob_elec_ix_auto = [ind for ind, data in elec_data['gm'].str.lower().items() if data=='unknown']

wm_elec_ix = np.unique(wm_elec_ix_manual + wm_elec_ix_auto)
oob_elec_ix = np.unique(oob_elec_ix_manual + oob_elec_ix_auto)
all_ix = elec_data.index.values
gm_elec_ix = np.array([x for x in all_ix if x not in wm_elec_ix and x not in oob_elec_ix])

mislabeling in elec_df

In the condensed notebook, the following line is used to replace the monopolar labels in elec_df with the referenced labels in your mne data .ch_names:

elec_df['label'] = epochs_all_subjs_all_evs[subj_id][event].ch_names

As noted by @shawnrhoads, this has bug potential if the .ch_names are not in the same order as the elec_df labels!
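
A hedged guard against that, assuming the referenced names keep the original anode label as a '-'-separated prefix (e.g. 'A1-A2'; adjust if the naming differs):

ch_names = epochs_all_subjs_all_evs[subj_id][event].ch_names
assert len(ch_names) == len(elec_df), 'channel count mismatch'
# Check that each referenced name lines up with its monopolar row before overwriting
assert all(cn.split('-')[0] == lab for cn, lab in zip(ch_names, elec_df['label'])), \
    'ch_names are not in elec_df row order'
elec_df['label'] = ch_names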

nlx_utils throws a lot of warnings while parsing header and datetime from .ncs files

I don't know the implications of this yet, but worth investigating

Lines 76-82

    # Try to read the original file path
    try:
        assert hdr_lines[1].split()[1:3] == ['File', 'Name']
        hdr[u'FileName']  = ' '.join(hdr_lines[1].split()[3:])
        # hdr['save_path'] = hdr['FileName']
    except:
        warnings.warn('Unable to parse original file path from Neuralynx header: ' + hdr_lines[1])

Lines 90-98

    # Read the parameters, assuming "-PARAM_NAME PARAM_VALUE" format
    for line in hdr_lines[4:]:
        try:
            name, value = line[1:].split()  # Ignore the dash and split PARAM_NAME and PARAM_VALUE
            hdr[name] = value
        except:
            warnings.warn('Unable to parse parameter line from Neuralynx header: ' + line)

    return hdr

Lines 128-139

    # Parse a datetime object from the idiosyncratic time string in Neuralynx file headers
    try:
        tmp_date = [int(x) for x in time_string.split()[4].split('/')]
        tmp_time = [int(x) for x in time_string.split()[-1].replace('.', ':').split(':')]
        tmp_microsecond = tmp_time[3] * 1000
    except:
        warnings.warn('Unable to parse time string from Neuralynx header: ' + time_string)
        return None
    else:
        return datetime.datetime(tmp_date[2], tmp_date[0], tmp_date[1],  # Year, month, day
                                 tmp_time[0], tmp_time[1], tmp_time[2],  # Hour, minute, second
                                 tmp_microsecond)

add site-specificity to data loading code

Sometimes we record across different hospitals, which have different naming conventions, etc. Either there should be one universal solution for loading these data (lol), or we should pass the site as an input to the data-loading functions, with if conditions to do the site-specific data wrangling.
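
One possible shape for the second option, with placeholder site names (everything here is hypothetical):

def load_neural_data(load_path, site):
    # Branch on site-specific naming conventions in one place
    if site == 'site_a':
        ...  # e.g. strip a site-specific channel-name prefix
    elif site == 'site_b':
        ...  # e.g. handle a different file layout
    else:
        raise ValueError(f'Unknown recording site: {site}')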

mni x y z coordinates as float to calculate wm electrode distance

wm_elec_dist = np.linalg.norm(elec_data.loc[wm_elec_ix, ['x', 'y', 'z']].values.astype(float) - elec_loc, axis=1)

The coordinate columns can be read from the spreadsheet as object/string dtype, so the distance calculation needs an explicit float cast. Use the line above on line 250 of LFPAnalysis/lfp_preprocess_utils.py instead of:

wm_elec_dist = np.linalg.norm(elec_data.loc[wm_elec_ix, ['x', 'y', 'z']].values - elec_loc, axis=1)

'mne.io' has no attribute 'fif'

Bug detected in the detect_IEDs function of lfp_preprocess_utils.py, line 567: elif type(mne_data) == mne.io.fif.raw.Raw

elif type(mne_data) == mne.io.fif.raw.Raw:
    data_type = 'continuous'
    n_times = mne_data._data.shape[1]

AttributeError: module 'mne.io' has no attribute 'fif'

This can be fixed by changing the line to elif type(mne_data) == mne.io.Raw.
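
As an aside (not from the original issue), an isinstance() check against the documented base class would also sidestep private module paths and catch all raw types:

elif isinstance(mne_data, mne.io.BaseRaw):
    data_type = 'continuous'
    n_times = mne_data._data.shape[1]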

Errors with electrode name matching

Sometimes the electrode names from the .edf file do not match those from the electrode label sheet. We use Levenshtein distance to mitigate this, but it sometimes results in 'ties' between the correct match and another, similarly named electrode.
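
A minimal sketch of explicit tie handling, assuming the python-Levenshtein package is available (the function and argument names here are hypothetical):

from Levenshtein import distance

def match_electrode_name(name, candidates):
    # Edit distance from the .edf name to every label-sheet candidate
    dists = [distance(name.lower(), c.lower()) for c in candidates]
    best = min(dists)
    ties = [c for c, d in zip(candidates, dists) if d == best]
    if len(ties) > 1:
        # Surface the ambiguity for manual resolution instead of silently picking one
        raise ValueError(f'Ambiguous match for {name}: {ties}')
    return ties[0]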

re-factor pre-processing and artifact detection

I want to change up the way we pre-process the data and detect IEDs, to a) improve data quality, b) reduce data rejection rates, and c) minimize inter-researcher variability from manual decision-making. To that end, I want to:

  1. Remove manual channel rejection prior to re-referencing
  2. Only perform artifact identification and rejection prior to epoching + baselining
  3. Revise IED detection to a higher z-score threshold (4->5)
  4. Add changepoint detection in the raw data with the same z-score threshold (z=5)
  5. NaN out the 100 ms before and after each artifact detected (see the sketch after this list)
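
A sketch of what step 5 could look like, with hypothetical names (data as an (n_channels, n_times) float array, sr as the sampling rate in Hz, artifact_ix as per-channel sample indices of detected artifacts):

import numpy as np

def nan_around_artifacts(data, ch, artifact_ix, sr, pad_ms=100):
    # Mask pad_ms of data on either side of each detected artifact with NaN
    pad = int(pad_ms / 1000 * sr)
    for ix in artifact_ix:
        data[ch, max(ix - pad, 0):min(ix + pad, data.shape[1])] = np.nan
    return data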

References:

#3 - #5

Add better support for continuous, non-epoched data analysis

In particular, people may want to perform sliding window analyses of resting state data. In this case, my favorite approach would be to use MNE's built-in tool for making fixed-length epochs (e.g. 500 ms) with an overlap/step size (e.g. 250 ms), like so:

epochs = mne.make_fixed_length_epochs(raw=raw, duration=0.5, overlap=0.25)

This is pretty powerful as it can basically turn any epoched analysis we perform in this repo into a sliding window analysis of longer stretches of data!

Some bugs in `make_mne()` for nlx format

I ran into a few issues trying to load the .ncs files from neuralynx into an mne object. Attaching my notes and potential solutions.

1. lfp, sr, and ch_name are not defined on line 652:

lfp.append(fdata['data'])
sr.append(fdata['sampling_rate'])
ch_name.append(str(ch_num))

Potential solution: add the following before the ncs_files loop, assuming lfp is a numpy array and the rest are lists (which is what mne.create_info() and mne.io.RawArray() require):

lfp = np.array([])
sr = []
ch_name = []

Then lines 652-654 should be replaced with something like (but see point 3 below for issue with lfp when channels are different sizes):

if lfp.size == 0:
    lfp = fdata['data']
else:
    lfp = np.vstack([lfp, fdata['data']])
sr.append(fdata['sampling_rate']) 
ch_name.append(str(ch_num))

2. Line 642 globs all the .ncs files in load_path, but these might not match the pattern needed for the ch_num assignment:

ch_num = int(chan_name[4:])

Potential solution 1: If ch_name should refer to files that match the pattern _0000.ncs to _9999.ncs (assuming this because the value is converted to an integer), then maybe add something like this in place of line 642:

import re  # add at the top of the module if it is not already imported

pattern = re.compile(r"_\d{4}\.ncs")  # regex pattern to match "_0000.ncs" through "_9999.ncs"
ncs_files = [x for x in glob(f'{load_path}/*.ncs') if re.search(pattern, x)]

If the case, then would also need to change line 646 to ch_num = int(chan_name[-4:])

Potential solution 2: If ch_name should refer to the base name of each file (without the .ncs extension), then add chan_name.replace(".ncs","") and remove the int() call.

3. mne.create_info() fails when there are multiple unique values in sr and ch_type is not defined:

info = mne.create_info(ch_name, np.unique(sr), ch_type)
mne_data = mne.io.RawArray(lfp, info)

Should ch_type='seeg' here?

Also, when there are multiple unique values in sr, appending to lfp fails in the loop because axis dimensions must be consistent.
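
Relatedly, mne.create_info() expects a single sampling frequency, so a guard along these lines (a sketch reusing the same variable names) would fail loudly on mixed rates instead of cryptically:

sfreqs = np.unique(sr)
if len(sfreqs) > 1:
    # Resample to a common rate (or bail out) before building the Raw object
    raise ValueError(f'Channels have mixed sampling rates: {sfreqs}')
info = mne.create_info(ch_names=ch_name, sfreq=float(sfreqs[0]), ch_types='seeg')
mne_data = mne.io.RawArray(lfp, info)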

lfp_utils.ref_mne only checks for "Manual Examination" not "ManualExamination"

There are inconsistencies in our anatomical reconstruction naming conventions: some people name their manual examination column "Manual Examination" and others "ManualExamination", with no space. Currently the re-referencing function only accounts for the column label WITH a space. It needs to accommodate both for the WM, OOB, and YBA labels that are manually input by Saez lab members.
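
One hedged way to accept both spellings is to normalize the column name once, up front (elec_data as in the repo's other snippets):

if 'ManualExamination' in elec_data.columns:
    # Normalize to the spaced spelling so downstream checks need only one key
    elec_data = elec_data.rename(columns={'ManualExamination': 'Manual Examination'})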

Add example data to repo

The Condensed Analysis notebook, which should serve as a tutorial for new users, currently relies on my data paths. It should instead use sample data that exists in the repo.

`sync_utils.pulsealign()` bug using `Tuple[np.ndarray, np.ndarray]`

Importing sync_utils with Python 3.10.8 throws this error:

[autoreload of LFPAnalysis.sync_utils failed: Traceback (most recent call last):
  File "/sc/arion/work/rhoads01/envs/lfp_env/lib/python3.10/site-packages/IPython/extensions/autoreload.py", line 261, in check
    superreload(m, reload, self.old_objects)
  File "/sc/arion/work/rhoads01/envs/lfp_env/lib/python3.10/site-packages/IPython/extensions/autoreload.py", line 459, in superreload
    module = reload(module)
  File "/sc/arion/work/rhoads01/envs/lfp_env/lib/python3.10/importlib/__init__.py", line 169, in reload
    _bootstrap._exec(spec, module)
  File "<frozen importlib._bootstrap>", line 619, in _exec
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/sc/arion/projects/guLab/Shawn/tools/LFPAnalysis/LFPAnalysis/sync_utils.py", line 23, in <module>
    def pulsealign(beh_ms: np.ndarray, pulses: np.ndarray, windSize: int = 30) -> Tuple[np.ndarray, np.ndarray]:
NameError: name 'Tuple' is not defined
]

I believe this is because Tuple is never imported from typing, so the capitalized name is undefined when the annotation is evaluated:

def pulsealign(beh_ms: np.ndarray, pulses: np.ndarray, windSize: int = 30) -> Tuple[np.ndarray, np.ndarray]:

Changing to the lowercase builtin tuple (subscriptable since Python 3.9) removes the error.

def pulsealign(beh_ms: np.ndarray, pulses: np.ndarray, windSize: int = 30) -> tuple[np.ndarray, np.ndarray]:
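
The other standard fix keeps the capitalized annotation and also works on Pythons older than 3.9:

from typing import Tuple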

%matplotlib notebook plots not rendering in VScode

%matplotlib notebook does not render the plot for omitting bad channels (see the lines below) when the notebook is run in VS Code.

%matplotlib notebook
fig = mne_data.plot(start=0, duration=120, n_channels=8, scalings=mne_data._data.max()/20)
fig.fake_keypress('a')
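
VS Code's notebook UI does not ship the nbagg backend that %matplotlib notebook requires. Assuming ipympl is installed in the environment, one possible workaround is the widget backend (or %matplotlib qt for an external window):

%matplotlib widget
fig = mne_data.plot(start=0, duration=120, n_channels=8, scalings=mne_data._data.max()/20)
fig.fake_keypress('a')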

Grabbing data from `pairwise_connectivity` object in `oscillation_utils.compute_surr_connectivity_time`

When n_pairs == 1, the coherence value should be grabbed with [:, 2] indexing (see the replacement below) instead of the current implementation, which uses:

n_pairs = len(indices[0])
if n_pairs == 1:
    pairwise_connectivity = pairwise_connectivity.reshape((pairwise_connectivity.shape[0], n_pairs))   

which could be replaced with:

n_pairs = len(indices[0])
if n_pairs == 1:
    pairwise_connectivity = pairwise_connectivity[:,2]

This is because np.squeeze(Connectivity_object.get_data())[:, 0] yields all zero values; the zeros are caused specifically by the [:, 0] indexing, because .get_data() outputs all the cells of the coherence matrix.

Maybe use Connectivity_object.get_data(output='dense') to check what should be used.
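
For example, comparing the two output layouts side by side (a sketch; Connectivity_object as above) would show which column actually holds the pair of interest:

dense = Connectivity_object.get_data(output='dense')  # full node-by-node matrix
raveled = Connectivity_object.get_data()              # default flattened layout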
