lfpanalysis's People

Contributors: aliefink, christinamaher, fuqixiu, seqasim, shawnrhoads

lfpanalysis's Issues

Missing custom ROI label excel after pip install

Unable to determine custom ROI labels using analysis_utils.select_rois_picks for bipolar re-referenced channels, because the custom ROI Excel file (YBA_ROI_labelled.xlsx) is missing from the package data after a pip install.
Recommendations:

  • Update setup.py to include include_package_data=True (see the sketch after this list)
  • Save YBA_ROI_labelled data as .csv instead of .xlsx
  • Add optional argument to analysis_utils.select_rois_picks to allow individualized custom ROI labels
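
A sketch of the first recommendation, assuming the spreadsheet lives under LFPAnalysis/data/ (the path and patterns here are hypothetical):

from setuptools import setup, find_packages

setup(
    name='LFPAnalysis',
    packages=find_packages(),
    include_package_data=True,  # ship non-code files declared in MANIFEST.in
    package_data={'LFPAnalysis': ['data/*.xlsx', 'data/*.csv']},  # or list the files explicitly
)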

issues with importing local environment.yml

Hey Salman, I was debugging the local import of your environment with Shawn. I think you need to re-export your environment on your Mac using the command below, so it doesn't carry extra build information with the packages.

Windows:
conda env export --no-builds | findstr -v "prefix" > environment.yml
Mac:
conda env export --no-builds | grep -v "prefix" > environment.yml

In addition, I created a temporary testing folder to test env installation.

environment.yml set up module issues

I followed the environment installation instructions on my new desktop computer. Installation and upgrade both worked well. I was able to change the kernel and interpreter in VS Code to lfp_env. However, a number of modules were not available for import (including FOOOF). Are we expected to manually add the modules to the conda lfp_env after installation? (I added scipy and seaborn manually in Anaconda Navigator.)

Thanks for any help.


speed up pre-processing code

Currently, it takes ~3 minutes to load and re-reference neural data for one subject. This is too slow. Presumably most of the wasted time comes from the iterative process for white-matter re-referencing. There should be a more efficient (read: vectorized) way to do this and save time.
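
A minimal sketch of what a vectorized pass could look like, assuming each channel's white-matter reference has already been selected (data and wm_ref_ix are hypothetical names, not the repo's actual variables):

import numpy as np

def vectorized_wm_reref(data, wm_ref_ix):
    # data: (n_channels, n_times) array; wm_ref_ix: (n_channels,) array giving
    # the index of each channel's white-matter reference channel.
    # Fancy indexing gathers every reference trace in one shot, replacing the
    # per-channel Python loop with a single array subtraction.
    return data - data[wm_ref_ix, :]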

Repeated downsampling

The current approach downsamples once at the beginning of preprocessing and again in the make_epochs function. Consider removing the second downsampling step.

lfp.preprocess_utils.wm_ref() misses oob/wm elecs that were not manually examined

The function seems to skip oob and wm electrodes that do not have any label in their associated manual examination column. Removing the auto detect step from the for loop that creates wm/oob indices seems to resolve this issue.

import numpy as np

# Collect white-matter (wm) and out-of-brain (oob) indices from both the
# manual examination column and the automatic localization, then combine.
wm_elec_ix_manual = []
wm_elec_ix_auto = []
oob_elec_ix_manual = []
oob_elec_ix_auto = []

# Manual labels (the column name varies across recon sheets; see the related issue below)
if 'Manual Examination' in elec_data.keys():
    wm_elec_ix_manual = wm_elec_ix_manual + [ind for ind, data in elec_data['Manual Examination'].str.lower().items() if data=='wm' and elec_data['label'].str.lower()[ind] not in bad_channels]
    oob_elec_ix_manual = [ind for ind, data in elec_data['Manual Examination'].str.lower().items() if data=='oob']
elif 'ManualExamination' in elec_data.keys():
    wm_elec_ix_manual = wm_elec_ix_manual + [ind for ind, data in elec_data['ManualExamination'].str.lower().items() if data=='wm' and elec_data['label'].str.lower()[ind] not in bad_channels]
    oob_elec_ix_manual = [ind for ind, data in elec_data['ManualExamination'].str.lower().items() if data=='oob']

# Automatic labels from the gray/white matter localization column
wm_elec_ix_auto = wm_elec_ix_auto + [ind for ind, data in elec_data['gm'].str.lower().items() if data=='white' and elec_data['label'].str.lower()[ind] not in bad_channels]
oob_elec_ix_auto = [ind for ind, data in elec_data['gm'].str.lower().items() if data=='unknown']

wm_elec_ix = np.unique(wm_elec_ix_manual + wm_elec_ix_auto)
oob_elec_ix = np.unique(oob_elec_ix_manual + oob_elec_ix_auto)
all_ix = elec_data.index.values
gm_elec_ix = np.array([x for x in all_ix if x not in wm_elec_ix and x not in oob_elec_ix])

mislabeling in elec_df

In the condensed notebook, the following line is used to replace the monopolar labels in elec_df with the referenced labels in your mne data .ch_names:

elec_df['label'] = epochs_all_subjs_all_evs[subj_id][event].ch_names

As noted by @shawnrhoads, this has bug potential if the .ch_names are not in the same order as the elec_df labels!
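
A hedged guard against that, assuming the referenced names keep the original anode label as a '-'-separated prefix (e.g. 'A1-A2'; adjust if the naming differs):

ch_names = epochs_all_subjs_all_evs[subj_id][event].ch_names
assert len(ch_names) == len(elec_df), 'channel count mismatch'
# Check that each referenced name lines up with its monopolar row before overwriting
assert all(cn.split('-')[0] == lab for cn, lab in zip(ch_names, elec_df['label'])), \
    'ch_names are not in elec_df row order'
elec_df['label'] = ch_names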

nlx_utils throws a lot of warnings while parsing header and datetime from .ncs files

I don't know the implications of this yet, but worth investigating

Lines 76-82

    # Try to read the original file path
    try:
        assert hdr_lines[1].split()[1:3] == ['File', 'Name']
        hdr[u'FileName']  = ' '.join(hdr_lines[1].split()[3:])
        # hdr['save_path'] = hdr['FileName']
    except:
        warnings.warn('Unable to parse original file path from Neuralynx header: ' + hdr_lines[1])

Lines 90-98

    # Read the parameters, assuming "-PARAM_NAME PARAM_VALUE" format
    for line in hdr_lines[4:]:
        try:
            name, value = line[1:].split()  # Ignore the dash and split PARAM_NAME and PARAM_VALUE
            hdr[name] = value
        except:
            warnings.warn('Unable to parse parameter line from Neuralynx header: ' + line)

    return hdr

Lines 128-139

    # Parse a datetime object from the idiosyncratic time string in Neuralynx file headers
    try:
        tmp_date = [int(x) for x in time_string.split()[4].split('/')]
        tmp_time = [int(x) for x in time_string.split()[-1].replace('.', ':').split(':')]
        tmp_microsecond = tmp_time[3] * 1000
    except:
        warnings.warn('Unable to parse time string from Neuralynx header: ' + time_string)
        return None
    else:
        return datetime.datetime(tmp_date[2], tmp_date[0], tmp_date[1],  # Year, month, day
                                 tmp_time[0], tmp_time[1], tmp_time[2],  # Hour, minute, second
                                 tmp_microsecond)

add site-specificity to data loading code

Sometimes we record across different hospitals, which have different naming conventions, etc. Either there should be one universal solution for loading these data (lol), or we should pass the site as an input to the data-loading functions, with if conditions to do the site-specific data wrangling.
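
One possible shape for the second option, with placeholder site names (everything here is hypothetical):

def load_neural_data(load_path, site):
    # Branch on site-specific naming conventions in one place
    if site == 'site_a':
        ...  # e.g. strip a site-specific channel-name prefix
    elif site == 'site_b':
        ...  # e.g. handle a different file layout
    else:
        raise ValueError(f'Unknown recording site: {site}')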

mni x y z coordinates as float to calculate wm electrode distance

wm_elec_dist = np.linalg.norm(elec_data.loc[wm_elec_ix, ['x', 'y', 'z']].values.astype(float) - elec_loc, axis=1)

The coordinate columns can be read from the spreadsheet as object/string dtype, so the distance calculation needs an explicit float cast. Use the line above on line 250 of LFPAnalysis/lfp_preprocess_utils.py instead of:

wm_elec_dist = np.linalg.norm(elec_data.loc[wm_elec_ix, ['x', 'y', 'z']].values - elec_loc, axis=1)

'mne.io' has no attribute 'fif'

Bug detected in the detect_IEDs function of lfp_preprocess_utils.py, line 567: elif type(mne_data) == mne.io.fif.raw.Raw

elif type(mne_data) == mne.io.fif.raw.Raw:
    data_type = 'continuous'
    n_times = mne_data._data.shape[1]

AttributeError: module 'mne.io' has no attribute 'fif'

This can be fixed by changing the line to elif type(mne_data) == mne.io.Raw.
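
As an aside (not from the original issue), an isinstance() check against the documented base class would also sidestep private module paths and catch all raw types:

elif isinstance(mne_data, mne.io.BaseRaw):
    data_type = 'continuous'
    n_times = mne_data._data.shape[1]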

Errors with electrode name matching

Sometimes the electrode names from the .edf file do not match those from the electrode label sheet. We use Levenshtein distance to mitigate this, but it sometimes results in 'ties' between the correct match and another, similarly named electrode.
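
A minimal sketch of explicit tie handling, assuming the python-Levenshtein package is available (the function and argument names here are hypothetical):

from Levenshtein import distance

def match_electrode_name(name, candidates):
    # Edit distance from the .edf name to every label-sheet candidate
    dists = [distance(name.lower(), c.lower()) for c in candidates]
    best = min(dists)
    ties = [c for c, d in zip(candidates, dists) if d == best]
    if len(ties) > 1:
        # Surface the ambiguity for manual resolution instead of silently picking one
        raise ValueError(f'Ambiguous match for {name}: {ties}')
    return ties[0]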

re-factor pre-processing and artifact detection

I want to change up the way we pre-process the data and detect IEDs, to a) improve data quality, b) reduce data rejection rates, and c) minimize inter-researcher variability from manual decision-making. To that end, I want to:

  1. Remove manual channel rejection prior to re-referencing
  2. Only perform artifact identification and rejection prior to epoching + baselining
  3. Revise IED detection to a higher z-score threshold (4->5)
  4. Add changepoint detection in the raw data with the same z-score threshold (z=5)
  5. NaN out the 100 ms before and after each artifact detected (see the sketch after this list)
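
A sketch of what step 5 could look like, with hypothetical names (data as an (n_channels, n_times) float array, sr as the sampling rate in Hz, artifact_ix as per-channel sample indices of detected artifacts):

import numpy as np

def nan_around_artifacts(data, ch, artifact_ix, sr, pad_ms=100):
    # Mask pad_ms of data on either side of each detected artifact with NaN
    pad = int(pad_ms / 1000 * sr)
    for ix in artifact_ix:
        data[ch, max(ix - pad, 0):min(ix + pad, data.shape[1])] = np.nan
    return data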

References:

#3 - #5

Add better support for continuous, non-epoched data analysis

In particular, people may want to perform sliding window analyses of resting state data. In this case, my favorite approach would be to use MNE's built-in tool for making fixed-length epochs (e.g. 500 ms) with an overlap/step size (e.g. 250 ms), like so:

epochs = mne.make_fixed_length_epochs(raw=raw, duration=0.5, overlap=0.25)

This is pretty powerful as it can basically turn any epoched analysis we perform in this repo into a sliding window analysis of longer stretches of data!

Some bugs in `make_mne()` for nlx format

I ran into a few issues trying to load the .ncs files from neuralynx into an mne object. Attaching my notes and potential solutions.

1. lfp, sr, and ch_name are not defined on line 652:

lfp.append(fdata['data'])
sr.append(fdata['sampling_rate'])
ch_name.append(str(ch_num))

Potential solution: add the following before the ncs_files loop, assuming lfp is a numpy array and the rest are lists (which is what mne.create_info() and mne.io.RawArray() require):

lfp = np.array([])
sr = []
ch_name = []

Then lines 652-654 should be replaced with something like (but see point 3 below for issue with lfp when channels are different sizes):

if lfp.size == 0:
    lfp = fdata['data']
else:
    lfp = np.vstack([lfp, fdata['data']])
sr.append(fdata['sampling_rate']) 
ch_name.append(str(ch_num))

2. Line 642 globs all the .ncs files in load_path, but these might not match the pattern needed for the ch_num assignment:

ch_num = int(chan_name[4:])

Potential solution 1: If ch_name should refer to files that match the pattern _0000.ncs to _9999.ncs (assuming this because the value is converted to an integer), then maybe add something like this in place of line 642:

import re  # add at the top of the module if it is not already imported

pattern = re.compile(r"_\d{4}\.ncs")  # regex pattern to match "_0000.ncs" through "_9999.ncs"
ncs_files = [x for x in glob(f'{load_path}/*.ncs') if re.search(pattern, x)]

If the case, then would also need to change line 646 to ch_num = int(chan_name[-4:])

Potential solution 2: If ch_name should refer to the base name of each file (without the .ncs extension), then add chan_name.replace(".ncs","") and remove the int() call.

3. mne.create_info() fails when there are multiple unique values in sr and ch_type is not defined:

info = mne.create_info(ch_name, np.unique(sr), ch_type)
mne_data = mne.io.RawArray(lfp, info)

Should ch_type='seeg' here?

Also, when there are multiple unique values in sr, appending to lfp fails in the loop because axis dimensions must be consistent.
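
Relatedly, mne.create_info() expects a single sampling frequency, so a guard along these lines (a sketch reusing the same variable names) would fail loudly on mixed rates instead of cryptically:

sfreqs = np.unique(sr)
if len(sfreqs) > 1:
    # Resample to a common rate (or bail out) before building the Raw object
    raise ValueError(f'Channels have mixed sampling rates: {sfreqs}')
info = mne.create_info(ch_names=ch_name, sfreq=float(sfreqs[0]), ch_types='seeg')
mne_data = mne.io.RawArray(lfp, info)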

lfp_utils.ref_mne only checks for "Manual Examination" not "ManualExamination"

There are inconsistencies in our anatomical reconstruction naming conventions: some people name their manual examination column "Manual Examination" and others "ManualExamination", with no space. Currently the re-referencing function only accounts for the column label WITH a space. It needs to accommodate both for the WM, OOB, and YBA labels that are manually input by Saez lab members.
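
One hedged way to accept both spellings is to normalize the column name once, up front (elec_data as in the repo's other snippets):

if 'ManualExamination' in elec_data.columns:
    # Normalize to the spaced spelling so downstream checks need only one key
    elec_data = elec_data.rename(columns={'ManualExamination': 'Manual Examination'})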

Add example data to repo

The Condensed Analysis notebook, which should serve as a tutorial for new users, currently relies on my data paths. It should instead use sample data that exists in the repo.

`sync_utils.pulsealign()` bug using `Tuple[np.ndarray, np.ndarray]`

Importing sync_utils with Python 3.10.8 throws this error:

[autoreload of LFPAnalysis.sync_utils failed: Traceback (most recent call last):
  File "/sc/arion/work/rhoads01/envs/lfp_env/lib/python3.10/site-packages/IPython/extensions/autoreload.py", line 261, in check
    superreload(m, reload, self.old_objects)
  File "/sc/arion/work/rhoads01/envs/lfp_env/lib/python3.10/site-packages/IPython/extensions/autoreload.py", line 459, in superreload
    module = reload(module)
  File "/sc/arion/work/rhoads01/envs/lfp_env/lib/python3.10/importlib/__init__.py", line 169, in reload
    _bootstrap._exec(spec, module)
  File "<frozen importlib._bootstrap>", line 619, in _exec
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/sc/arion/projects/guLab/Shawn/tools/LFPAnalysis/LFPAnalysis/sync_utils.py", line 23, in <module>
    def pulsealign(beh_ms: np.ndarray, pulses: np.ndarray, windSize: int = 30) -> Tuple[np.ndarray, np.ndarray]:
NameError: name 'Tuple' is not defined
]

I believe this is because Tuple is never imported from typing, so the capitalized name is undefined when the annotation is evaluated:

def pulsealign(beh_ms: np.ndarray, pulses: np.ndarray, windSize: int = 30) -> Tuple[np.ndarray, np.ndarray]:

Changing to the lowercase builtin tuple (subscriptable since Python 3.9) removes the error.

def pulsealign(beh_ms: np.ndarray, pulses: np.ndarray, windSize: int = 30) -> tuple[np.ndarray, np.ndarray]:
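
The other standard fix keeps the capitalized annotation and also works on Pythons older than 3.9:

from typing import Tuple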

%matplotlib notebook plots not rendering in VScode

%matplotlib notebook does not render the plot for omitting bad channels (see the lines below) when the notebook is run in VS Code.

%matplotlib notebook
fig = mne_data.plot(start=0, duration=120, n_channels=8, scalings=mne_data._data.max()/20)
fig.fake_keypress('a')
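
VS Code's notebook UI does not ship the nbagg backend that %matplotlib notebook requires. Assuming ipympl is installed in the environment, one possible workaround is the widget backend (or %matplotlib qt for an external window):

%matplotlib widget
fig = mne_data.plot(start=0, duration=120, n_channels=8, scalings=mne_data._data.max()/20)
fig.fake_keypress('a')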

Grabbing data from `pairwise_connectivity` object in `oscillation_utils.compute_surr_connectivity_time`

When n_pairs == 1, the coherence value should be grabbed with [:, 2] indexing (see the replacement below) instead of the current implementation, which uses:

n_pairs = len(indices[0])
if n_pairs == 1:
    pairwise_connectivity = pairwise_connectivity.reshape((pairwise_connectivity.shape[0], n_pairs))   

which could be replaced with:

n_pairs = len(indices[0])
if n_pairs == 1:
    pairwise_connectivity = pairwise_connectivity[:,2]

This is because np.squeeze(Connectivity_object.get_data())[:, 0] yields all zero values; the zeros are caused specifically by the [:, 0] indexing, because .get_data() outputs all the cells of the coherence matrix.

Maybe use Connectivity_object.get_data(output='dense') to check what should be used.
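
For example, comparing the two output layouts side by side (a sketch; Connectivity_object as above) would show which column actually holds the pair of interest:

dense = Connectivity_object.get_data(output='dense')  # full node-by-node matrix
raveled = Connectivity_object.get_data()              # default flattened layout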
