guruucsd / lateralized-components
Submission to OHBM 2016 on functional lateralization using the NeuroVault dataset.
First, the centroid of some components shifted position relative to their whole-brain counterparts.
At this point this is purely observational. I think we should quantify it somehow, but do you know how best we can do it, @bcipolli? I guess we can…?
From @atsuch:
I also want to try a different way of R-L matching: rather than finding a spatial match by directly comparing R and L components as currently implemented, do R- and L- matching through wb components.
We backed this out due to query runtimes, but it sounds like this problem has been solved. Let's get back on the latest code, so we can help QA it and get it into nilearn!
I was testing my code and encountered the following:
File "main.py", line 365, in
components=components, query_server=query_server, plot=plot, **args)
File "main.py", line 254, in loop_main_and_plot
query_server=query_server)
File "main.py", line 60, in get_dataset
images, term_scores = fetch_neurovault(max_images=max_images, **kwargs)
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/nilearn_ext/datasets.py", line 174, in fetch_neurovault
images, term_scores = _neurovault_remove_bad_images(images, term_scores)
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/nilearn_ext/datasets.py", line 98, in _neurovault_remove_bad_images
is_rejected = np.all(dat > 0) or np.all(dat < 0) or (np.unique(dat).size < 2000)
RuntimeWarning: invalid value encountered in greater
Since I already have ICA images, I don't really need to fetch new images unless I want to re-calculate the ICA with a newer dataset, but currently we fetch the dataset regardless. Maybe we should change that... :P But anyhow, datasets.py needs a fix, it seems.
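The RuntimeWarning suggests NaNs in the image data (common outside the brain mask), which poison the `>` / `<` comparisons. A minimal sketch of how the rejection test in `_neurovault_remove_bad_images` (name taken from the traceback; the 2000-unique-values threshold is copied from the line shown above) could be made NaN-safe:

```python
import numpy as np

def is_bad_image(dat, min_unique=2000):
    """Reject an image if it is all-positive, all-negative, or nearly constant.

    NaN/inf voxels are dropped first, so the comparisons no longer emit
    'RuntimeWarning: invalid value encountered in greater'.
    """
    finite = dat[np.isfinite(dat)]
    if finite.size == 0:
        return True  # nothing usable in the image
    return bool(np.all(finite > 0) or np.all(finite < 0)
                or np.unique(finite).size < min_unique)
```

The key change is filtering to finite values before any comparison, rather than suppressing the warning.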
From @atsuch: add R-, L-, and RL-concat to the wb comparison, overlaid on top of each other, in ps.py.
Is our ability to match components good or bad? We can test by making random cuts of the dataset, then seeing our ability to match (and the average value of the match dissimilarity metric).
Initial design: an MniMasker that selects a given # of MNI voxels randomly, plus a match-random.py script (or match.py with a --random flag). We compute a matrix of similarities between right and left (or right and both, etc.). It would be great to visualize the matrix (and label each row/column with relevant terms).
Similar to the bokeh-based matrix I did here: https://github.com/guruucsd/PING/blob/ec91ebf58103555810ea3396d522a25c7ef4a26d/ping/ping/utils/plotting.py#L101
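A sketch of the similarity-matrix step, assuming components are row vectors of shape (n_components, n_voxels) and that the L components have already been flipped into the same hemisphere space; Pearson correlation is used here as one plausible similarity score:

```python
import numpy as np

def similarity_matrix(comps_a, comps_b):
    """Correlation between every pair of components from two decompositions.

    comps_a, comps_b: arrays of shape (n_components, n_voxels), e.g. R
    components and L components flipped into the same space. Entry [i, j]
    is the Pearson r between comps_a[i] and comps_b[j].
    """
    a = comps_a - comps_a.mean(axis=1, keepdims=True)
    b = comps_b - comps_b.mean(axis=1, keepdims=True)
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T
```

The resulting matrix is what the bokeh/heatmap visualization would display, with term labels on the rows and columns.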
I was working with 20 components, and found that there are a few components (1 for R, another for L) with a very few voxels having extreme values, resulting in an overall dark image. This one example has a max value of 0.43, when other images have max values around 0.02. Any idea why we get this, @bcipolli??
We were originally looking at sparsity, defined as the # of voxels above an arbitrary threshold, to compare WB vs hemi components. If there is no asymmetric activity, contrast in WB and hemi components should be similar. In other words, if we compare R hemi in WB components vs R components, the # of voxels above a given threshold should be similar between the two. Increased contrast (i.e. higher # of voxels) in hemi components suggests 'masking' of lateralized activity in WB analysis.
There were two issues with this. 1) We have to normalize the component maps to make scales comparable across component maps. This was suggested by Tomoki and we implemented it. Good. But 2) (also suggested by Tomoki) the voxel count isn't a conventional way to measure sparsity, and it's affected by the choice of threshold. He suggested using the L1 norm instead.
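To make the two candidate measures concrete, here is a minimal sketch of both on a normalized component map (the threshold value is an arbitrary placeholder, not the one used in the analysis):

```python
import numpy as np

def voxel_count_sparsity(comp, threshold):
    """# of voxels whose absolute value exceeds the threshold.
    Threshold-dependent, which is Tomoki's objection."""
    return int(np.sum(np.abs(comp) > threshold))

def l1_sparsity(comp):
    """L1 norm: sums *all* voxel magnitudes, so no threshold is needed,
    but the many near-zero voxels contribute too."""
    return float(np.sum(np.abs(comp)))
```

The difference discussed below follows directly from these definitions: the L1 norm is dominated by the bulk of small values, while the voxel count only sees the tails.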
So I was implementing this. But actually looking at the distribution of values in the normalized ICA component images, I started to think that L1 sparsity and voxel-count sparsity give different information, and it matters which one I use, since I use sparsity to look at two other aspects of our data.
One was the difference in sparsity on the positive and negative sides: although signs in ICA components are meaningless, some components have activation mostly on one side, while others have activation on both sides more equally. In terms of the brain, this can signify an anti-correlated network, where activation in one region is coupled with deactivation in another. This, I think, is an interesting aspect of our analysis using NeuroVault, where we have whole-brain maps as our input, in contrast with similar activation meta-analyses based on peak activation coordinates, as in BrainMap.
Second was what I called HPI, but maybe we should rename it the hemispheric participation asymmetry index (HPAI), since it's really an AI. It's the difference in activation patterns between the two hemispheres (for WB analysis), and I was calculating it as (R-L)/(R+L) using the voxel-count method. Because I was interested in the HPAI of anti-correlated networks for those components with mixed positive and negative activations, I was calculating this separately for pos, neg, and abs sparsity.
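The HPAI described above, sketched with the voxel-count method; the hemisphere masks and threshold here are placeholders, and the pos/neg/abs split mirrors the three variants mentioned:

```python
import numpy as np

def hpai(comp, r_mask, l_mask, threshold, sign="abs"):
    """Hemispheric participation asymmetry index, (R - L) / (R + L),
    using supra-threshold voxel counts.

    sign selects which voxels count: 'pos' (> threshold),
    'neg' (< -threshold), or 'abs' (either tail).
    """
    if sign == "pos":
        active = comp > threshold
    elif sign == "neg":
        active = comp < -threshold
    else:
        active = np.abs(comp) > threshold
    r = np.sum(active & r_mask)
    l = np.sum(active & l_mask)
    return (r - l) / float(r + l) if (r + l) else 0.0
```

By construction the index runs from -1 (all supra-threshold voxels on the left) to +1 (all on the right).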
To show you how the two methods of sparsity give a different picture, especially for detecting components with an anti-correlated network, here I show the value distribution of one typical WB component image (n_components=20).
You can see that, like a typical stat map used as an input image, it has a long tail on the positive side. If I use either a 90th- or 95th-percentile threshold based on the absolute values of all 20 ICA map images (this histogram shows just one of the 20), the voxel count is clearly larger on the positive than the negative side. But L1 sparsity for positive and negative values is almost identical, and this was true for all the other component images. The voxel count, on the other hand, showed varying degrees of asymmetry (i.e. we would see more warm-color voxels than cool-color voxels when plotting this component).
I think for the purpose of detecting components with anti-correlated networks, the voxel count is better, since L1 adds up the values of the large number of voxels around zero, which we don't really care about. I'm inclined to use the voxel count for HPAI as well. As I'm writing this, I realized this is identical to the problems with Laterality Index (LI) calculation in task-based fMRI studies. Traditionally people used the voxel-count asymmetry between the two hemispheres at a given statistical threshold, but others pointed out how it was affected by the choice of threshold, and suggested more threshold-free methods. I have to find the paper describing the threshold-free calculation of LI, but for the time being I'm thinking of sticking to our original method of counting voxels above an arbitrary threshold. To compare sparsity/HPAI across different values of n_components, we should pick the same threshold across them. Looking at this distribution, 0.000005 seems OK, but maybe I can use the 90th or 95th percentile of all the absolute values across all the component images (components = 5, 10, 15, 20...).
At least on my machine, I haven't been able to test my new ps.py, since my Python always crashes before completing all the component numbers and matching methods. @bcipolli, any idea why this might be the case?
I'm wondering if running get_dataset for every iteration of main.py is at least part of the problem? If so, I'm thinking of adding an option to pass images and term_scores to main.py, rather than always fetching them inside main.py. What do you think?
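One low-effort alternative to plumbing the data through: fetch once and cache the pickled result on disk, then let every later run reload the cache. A sketch (the cache path and the wrapping of `get_dataset` are assumptions, not existing code):

```python
import os
import pickle

def cached(fetch_fn, cache_path):
    """Run fetch_fn once and reuse its pickled result on later calls.

    fetch_fn: zero-argument callable, e.g. a lambda wrapping get_dataset.
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    result = fetch_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(result, f)
    return result
```

This keeps main.py's interface unchanged while ensuring the expensive fetch only happens once per machine.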
Right now we dump these all to files, but doing the same thing in an interactive matrix plot would be not-so-hard and way more intuitive / informative.
Analysis.py has way too much code, most of it very complex. Simply breaking it into smaller files will help a lot with understandability.
Interactive plots are nice, but I think we need some way of summarizing the distinct laterality profiles of different brain regions. i.e. Which brain regions/networks consistently show evidence for symmetric/asymmetric functional organization? What are the lateralized functions and lateralized networks of the brain?
One way I can think of doing this (there may be others) is to get a best-match score map (and possibly other measures... I have to think a bit) by averaging the n maps, each the product of the best-match score and the component map for each ICA component. Then use an existing functional network atlas (Yeo, Power?) to get a summary of matching performance per network... does that make any sense?
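A sketch of that per-network summary, assuming we already have the averaged score map and an atlas label image (e.g. Yeo networks) resampled to component space; plain flattened arrays stand in for the NIfTI images here:

```python
import numpy as np

def network_summary(score_map, atlas_labels):
    """Mean best-match score within each atlas network.

    score_map: 1-D array, e.g. the average over components of
    (best-match score * component map), flattened to voxels.
    atlas_labels: 1-D int array of the same shape; 0 = background.
    Returns {network_label: mean score}.
    """
    return {int(lab): float(score_map[atlas_labels == lab].mean())
            for lab in np.unique(atlas_labels) if lab != 0}
```

A low per-network mean would then flag networks whose wb components consistently match their hemispheric counterparts poorly.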
If you can think of a better way of summarizing things so we can infer something about specific brain networks from our analyses, do let me know @bcipolli !
Finally, our unilateral components had greater contrasts than the whole-brain counterparts. This again suggests that bilateral analysis leads to asymmetric activity being treated as “residual” to bilateral activity.
@bcipolli, you mentioned that you were testing some code to check the inference you were making... I assume it's about this part of our claim?
This is rather a fine point, but the hemi-maskers are not of equal size; I think it's because the R-hemi mask includes some voxels on the mid slice, and the L-hemi mask doesn't.
This means that when we compare the spatial similarity of R and L by flipping one image, the results will differ slightly based on the choice of hemisphere masks. I know that the contribution of those mid voxels is minor, but it does affect matching, etc. It might also influence the ICA results slightly, depending on how much those voxels contribute to the components.
Given that we don't know which hemisphere they actually belong to, I'm inclined to remove those mid-slice voxels from both hemi-maskers/masks, but what do you think @bcipolli?
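Dropping the midline from both masks could look like this sketch, operating on the mask data arrays; the midline index `mid_x` is assumed to be derivable from the image affine (that lookup is omitted here):

```python
import numpy as np

def drop_midline(r_mask_data, l_mask_data, mid_x):
    """Zero out the mid-sagittal voxel column in both hemisphere masks,
    so neither hemisphere claims voxels of ambiguous laterality.

    mid_x: the i-index of the x=0 plane.
    """
    r_out, l_out = r_mask_data.copy(), l_mask_data.copy()
    r_out[mid_x, :, :] = False
    l_out[mid_x, :, :] = False
    return r_out, l_out
```

After this, the two masks cover disjoint, mirror-symmetric voxel sets, so flip-based R/L comparisons no longer depend on which mask the midline was assigned to.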
Should be easy enough to toggle, just need to make sure there are no errors/issues with having no terms....
Loading data and then comparing integer matrices is slow. Instead, compute an md5 sum/hash, store it in the image metadata, and use that for comparison.
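A sketch of the hashing step (storing the digest on the image metadata is left out; this just shows computing and comparing digests over voxel data):

```python
import hashlib
import numpy as np

def image_hash(data):
    """md5 digest of an image's voxel data; cheap to store in metadata
    and to compare, unlike the full integer matrices."""
    return hashlib.md5(np.ascontiguousarray(data).tobytes()).hexdigest()
```

Two images are then duplicates exactly when their digests match, so the expensive voxel-wise comparison is replaced by a 32-character string comparison.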
Allow re-using matches. For example, "both" has left- and right-finger movements as separate components. We get the RL components and match them, but then can only match up with one of the two "both" components.
Restricting each component to a single use also affects lateralized functions like language: nothing matches, so it grabs some randomly "similar" component.
The language issue also affects the similarity score. Since that RL match is not compelling, it will decrease the similarity score to the language component in "both". So perhaps compute similarity between RL and both by matching R to both:R and L to both:L, allowing reuse of matches (and then visualizing any components never chosen as the best match).
As for allowing reuse of matches, I thought about it too, I just hadn't gotten around to discussing it with you. Whether we do comparisons your way (RL to both) or my way (R- or L-only to both), I wasn't sure if we should force one-to-one matching for all the components. We are hypothesizing that some wb components will have better matches than others, depending on how much interhemispheric interactions affect the wb component. That's why I wanted to be able to compare match scores not just by rows but across the whole matrix.
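The two matching policies being debated, sketched side by side on a similarity matrix (higher = more similar; function names are my own, not from the codebase):

```python
import numpy as np

def match_with_reuse(sim):
    """Each row takes its best column; columns may serve several rows.
    Columns never chosen can then be visualized separately."""
    return np.argmax(sim, axis=1)

def match_one_to_one_greedy(sim):
    """Greedy unique matching: repeatedly take the best remaining
    (row, col) pair, so each column is consumed once matched."""
    sim = sim.astype(float).copy()
    n = sim.shape[0]
    out = np.full(n, -1)
    for _ in range(n):
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        out[i] = j
        sim[i, :] = -np.inf  # row i is matched
        sim[:, j] = -np.inf  # column j is consumed
    return out
```

On a matrix like [[0.9, 0.8], [0.85, 0.1]] the policies diverge: with reuse both rows pick column 0, while one-to-one forces row 1 onto the weak 0.1 match, which is exactly the language-component scenario above.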
Updated images from qc.py show many duplicate images with different image ids/metadata. We need to somehow filter out duplicates so that they don't dominate the ICA components. E.g. collections 410, 1886 (and probably many more).
The easiest way may be to use automatically-generated NV metadata, such as brain_coverage and perc_bad_voxels (and perc_voxels_outside, although this is missing in some images)... but this might not work: when I checked the # of unique combinations of these three in the NV metadata, there were only about 2000 of them, when there are ~9000 unique images.
@bcipolli, can you tackle this problem?
@atsuch likes using "wb" for whole brain components. I'm game. But if we change our vocabulary, we should change the code!
When using hemi_mask.fit in nilearn_ext/masking.py, I get a TypeError:
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/image_analysis/sparsity.py", line 60, in get_hemi_sparsity
gm_mask = get_mask_by_key(hemi)
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/nilearn_ext/masking.py", line 131, in get_mask_by_key
gm_imgs = split_bilateral_rois(gm_img)
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/nilearn_ext/masking.py", line 103, in split_bilateral_rois
hemi_mask.fit(map_img)
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/nilearn_ext/masking.py", line 195, in fit
hemi_mask_data[other_hemi_slice] = False
TypeError: slice indices must be integers or None or have an index method
I vaguely remember getting a deprecation warning about this before... but anyway, I wanted to check with you, @bcipolli.
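That TypeError usually means a slice bound became a float; numpy turned float indices from a DeprecationWarning into a hard error, which matches the deprecation warning remembered above. A sketch of the likely fix, assuming `other_hemi_slice` is built by halving the x-dimension (the function below is illustrative, not the actual masking.py code):

```python
import numpy as np

def other_hemi_slice(n_x, hemi):
    """Build the x-axis slice covering the *other* hemisphere.

    n_x / 2 is a float in Python 3; use integer division so the slice
    bounds satisfy numpy's integer-index requirement."""
    mid = n_x // 2  # not n_x / 2, which makes slicing raise TypeError
    return slice(0, mid) if hemi == "R" else slice(mid, n_x)

# zeroing the other hemisphere then works without a TypeError:
mask = np.ones((10, 4, 4), dtype=bool)
mask[other_hemi_slice(10, "R"), :, :] = False
```

If the bound comes from somewhere else (an affine computation, say), wrapping it in `int()` has the same effect.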
This was suggested by @guruucsd - combine all hemis to one side, then do ICA to sop up common components, and see what's left.