guruucsd / lateralized-components
Submission to OHBM 2016 on functional lateralization using the NeuroVault dataset.
First, the centroid of some components shifted position relative to their whole-brain counterparts.
At this point this is purely observational. I think we should quantify it somehow, but do you know how best we can do it, @bcipolli? I guess we can…?
From @atsuch:
I also want to try a different way of R-L matching: rather than finding a spatial match by directly comparing R and L components as currently implemented, do R- and L- matching through wb components.
We backed this out due to query runtimes, but it sounds like this problem has been solved. Let's get back on the latest code, so we can help QA it and get it into nilearn!
I was testing my code and encountered the following:
File "main.py", line 365, in
components=components, query_server=query_server, plot=plot, **args)
File "main.py", line 254, in loop_main_and_plot
query_server=query_server)
File "main.py", line 60, in get_dataset
images, term_scores = fetch_neurovault(max_images=max_images, **kwargs)
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/nilearn_ext/datasets.py", line 174, in fetch_neurovault
images, term_scores = _neurovault_remove_bad_images(images, term_scores)
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/nilearn_ext/datasets.py", line 98, in _neurovault_remove_bad_images
is_rejected = np.all(dat > 0) or np.all(dat < 0) or (np.unique(dat).size < 2000)
RuntimeWarning: invalid value encountered in greater
Since I already have ICA images, I don't really need to fetch new images unless I want to re-calculate the ICA with a newer dataset, but currently we fetch the dataset regardless. Maybe we should change that... :P But anyhow, datasets.py needs a fix, it seems.
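The RuntimeWarning suggests NaNs in the image data (common outside the brain mask), which poison the `>` / `<` comparisons. A minimal sketch of how the rejection test in `_neurovault_remove_bad_images` (name taken from the traceback; the 2000-unique-values threshold is copied from the line shown above) could be made NaN-safe:

```python
import numpy as np

def is_bad_image(dat, min_unique=2000):
    """Reject an image if it is all-positive, all-negative, or nearly constant.

    NaN/inf voxels are dropped first, so the comparisons no longer emit
    'RuntimeWarning: invalid value encountered in greater'.
    """
    finite = dat[np.isfinite(dat)]
    if finite.size == 0:
        return True  # nothing usable in the image
    return bool(np.all(finite > 0) or np.all(finite < 0)
                or np.unique(finite).size < min_unique)
```

The key change is filtering to finite values before any comparison, rather than suppressing the warning.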
From @atsuch: add R-, L-, and RL-concat to the wb comparison, overlaid on top of each other, in ps.py.
Is our ability to match components good or bad? We can test by making random cuts of the dataset, then seeing our ability to match (and the average value of the match dissimilarity metric).
Initial design: an MniMasker that selects a given # of MNI voxels randomly, plus a match-random.py script (or match.py with a --random flag). We compute a matrix of similarities between right and left (or right and both, etc.). It would be great to visualize the matrix (and label each row/column with relevant terms).
Similar to the bokeh-based matrix I did here: https://github.com/guruucsd/PING/blob/ec91ebf58103555810ea3396d522a25c7ef4a26d/ping/ping/utils/plotting.py#L101
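A sketch of the similarity-matrix step, assuming components are row vectors of shape (n_components, n_voxels) and that the L components have already been flipped into the same hemisphere space; Pearson correlation is used here as one plausible similarity score:

```python
import numpy as np

def similarity_matrix(comps_a, comps_b):
    """Correlation between every pair of components from two decompositions.

    comps_a, comps_b: arrays of shape (n_components, n_voxels), e.g. R
    components and L components flipped into the same space. Entry [i, j]
    is the Pearson r between comps_a[i] and comps_b[j].
    """
    a = comps_a - comps_a.mean(axis=1, keepdims=True)
    b = comps_b - comps_b.mean(axis=1, keepdims=True)
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T
```

The resulting matrix is what the bokeh/heatmap visualization would display, with term labels on the rows and columns.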
I was working with 20 components, and found that there are a few components (1 for R, another for L) with a very few voxels having extreme values, resulting in an overall dark image. This one example has a max value of 0.43, when other images have max values around 0.02. Any idea why we get this, @bcipolli??
We were originally looking at sparsity, defined as the # of voxels above an arbitrary threshold, to compare WB vs hemi components. If there is no asymmetric activity, contrast in WB and hemi components should be similar. In other words, if we compare R hemi in WB components vs R components, the # of voxels above a given threshold should be similar between the two. Increased contrast (i.e. higher # of voxels) in hemi components suggests 'masking' of lateralized activity in WB analysis.
There were two issues with this. 1) We have to normalize the component maps to make scales comparable across component maps. This was suggested by Tomoki and we implemented it. Good. But 2) (also suggested by Tomoki) the voxel count isn't a conventional way to measure sparsity, and it's affected by the choice of threshold. He suggested using the L1 norm instead.
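To make the two candidate measures concrete, here is a minimal sketch of both on a normalized component map (the threshold value is an arbitrary placeholder, not the one used in the analysis):

```python
import numpy as np

def voxel_count_sparsity(comp, threshold):
    """# of voxels whose absolute value exceeds the threshold.
    Threshold-dependent, which is Tomoki's objection."""
    return int(np.sum(np.abs(comp) > threshold))

def l1_sparsity(comp):
    """L1 norm: sums *all* voxel magnitudes, so no threshold is needed,
    but the many near-zero voxels contribute too."""
    return float(np.sum(np.abs(comp)))
```

The difference discussed below follows directly from these definitions: the L1 norm is dominated by the bulk of small values, while the voxel count only sees the tails.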
So I was implementing this. But actually looking at the distribution of values in the normalized ICA component images, I started to think that L1 sparsity and voxel-count sparsity give different information, and it matters which one I use, since I use sparsity to look at two other aspects of our data.
One was the difference in sparsity on the positive and negative sides: although signs in ICA components are meaningless, some components have activation mostly on one side, while others have activation on both sides more equally. In terms of the brain, this can signify an anti-correlated network, where activation in one region is coupled with deactivation in another. This, I think, is an interesting aspect of our analysis using NeuroVault, where we have whole-brain maps as our input, in contrast with similar activation meta-analyses based on peak activation coordinates, as in BrainMap.
Second was what I called HPI, but maybe we should rename it the hemispheric participation asymmetry index (HPAI), since it's really an AI. It's the difference in activation patterns between the two hemispheres (for WB analysis), and I was calculating it as (R-L)/(R+L) using the voxel-count method. Because I was interested in the HPAI of anti-correlated networks for those components with mixed positive and negative activations, I was calculating this separately for pos, neg, and abs sparsity.
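The HPAI described above, sketched with the voxel-count method; the hemisphere masks and threshold here are placeholders, and the pos/neg/abs split mirrors the three variants mentioned:

```python
import numpy as np

def hpai(comp, r_mask, l_mask, threshold, sign="abs"):
    """Hemispheric participation asymmetry index, (R - L) / (R + L),
    using supra-threshold voxel counts.

    sign selects which voxels count: 'pos' (> threshold),
    'neg' (< -threshold), or 'abs' (either tail).
    """
    if sign == "pos":
        active = comp > threshold
    elif sign == "neg":
        active = comp < -threshold
    else:
        active = np.abs(comp) > threshold
    r = np.sum(active & r_mask)
    l = np.sum(active & l_mask)
    return (r - l) / float(r + l) if (r + l) else 0.0
```

By construction the index runs from -1 (all supra-threshold voxels on the left) to +1 (all on the right).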
To show you how the two methods of sparsity give a different picture, especially for detecting components with an anti-correlated network, here I show the value distribution of one typical WB component image (n_components=20).
You can see that, like a typical stat map used as an input image, it has a long tail on the positive side. If I use either a 90th- or 95th-percentile threshold based on the absolute values of all 20 ICA map images (this histogram shows just one of the 20), the voxel count is clearly larger on the positive than the negative side. But L1 sparsity for positive and negative values is almost identical, and this was true for all the other component images. The voxel count, on the other hand, showed varying degrees of asymmetry (i.e. we would see more warm-color voxels than cool-color voxels when plotting this component).
I think for the purpose of detecting components with anti-correlated networks, the voxel count is better, since L1 adds up the values of the large number of voxels around zero, which we don't really care about. I'm inclined to use the voxel count for HPAI as well. As I'm writing this, I realized this is identical to the problems with Laterality Index (LI) calculation in task-based fMRI studies. Traditionally people used the voxel-count asymmetry between the two hemispheres at a given statistical threshold, but others pointed out how it was affected by the choice of threshold, and suggested more threshold-free methods. I have to find the paper describing the threshold-free calculation of LI, but for the time being I'm thinking of sticking to our original method of counting voxels above an arbitrary threshold. To compare sparsity/HPAI across different values of n_components, we should pick the same threshold across them. Looking at this distribution, 0.000005 seems OK, but maybe I can use the 90th or 95th percentile of all the absolute values across all the component images (components = 5, 10, 15, 20...).
At least on my machine, I haven't been able to test my new ps.py, since my Python always crashes before completing all the component numbers and matching methods. @bcipolli, any idea why this might be the case?
I'm wondering if running get_dataset for every iteration of main.py is at least part of the problem? If so, I'm thinking of adding an option to pass images and term_scores to main.py, rather than always fetching them inside main.py. What do you think?
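One low-effort alternative to plumbing the data through: fetch once and cache the pickled result on disk, then let every later run reload the cache. A sketch (the cache path and the wrapping of `get_dataset` are assumptions, not existing code):

```python
import os
import pickle

def cached(fetch_fn, cache_path):
    """Run fetch_fn once and reuse its pickled result on later calls.

    fetch_fn: zero-argument callable, e.g. a lambda wrapping get_dataset.
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    result = fetch_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(result, f)
    return result
```

This keeps main.py's interface unchanged while ensuring the expensive fetch only happens once per machine.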
Right now we dump these all to files, but doing the same thing in an interactive matrix plot would be not-so-hard and way more intuitive / informative.
Analysis.py has way too much code, most of it very complex. Simply breaking it into smaller files will help a lot with understandability.
Interactive plots are nice, but I think we need some way of summarizing the distinct laterality profiles of different brain regions. i.e. Which brain regions/networks consistently show evidence for symmetric/asymmetric functional organization? What are the lateralized functions and lateralized networks of the brain?
One way I can think of doing this (there may be others) is to get a best-match score map (and possibly other measures... I have to think a bit) by averaging the n maps, each the product of the best-match score and the component map for each ICA component. Then use an existing functional network atlas (Yeo, Power?) to get a summary of matching performance per network... does that make any sense?
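A sketch of that per-network summary, assuming we already have the averaged score map and an atlas label image (e.g. Yeo networks) resampled to component space; plain flattened arrays stand in for the NIfTI images here:

```python
import numpy as np

def network_summary(score_map, atlas_labels):
    """Mean best-match score within each atlas network.

    score_map: 1-D array, e.g. the average over components of
    (best-match score * component map), flattened to voxels.
    atlas_labels: 1-D int array of the same shape; 0 = background.
    Returns {network_label: mean score}.
    """
    return {int(lab): float(score_map[atlas_labels == lab].mean())
            for lab in np.unique(atlas_labels) if lab != 0}
```

A low per-network mean would then flag networks whose wb components consistently match their hemispheric counterparts poorly.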
If you can think of a better way of summarizing things so we can infer something about specific brain networks from our analyses, do let me know @bcipolli !
Finally, our unilateral components had greater contrasts than the whole-brain counterparts. This again suggests that bilateral analysis leads to asymmetric activity being treated as “residual” to bilateral activity.
@bcipolli, you mentioned that you were testing some code to check the inference you were making... I assume it's about this part of our claim?
This is rather a fine point, but the hemi-maskers are not of equal size; I think it's because the R-hemi mask includes some voxels on the mid slice, and the L-hemi mask doesn't.
This means that when we compare the spatial similarity of R and L by flipping one image, the results will differ slightly based on the choice of hemisphere masks. I know that the contribution of those mid voxels is minor, but it does affect matching, etc. It might also influence the ICA results slightly, depending on how much those voxels contribute to the components.
Given that we don't know which hemisphere they actually belong to, I'm inclined to remove those mid-slice voxels from both hemi-maskers/masks, but what do you think @bcipolli?
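Dropping the midline from both masks could look like this sketch, operating on the mask data arrays; the midline index `mid_x` is assumed to be derivable from the image affine (that lookup is omitted here):

```python
import numpy as np

def drop_midline(r_mask_data, l_mask_data, mid_x):
    """Zero out the mid-sagittal voxel column in both hemisphere masks,
    so neither hemisphere claims voxels of ambiguous laterality.

    mid_x: the i-index of the x=0 plane.
    """
    r_out, l_out = r_mask_data.copy(), l_mask_data.copy()
    r_out[mid_x, :, :] = False
    l_out[mid_x, :, :] = False
    return r_out, l_out
```

After this, the two masks cover disjoint, mirror-symmetric voxel sets, so flip-based R/L comparisons no longer depend on which mask the midline was assigned to.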
Should be easy enough to toggle, just need to make sure there are no errors/issues with having no terms....
Loading data and then comparing integer matrices is slow. Instead, compute an md5 sum/hash, store it in the image metadata, and use that for comparison.
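A sketch of the hashing step (storing the digest on the image metadata is left out; this just shows computing and comparing digests over voxel data):

```python
import hashlib
import numpy as np

def image_hash(data):
    """md5 digest of an image's voxel data; cheap to store in metadata
    and to compare, unlike the full integer matrices."""
    return hashlib.md5(np.ascontiguousarray(data).tobytes()).hexdigest()
```

Two images are then duplicates exactly when their digests match, so the expensive voxel-wise comparison is replaced by a 32-character string comparison.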
Allow re-using matches. For example, "both" has left- and right-finger movements as separate components. We get the RL components and match them, but then can only match up with one of the two "both" components.
Restricting each component to a single use also affects lateralized functions like language: nothing matches, so it grabs some randomly "similar" component.
The language issue also affects the similarity score. Since that RL match is not compelling, it will decrease the similarity score to the language component in "both". So perhaps compute similarity between RL and both by matching R to both:R and L to both:L, allowing reuse of matches (and then visualizing any components never chosen as the best match).
As for allowing reuse of matches, I thought about it too, I just hadn't gotten around to discussing it with you. Whether we do comparisons your way (RL to both) or my way (R- or L-only to both), I wasn't sure if we should force one-to-one matching for all the components. We are hypothesizing that some wb components will have better matches than others, depending on how much interhemispheric interactions affect the wb component. That's why I wanted to be able to compare match scores not just by rows but across the whole matrix.
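The two matching policies being debated, sketched side by side on a similarity matrix (higher = more similar; function names are my own, not from the codebase):

```python
import numpy as np

def match_with_reuse(sim):
    """Each row takes its best column; columns may serve several rows.
    Columns never chosen can then be visualized separately."""
    return np.argmax(sim, axis=1)

def match_one_to_one_greedy(sim):
    """Greedy unique matching: repeatedly take the best remaining
    (row, col) pair, so each column is consumed once matched."""
    sim = sim.astype(float).copy()
    n = sim.shape[0]
    out = np.full(n, -1)
    for _ in range(n):
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        out[i] = j
        sim[i, :] = -np.inf  # row i is matched
        sim[:, j] = -np.inf  # column j is consumed
    return out
```

On a matrix like [[0.9, 0.8], [0.85, 0.1]] the policies diverge: with reuse both rows pick column 0, while one-to-one forces row 1 onto the weak 0.1 match, which is exactly the language-component scenario above.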
Updated images from qc.py show many duplicate images with different image ids/metadata. We need to somehow filter out duplicates so that they don't dominate the ICA components. E.g. collections 410, 1886 (and probably many more).
The easiest way may be to use automatically-generated NV metadata, such as brain_coverage and perc_bad_voxels (and perc_voxels_outside, although this is missing in some images)... but this might not work: when I checked the # of unique combinations of these three in the NV metadata, there were only about 2000 of them, when there are ~9000 unique images.
@bcipolli, can you tackle this problem?
@atsuch likes using "wb" for whole brain components. I'm game. But if we change our vocabulary, we should change the code!
When using hemi_mask.fit in nilearn_ext/masking.py, I get a TypeError:
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/image_analysis/sparsity.py", line 60, in get_hemi_sparsity
gm_mask = get_mask_by_key(hemi)
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/nilearn_ext/masking.py", line 131, in get_mask_by_key
gm_imgs = split_bilateral_rois(gm_img)
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/nilearn_ext/masking.py", line 103, in split_bilateral_rois
hemi_mask.fit(map_img)
File "/home/ami/Documents/Work/imaging/RL_ICA/OHBM-2016/nilearn_ext/masking.py", line 195, in fit
hemi_mask_data[other_hemi_slice] = False
TypeError: slice indices must be integers or None or have an index method
I vaguely remember getting a deprecation warning about this before... but anyway, I wanted to check with you, @bcipolli.
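That TypeError usually means a slice bound became a float; numpy turned float indices from a DeprecationWarning into a hard error, which matches the deprecation warning remembered above. A sketch of the likely fix, assuming `other_hemi_slice` is built by halving the x-dimension (the function below is illustrative, not the actual masking.py code):

```python
import numpy as np

def other_hemi_slice(n_x, hemi):
    """Build the x-axis slice covering the *other* hemisphere.

    n_x / 2 is a float in Python 3; use integer division so the slice
    bounds satisfy numpy's integer-index requirement."""
    mid = n_x // 2  # not n_x / 2, which makes slicing raise TypeError
    return slice(0, mid) if hemi == "R" else slice(mid, n_x)

# zeroing the other hemisphere then works without a TypeError:
mask = np.ones((10, 4, 4), dtype=bool)
mask[other_hemi_slice(10, "R"), :, :] = False
```

If the bound comes from somewhere else (an affine computation, say), wrapping it in `int()` has the same effect.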
This was suggested by @guruucsd - combine all hemis to one side, then do ICA to sop up common components, and see what's left.