Comments (14)
FYI @jacobf18 and I talked briefly today. He found an issue that he'll push up a PR for in the autocorrelation function.
from mne-icalabel.
@mscheltienne reposting here as an issue to begin discussion again.
Quick update: our intern should have started on the 1st of March. But he did not yet get his work permit (mandatory in Switzerland), thus his start has been delayed and we hope he will be on board next week.
Okay let us know when they are on board and we can schedule another time to chat.
I finally took the time to test it! It looks very promising but seems to heavily favor 'Other'. Am I correct in assuming that the output classes are in this order:
classes = ['Brain', 'Muscle', 'Eye', 'Heart', 'Line Noise', 'Channel Noise', 'Other']
Yes this order is correct.
So last time @jacobf18 and I spoke, we identified this issue as well. There are three main components to the ICLabel pipeline:
- generate features from data (features)
- format the features to be fed into the network (formatting)
- actually run the data through the network (network)
So according to Jacob, he verified that the network weights match those of EEGlab by running through various examples and confirming that the outputs match numerically. However, a unit test that verifies this would be nice.
He also manually tested the formatting part.
We unit tested the generation of features on simulation data + raw loaded EEGLab data and all are matching numerically with EEGlab except for `rpsd`, which has a random component to it, which makes it difficult to match up w/ EEGlab.
I believe the differences in output can stem from any of these three parts, so we need to be really systematic in testing each of the three components of the pipeline against EEGLab in order to pinpoint exactly where the differences are.
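One way to make that systematic is a small helper that compares each stage's Python output with the matching reference exported from EEGLab. The sketch below is only illustrative: the helper name `compare_stage`, the tolerances, and the stand-in arrays are assumptions, and in practice the reference values would be loaded from files saved out of MATLAB.

```python
import numpy as np


def compare_stage(name, python_out, matlab_out, rtol=1e-5, atol=1e-8):
    """Return (ok, max_abs_diff) for one pipeline stage (hypothetical helper)."""
    python_out = np.asarray(python_out, dtype=float)
    matlab_out = np.asarray(matlab_out, dtype=float)
    if python_out.shape != matlab_out.shape:
        raise ValueError(
            f"{name}: shape mismatch {python_out.shape} vs {matlab_out.shape}"
        )
    max_abs_diff = float(np.max(np.abs(python_out - matlab_out)))
    ok = bool(np.allclose(python_out, matlab_out, rtol=rtol, atol=atol))
    return ok, max_abs_diff


# Stand-in arrays; real ones would come from the Python port and from
# values saved out of EEGLab for the same input.
reference = np.linspace(0.0, 1.0, 12).reshape(3, 4)
stages = {
    "features": (reference.copy(), reference),
    "formatting": (reference + 1e-9, reference),
    "network": (reference + 0.1, reference),  # deliberately mismatching
}
report = {name: compare_stage(name, py, ml) for name, (py, ml) in stages.items()}
for name, (ok, diff) in report.items():
    print(f"{name}: match={ok}, max abs diff={diff:.2e}")
```

Reporting the largest absolute difference per stage, rather than a bare pass/fail, makes it easier to see which of the three parts drifts first.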
Eye and heartbeat are correct. The last time we talked, you told me it failed at classifying even simple blinks, did you find what caused this?
Yes that was an incorrect specification of the sampling rate, which I fixed.
Alright, I think the first task for @anandsaini024 will be to run the network in Python and MATLAB, compare the outputs, and familiarize himself with the code. He can also work on the unit test to verify that the network weights match and to verify the formatting part (if you don't beat him to it 😉). As you said, we have to be very systematic in our testing approach and we have to compare MATLAB and Python output at every stage. For `rpsd`, I wonder if there is a way to use the same random generator between both implementations, but let's put that one aside for now.
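One way to sidestep the mismatch, without relying on both languages producing the same random stream, would be to let the Python feature code accept the random draw as an input, so a draw exported from MATLAB can be replayed in tests. The sketch below assumes the random component amounts to picking a random subset of windows; that assumption, the helper name `pick_windows`, and its parameters are illustrative and not taken from either implementation.

```python
import numpy as np


def pick_windows(n_windows, n_keep, indices=None, seed=None):
    """Pick which windows to use (hypothetical helper).

    If ``indices`` is given (e.g. exported from MATLAB, 1-based), replay it
    instead of drawing new random numbers.
    """
    if indices is not None:
        picked = np.asarray(indices, dtype=int) - 1  # MATLAB is 1-based
        if picked.size != n_keep:
            raise ValueError("expected exactly n_keep indices")
        if picked.min() < 0 or picked.max() >= n_windows:
            raise ValueError("index out of range")
        return picked
    rng = np.random.default_rng(seed)
    return rng.permutation(n_windows)[:n_keep]


# Replaying a saved draw gives a deterministic result.
matlab_draw = [3, 1, 5]  # made-up example values, 1-based
replayed = pick_windows(n_windows=5, n_keep=3, indices=matlab_draw)
print(replayed)

# Without a saved draw, a fixed seed still keeps Python runs reproducible.
first = pick_windows(5, 3, seed=0)
second = pick_windows(5, 3, seed=0)
```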
At the moment, he is still waiting for his work permit, but he will pick up his laptop and a MATLAB license this week or at the beginning of next week.
Notes 3/23/22: cc: @jacobf18
Short term goals (to aim to complete by summer)
Our short term goal is to replicate all aspects of EEGLab as exactly as possible. This would result in `v0.1` of mne-icalabel, for which we could submit a short paper to JOSS to generate a DOI for this initial version.
It is possible again that the two runs through Matlab/Python do not match. For example, a unit test for network weights, feature generation and data pipeline would be enough to convince us that all differences in the two networks are solely from a "randomness" injected in the `rpsd` feature sets.
- Anand to work on a unit test of EEGLab's weights vs the PyTorch weights that Jacob ported under `assets/`
- API design TODO: should be compatible with MNE-Python.
Longer term goals
The longer term goal is to really improve the IClabeler. This would involve two efforts: improving the model and improving the benchmarking. The model needs to be benchmarked, so the benchmark is the bulk of the effort. The benchmarking needs to account for real-world variability and requires high-quality training/testing data. The result of this would be `v0.2+` of mne-icalabel and could result in a full-fledged study.
- Future improvements of the network: propose other designs that are benchmarked on a diverse dataset. See point below.
- Benchmarking: incorporate a suite of montages, recording systems, number of electrodes to determine the true "performance" of an ICLabel labeler. It's unclear from the original publication. @mscheltienne can provide 1 subject per system (up to say 20 subjects). Note that all these datasets would be initially "Unlabeled"
Datasets available: NK (I think 1020), EGI, ...
Other "semi-labeled" datasets could come from Alexandre G. from https://swipe4ica.github.io/#/.
Reference:
@agramfort we are thinking of leveraging https://swipe4ica.github.io/#/ to build out an independent and bigger dataset to improve the ICLabel ported over from EEGLab.
I was thinking of having a hs student interested in working with me to contribute to this.
- How easy is it to get access to this dataset from the web app?
- Is it ready for usage?
@mscheltienne in addition, you mentioned you have some external data. How difficult would it be to convert that data to BIDS, so that someone (i.e. this hs student) could run through all the data and label the IC components by hand? The main work for me/us would be setting up an almost tutorial-like IPython notebook so he can understand and conceptualize what he's doing.
All our datasets have to be converted to a common format anyway, so let's go with BIDS. For the ANT Neuro and EGI datasets I can provide, I don't think it will be too much work to convert them to BIDS and to save the ICA decomposition in a BIDS format.
Note for @adam2392 I need to push up the datafiles under `tests/data` and modify the `.gitignore`
FYI: @mscheltienne and @anandsaini024 #11
Now that this is "correct", should we release v0.1 on pypi?
I guess we can. Just one question: do we want to be backward compatible/use a deprecation cycle from release 0.1, or can we keep some flexibility until a later release, e.g. when moved to `mne-tools`?
I'm thinking about the output of the main `label_components` function, currently an array of shape `(n_components, n_classes)`, which is general and can be applied to any model, but might not be super user friendly, since you still have to figure out which component is in which class. That said, I am also not in favor of a single method, e.g. a hard threshold, to attribute a component to a class.
Fair point. Let's finalize this API issue first before moving to v0.1.
Do you know where EEGLab documents the output order of the predicted class probabilities? I think the one I wrote in the examples is right, but it would be nice to double check this and have a reference to something.
Re how to output:
- We can perhaps structure it like a sklearn Transformer, since it just transforms the raw data
- We can have a kwarg to either return predicted probabilities or the argmax and string labels?
- Other ideas?
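To make the second option concrete, here is a rough sketch of a helper that returns either the raw probabilities or the argmax with string labels. The function name `to_labels` and the `return_proba` keyword are hypothetical, not part of mne-icalabel; the class order is the one listed earlier in this thread.

```python
import numpy as np

# Class order as listed earlier in this thread.
CLASSES = ["Brain", "Muscle", "Eye", "Heart", "Line Noise", "Channel Noise", "Other"]


def to_labels(proba, return_proba=False):
    """Turn an (n_components, n_classes) array into labels (hypothetical helper)."""
    proba = np.asarray(proba, dtype=float)
    if proba.ndim != 2 or proba.shape[1] != len(CLASSES):
        raise ValueError(f"expected shape (n_components, {len(CLASSES)})")
    if return_proba:
        return proba
    best = proba.argmax(axis=1)  # index of the most likely class per component
    labels = [CLASSES[i] for i in best]
    confidence = proba[np.arange(proba.shape[0]), best]
    return labels, confidence


# Made-up probabilities for two components.
proba = np.array([
    [0.80, 0.05, 0.05, 0.02, 0.03, 0.02, 0.03],
    [0.10, 0.10, 0.60, 0.05, 0.05, 0.05, 0.05],
])
labels, confidence = to_labels(proba)
print(labels)  # ['Brain', 'Eye']
```

Returning the winning probability alongside the label leaves any thresholding to the caller, so no hard threshold is baked into the function.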
Does this count as documentation? It's probably where I found it the first time..
https://github.com/sccn/ICLabel/blob/e8abc99e0c371ff49eff115cf7955fafc7f7969a/iclabel.m#L60-L62
I'll have a look at the `Transformer`, I am not familiar with it.
Ah yes that counts :p