Comments (2)
Hi Jake,
Thanks a lot for raising this issue. I finally got around to this as I'm planning a fairly large update to the master branch. It won't change the algorithm, but will make the code installable as a Python package along with a bunch of changes in simrun.py to better showcase pySuStaIn's features.
I agree that the way the code was written was plain wrong. Looking at your proposed changes, I realized that if the user passes an array of folds it won't work, so maybe we can use this logic instead:
if select_fold != []:
    Nfolds = len(select_fold)
else:
    select_fold = test_idxs
    Nfolds
This will set the number of folds to what the user passed in or use all folds if the user didn't pass anything in. It will also make sure that select_fold holds the folds to be run.
Then I added this:
for fold in range(Nfolds):
    indx_train = np.array([x for x in range(self.__sustainData.getNumSamples()) if x not in select_fold[fold]])
    indx_test = select_fold[fold]
Where I replaced test_idxs[fold] with select_fold[fold] in both lines.
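Put together, the selection logic and the loop above might look roughly like the following minimal sketch. It is not the library code: `iter_cv_folds` and `n_samples` are hypothetical names standing in for the method body and for `self.__sustainData.getNumSamples()`, and it assumes that `select_fold`, when given, is itself a list of test-index lists.

```python
def iter_cv_folds(n_samples, test_idxs, select_fold=None):
    """Yield (indx_train, indx_test) for each fold to run.

    Hypothetical stand-in for the loop discussed above, not pySuStaIn code.
    """
    # Use the folds the caller passed in, or fall back to all folds.
    if select_fold is None or len(select_fold) == 0:
        select_fold = test_idxs
    n_folds = len(select_fold)

    for fold in range(n_folds):
        indx_test = list(select_fold[fold])
        held_out = set(indx_test)  # set membership is cheaper than scanning a list
        indx_train = [x for x in range(n_samples) if x not in held_out]
        yield indx_train, indx_test


test_idxs = [[0, 1], [2, 3], [4, 5]]
all_folds = list(iter_cv_folds(6, test_idxs))
one_fold = list(iter_cv_folds(6, test_idxs, select_fold=[[2, 3]]))
```

The sketch uses `None` as the default rather than `[]` only to avoid a mutable default argument; an empty list is treated the same way.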
What do you think of this?
from pysustain.
Hi Leon,
Thanks for looking into this. Sorry I didn't respond earlier. I just got around to having another look at this, and I'm still running into a similar issue. I think the source of the issue is ultimately a lack of documentation for this function.
So, as I learned from the tutorial notebook, test_idxs is supposed to be a nested list (i.e., a list of lists). Specifically, it is a length-n list of lists, each containing m indices, where n is the number of folds and m is the number of individuals in the test set for a given fold. (As an aside, this seems less intuitive to me than just an n x m array.)
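For concreteness, a nested test_idxs of that shape could be built along these lines. This is only a sketch under assumptions: `make_test_idxs` is a hypothetical helper, and the tutorial notebook may well construct the folds differently (for example with a stratified splitter).

```python
import random


def make_test_idxs(n_samples, n_folds, seed=0):
    """Return a list of n_folds lists of test indices (hypothetical helper)."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    return [sorted(order[k::n_folds]) for k in range(n_folds)]


test_idxs = make_test_idxs(n_samples=10, n_folds=3)
fold_sizes = [len(fold) for fold in test_idxs]
```

When the number of samples is not divisible by the number of folds, the folds end up with different lengths, which a list of lists can hold but a rectangular n x m array cannot.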
Then, the user is prompted to pass the select_fold argument if he/she wishes to run in a parallel context. The default is an empty list, which to me is unintuitive. Why another list? I would have expected this argument to just be an integer, 0 through n - 1, where n is the number of folds. I'm not sure what the actual argument is supposed to be. It appears from the default (and your prior comment) that it's supposed to be an array, but if the user already did the work to compile the list of lists in the first place, why pass another array for the selected fold? I would find it easier to just pass an integer indicating which fold the user wants to use, and the solution I proposed (janky as it may be) allows that.
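The integer-based interface described above could be sketched roughly as follows. `normalize_select_fold` is a made-up helper name, not anything in pySuStaIn; it assumes fold numbers index into test_idxs and that an empty or missing value means "run every fold".

```python
def normalize_select_fold(select_fold, n_folds):
    """Turn an int, a list of ints, or nothing into a list of fold numbers.

    Hypothetical helper for illustration only.
    """
    if isinstance(select_fold, int):
        folds = [select_fold]              # a single fold, e.g. one parallel job
    elif select_fold is None or len(select_fold) == 0:
        folds = list(range(n_folds))       # nothing passed: run all folds
    else:
        folds = [int(f) for f in select_fold]

    for f in folds:
        if not 0 <= f < n_folds:
            raise ValueError(f"fold {f} is out of range for {n_folds} folds")
    return folds


single = normalize_select_fold(2, 5)
everything = normalize_select_fold([], 3)
subset = normalize_select_fold([0, 2], 3)
```

The loop would then look up test_idxs[fold] for each returned fold number, so each parallel job only needs to be told its own integer.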
I may just be misunderstanding something, and this comment might be obviated by documentation explaining the expected input. But I couldn't find any documentation for this function, nor was there an example in the tutorial notebook.
Anyway, I've changed it locally and everything is fine, so consider this just a suggestion and no worries if you disagree! Just some food for thought.
<3
--Jake