neuronets / trained-models

Trained TensorFlow models for 3D image processing

Home Page: https://neuronets.dev/trained-models

Dockerfile 7.46% Python 92.54%
3d-models keras-models neuroimaging pretrained-models segmentation tensorflow-models


trained-models's Issues

Error pushing models to OSF when adding models via git annex addurl

I tried to push a model to OSF after adding it with git annex addurl and I got an error. Here is what I did:

While in the root path of trained-models:

mkdir UCL/SynthSeg/2.0.0/conventional/weights
git annex addurl --relaxed --file UCL/SynthSeg/2.0.0/conventional/weights/synthseg_2.0.h5 https://www.dropbox.com/s/nu8ap1iicmute3y/synthseg_2.0.h5?dl=1
datalad get UCL/SynthSeg/2.0.0/conventional/weights/synthseg_2.0.h5
datalad push --to origin

I am getting this error:
[ERROR ] KeyError('bytesize') (KeyError)

and the push gets aborted. I tested the downloaded model file with this link and it works, so it seems the model file is OK.
The original Dropbox link has dl=0 at the end; by changing it to dl=1 it is possible to get the file with curl or wget, and the file works fine when used for inference.

Docker versus Singularity

Personal observations/experiences (will be haphazard):

  1. What exactly is the end system (OS, architecture, and so on) we are designing nobrainer tools for? Clusters or personal computers? Linux/macOS/Windows, or all of them?
  2. This question arose because of some personal pain points experienced this week. The models already in the zoo were all built using pip install inside the Dockerfile. However, @WilliamAshbee built his environment using conda, which means we have to find a way to activate the environment as soon as we enter the container. While this has been easy in Docker, it is not so in Singularity, especially for a SIF built from a Docker image.
  3. Going from Docker to Singularity is very time-consuming. Docker is not available on the HPC or on macOS, and Singularity needs a VM to work on Windows and macOS. Using GitHub Actions to build Docker images is a good idea (thanks to @djarecka) but not so useful for debugging because of the lack of caching. Waiting to hear from @dhritimandas about caching on Nix/CircleCI.

Updated Docs

This will update instructions and templates for adding a new model to the repo.

Source URL for Model Weights & Sample Dataset

Hello,

I have added a step to the upcoming issue-form workflow (for adding a model) that retrieves the file extension and uses it to download the weights and sample dataset correctly into a local file with datalad download-url.

The step in question is the following: https://github.com/gaiborjosue/trained-models-fork/actions/runs/6422584380/workflow#L332

The Python script used to retrieve the information is the following: https://github.com/gaiborjosue/trained-models-fork/blob/master/.github/workflows/getFileExtension.py

The reason for adding this step to the workflow is that users can submit their weights in multiple formats (.h5, .pth, etc.) and sample datasets in multiple formats (.nii, .nii.gz, etc.), and there is currently no way for us to know the extension in advance.

However, the Python script above, which uses requests to extract information from the URL's response metadata, currently only works with Google Drive direct-download URLs. Raw GitHub URLs don't provide that information in the metadata, and the same is true of OneDrive download URLs.
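
For reference, here is a minimal sketch of the kind of header-based lookup the script performs (the function name is hypothetical; it assumes the host exposes a filename in the Content-Disposition header, which Google Drive direct-download links generally do but raw GitHub and OneDrive links often do not):

```python
# minimal sketch; guess_extension is a hypothetical helper, not the actual script
import re
import requests

def guess_extension(url):
    # a streaming GET fetches only the headers we need before closing
    resp = requests.get(url, stream=True, allow_redirects=True, timeout=30)
    try:
        disposition = resp.headers.get("Content-Disposition", "")
    finally:
        resp.close()
    match = re.search(r'filename="?([^";]+)"?', disposition)
    if not match:
        return None  # the host did not expose a filename in its metadata
    filename = match.group(1)
    if filename.endswith(".nii.gz"):  # keep the double extension intact
        return "nii.gz"
    return filename.rsplit(".", 1)[-1] if "." in filename else None
```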

My question is the following: is there a better way to extract metadata from the URLs so that we get the file extension back regardless of the URL's source? Or should we ask users to use only Google Drive when submitting the weights and sample-dataset URLs?

Thanks!

two files cannot be retrieved

In the nobrainer Dockerfiles this is executed: https://github.com/neuronets/nobrainer/blob/4059376723379f19adc49ea1d67e6ebaacdd3fea/docker/cpu.Dockerfile#L14

but it results in not being able to fetch these two objects:

#12 77.77 get(error): DDIG/SynthStrip/1.0.0/weights/synthstrip.1.pt (file) [not available; (Note that these git remotes have annex-ignore set: origin)]
#12 77.77 get(error): neuronets/kwyk/0.4.1/bwn_multi/weights/saved_model.pb (file) ['MD5E-s195234--50d0e6fbeb3d84be1dbfd241726ee2e8.pb'
#12 77.77 'MD5E-s195234--50d0e6fbeb3d84be1dbfd241726ee2e8.pb'
#12 77.77 'MD5E-s195234--50d0e6fbeb3d84be1dbfd241726ee2e8.pb']

I think there should be a similar test in this repo.
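
A rough sketch of what such a test could look like, assuming pytest and DataLad's Python API are available in the test environment (the glob pattern and test name are illustrative):

```python
# sketch of a retrievability test over all annexed weight files
import glob

import datalad.api as dl
import pytest

# every file under any model's weights directory (pattern is illustrative)
WEIGHT_FILES = sorted(glob.glob("**/weights/*", recursive=True))

@pytest.mark.parametrize("path", WEIGHT_FILES)
def test_weight_file_is_retrievable(path):
    # on_failure="ignore" returns result records instead of raising,
    # so the assertion can report exactly which file could not be fetched
    results = dl.get(path, on_failure="ignore")
    assert all(r.get("status") in ("ok", "notneeded") for r in results), \
        f"could not retrieve {path}"
```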

update readme and add tutorials

  1. Users should be able to run the Docker/Singularity forms of these models. Currently the README neither covers how to do this nor points to the Docker container.

  2. Users have to navigate to the release to understand how to load a model and do transfer learning. This should become a Colab tutorial.

  3. There should also be a tutorial inference example with the pre-trained model + nibabel (i.e., the Docker code).

Revamp existing GitHub workflows

add_model.yml and get_model_data.yml don't seem to be working as expected. We should probably rewrite them or think of cleaner/better ways to integrate them into the workflow.

Disable running the action every time an issue is commented on or replied to

change_label.yml runs every time a comment or reply is added to an issue. We should constrain this behavior to issues that are specific to adding models, not all issues. Creating labels is one way to minimize unnecessary runs.

For example, we should change

on:
  issue_comment:
    types:
      - created

to

on:
  issues:
    types:
      - labeled

and put the onus of labeling/commenting on the user after a failure.

Clean up the Singularity test command in new_model.yml and update_model.yml

Note that the path bindings (--bind) in the singularity calls in

singularity exec --nv --bind /actions-runner,/actions-runner/_work/,/actions-runner/_work/${{ github.event.repository.name }}/${{ github.event.repository.name }}/${{ needs.create_new_branch.outputs.MODELPATH }}/,/actions-runner/_work/${{ github.event.repository.name }}/${{ github.event.repository.name }}/${{ needs.create_new_branch.outputs.MODELPATH }}/docker,/actions-runner/_work/${{ github.event.repository.name }}/${{ github.event.repository.name }}/${{ needs.create_new_branch.outputs.MODELPATH }}/weights,/actions-runner/_work/${{ github.event.repository.name }}/${{ github.event.repository.name }}/${{ needs.create_new_branch.outputs.MODELPATH }}/example-data:/output ./${{ needs.create_new_branch.outputs.MODELPATH }}/docker/${{ needs.build-docker.outputs.MODELNAME }}.sif ${{ needs.create_new_branch.outputs.TESTCOMMAND }}
and
singularity exec --nv --bind /actions-runner,/actions-runner/_work/,/actions-runner/_work/${{ github.event.repository.name }}/${{ github.event.repository.name }}/${{ needs.create_new_branch.outputs.MODELPATH }}/,/actions-runner/_work/${{ github.event.repository.name }}/${{ github.event.repository.name }}/${{ needs.create_new_branch.outputs.MODELPATH }}/docker,/actions-runner/_work/${{ github.event.repository.name }}/${{ github.event.repository.name }}/${{ needs.create_new_branch.outputs.MODELPATH }}/weights,/actions-runner/_work/${{ github.event.repository.name }}/${{ github.event.repository.name }}/${{ needs.create_new_branch.outputs.MODELPATH }}/example-data:/output ./${{ needs.create_new_branch.outputs.MODELPATH }}/docker/${{ needs.build-docker.outputs.MODELNAME }}.sif ${{ needs.create_new_branch.outputs.TESTCOMMAND }}
are redundant: the destination /output is never used in the test command.

For more info, please refer to the outputs of the successful workflow runs on a simplified version of new_model.yml.

Move fields in spec.yaml to model card

#### model information and help
model:
  model_name: "deepcsr"
  description: "3D deep learning framework for cortical surface reconstruction from MRI"
  structure: "Meshnet"
  training_mode: "supervised"
  model_url: "https://github.com/neuroneural/DeepCSR-fork/"
  Zoo_function: "predict"
  example: "nobrainer-zoo predict -m DeepCSR/deepcsr/1.0 <path_to_input>"
  note: "The output will be the stl files and will be saved in the config file's specified directory. For the input config file examples please see the /configs/predict.yaml file inside the model's repo."
  input_file_type: "yaml"
  model_details: ""
  intended_use: ""
  factors: ""
  metrics: ""
  eval_data: ""
  training_data: ""
  quant_analyses: ""
  ethical_considerations: ""
  caveats_recs: ""

@gaiborjosue We should remove some of these fields as they will be repeated in the model card.

Mismatch between hardware referenced in the Dockerfiles and hardware we have at our disposal

There is a mismatch between the hardware supported by the CUDA versions referenced in the Dockerfiles and the hardware we have at our disposal (for testing purposes). Requesting the needed resources (which are outside the gablab-reserved resources) is taking a lot of time.

  • Updating the versions of CUDA and torch/TensorFlow in the Dockerfile is one option, but this can have unintended consequences.
  • DANDI Hub could be an option, but the assigned space is very small.
  • Moving this testing to AWS seems the most likely option.

Building a zoo that can hold different models from different environments means we should have the requisite range of hardware (GPU cards) to support such diverse environments. However, the range of cards we have is limited, and this impacts our ability to test all models. Am I missing any other simple solution to this problem?

No space left on device while creating a SIF image

Hello,

While building a Singularity image from Docker Hub, I get this error:

[screenshot of the "no space left on device" error]

The command I used was: singularity build pialnn.sif docker://edwardjosue2005/pialnn

@hvgazula tested building the image on his end, and it did work.

I already changed Singularity's cache directory to om2.

Add a "Test-command" input text field to the form

Hello,

I am considering adding a "Test-command" field to the upcoming issue-form workflow for adding a new model to this repo.

This is because, to test the submitted Python scripts, we need to know how to run them. If we use spec.yaml for that, it contains placeholders for the paths, such as "{infile[0]}".

So I suggest adding a new input text field where users enter how to run their predict.py file (it can have any name), without any flags containing paths.

For example, the user would input:
python predict.py --length 15 --height 50

This will enable us to manually add the paths to the weights and sample dataset while testing, following our directory structure as suggested by @hvgazula.

For example, we would append the following paths:
python predict.py --length 15 --height 50 --weights ./Model/best_model.pth --sampleDataset ./Model/test.nii.gz

This way, the user only provides the bare-bones test command, and we take care of adding the paths automatically.
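
A hypothetical sketch of that path-appending step (the function and flag names are illustrative, not the actual workflow code):

```python
# hypothetical helper: append repo-managed paths to the user's bare test command
def build_test_command(user_command, weights_path, sample_path):
    return f"{user_command} --weights {weights_path} --sampleDataset {sample_path}"

# build_test_command("python predict.py --length 15 --height 50",
#                    "./Model/best_model.pth", "./Model/test.nii.gz")
# -> "python predict.py --length 15 --height 50 --weights ./Model/best_model.pth --sampleDataset ./Model/test.nii.gz"
```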

Thanks.

Add a model and Update a model through Issue Forms

This will enable users to add their model by only entering URLs for their folders/files, and the workflow will add their model to this repository.

If the user adds a model and the action fails, it will assign the issue a "failed" label, and the user can update the URLs and change the label to rerun the action. Also, if the user already has a model in the zoo and wants to update it, there is a second issue form that, following the recommendations given to the user in the form, will update specific files/folders of the model.

The workflow will also open a Pull request, create a branch, and link it to the opened issue. When merged, the issue will close.

Pending tasks for the workflow actions/steps:
Update Model Workflow:

  • Create a new folder with the new version within the model's directory.
  • Clone folders/files that are not being updated from the immediately previous version.
  • Test model's Python scripts with EC2 setup.
  • Finish testing steps
  • Add how to update a model (client side) to the readme.

Add Model Workflow:

  • Test model's Python scripts with EC2 setup.
  • Remove old workflow for adding a model
  • Add how to use this workflow (client side) to the README.

These updates can be seen in my forked repository: https://github.com/gaiborjosue/trained-models-fork/tree/master/.github

Also, these updates will overwrite/disable the previous workflow of adding a model directly through Pull Request #67 since adding/updating through an issue form is an enhancement to the pipeline.

Conda environment activation at startup - Singularity

Hello @satra and @hvgazula, I hope you are doing well.

Following your suggestion of activating the environment at startup in a Singularity image, I can't find a working solution. I tried Neurodocker's recipe-file generation, but when building the Singularity image, I got the following:

FATAL: You must be the root user, however you can use --remote or --fakeroot to build from a Singularity recipe file

Also, referring to ReproNim/neurodocker#354 and ReproNim/neurodocker#346, the general conclusion seems to be to install everything in the base environment.

In our scenario, we are not building a Singularity image from scratch; we simply build a SIF image from a Docker image available on the hub.

Your help and guidance on this issue would be greatly appreciated.

Thank you!

Transfer learning from braingen model

Is it possible to use the braingen model for transfer learning for a segmentation problem, similar to how brainy was used for transfer learning in the AMS paper?
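
For context, a generic Keras-style sketch of the kind of setup being asked about. It assumes the braingen weights load as a standard functional Keras model whose trunk can be reused as a frozen feature extractor; the file name, head, and shapes are illustrative and not the actual braingen architecture:

```python
import tensorflow as tf

# assumption: the pretrained weights are available locally as a Keras HDF5 model
base = tf.keras.models.load_model("braingen_weights.h5", compile=False)
base.trainable = False  # freeze the pretrained trunk

# illustrative segmentation head on top of the frozen features
head = tf.keras.layers.Conv3D(1, 1, activation="sigmoid")(base.output)
model = tf.keras.Model(inputs=base.input, outputs=head)

model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit(...) would then be run on the new segmentation data
```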

Push docker image to hub with tag

docker push ${{ needs.build-docker.outputs.IMAGENAME }}

In the line above, docker push ${{ needs.build-docker.outputs.IMAGENAME }} should be replaced with docker push ${{ needs.build-docker.outputs.IMAGENAME }}:${{ steps.modelVersion.outputs.model_version }}; otherwise the default image gets pushed to the hub. For example, see below: I have a test image and tagged it as aws. Docker created a new image instead of renaming the existing one, meaning that running docker push test is equivalent to docker push test:latest, not docker push test:aws.

[screenshot showing the test and test:aws images]

Error downloading files with a Google Drive URL

Directly downloading models from Google Drive URLs is not always possible. For example, if the file is too big, we get the following message (meaning the HTML page will be downloaded instead of the actual file):

[screenshot of the Google Drive download-warning page]

This is not a problem for small files. A workaround is to extract the direct download URL from the JavaScript elements of that page, but this is a hassle.
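
One possible workaround (my suggestion, not something the workflow currently does) is to delegate the confirmation handling to the third-party gdown package, which knows how to get past the warning page for large files:

```python
# sketch: download a large Google Drive file with gdown (pip install gdown);
# the file id below is a placeholder
import gdown

url = "https://drive.google.com/uc?id=<FILE_ID>"
gdown.download(url, output="weights.h5", quiet=False)
```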

add meningioma model


Add the meningioma model once it has been improved. The median Dice score is currently 0.85. The model struggles mostly with small tumors and with tumors on the borders of the non-overlapping cubes used for prediction.

Create a utility script for model card

The utility script should have a template (YAML/config/dict) for the model card that can be used in two ways: one for display during PR creation (for merging the model), and the other for converting the values entered in the template into the model_card.md.
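
A minimal sketch of such a utility, assuming the template is read with PyYAML and rendered to markdown (the field list, file names, and function name are illustrative):

```python
# sketch: render model_card.md (and a PR-comment body) from a YAML template
import yaml

CARD_FIELDS = [
    "model_details", "intended_use", "factors", "metrics",
    "eval_data", "training_data", "quant_analyses",
    "ethical_considerations", "caveats_recs",
]

def render_model_card(template_path, out_path="model_card.md"):
    with open(template_path) as f:
        spec = yaml.safe_load(f)["model"]
    lines = [f"# {spec.get('model_name', 'unnamed model')}", ""]
    for field in CARD_FIELDS:
        lines.append(f"## {field.replace('_', ' ').title()}")
        lines.append(spec.get(field) or "_not provided_")
        lines.append("")
    card = "\n".join(lines)
    with open(out_path, "w") as f:
        f.write(card)
    return card  # the same text can be posted as the PR comment
```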

Testing functionality for uploaded models

Discussed in #61

Originally posted by hvgazula July 20, 2023
How do we ensure the integrity of the train/predict/{any other}.py scripts uploaded by the users? Are the scripts really doing their job? Do we test this in trained-models or nobrainer-zoo?

@satra @gaiborjosue

add model files in git-annex

When this repo is cloned, it would be nice to be able to download all of the pre-trained models easily. For example...

git clone https://github.com/neuronets/nobrainer-models
cd nobrainer-models
get-models ???

I tried using git-annex in the branch https://github.com/neuronets/nobrainer-models/tree/add/gitannex, but when I clone the repository, the content of https://github.com/neuronets/nobrainer-models/blob/add/gitannex/sig/ams/meningioma_T1wc_128iso_v1.h5 cannot be found, even though it is available online at https://dl.dropbox.com/s/whbeot2wriab9v2/meningioma_T1wc_128iso_v1.h5.

When I run git-annex whereis on my laptop, I get the correct remote:

git-annex whereis
whereis sig/ams/meningioma_T1wc_128iso_v1.h5 (2 copies) 
  	00000000-0000-0000-0000-000000000001 -- web
   	af2fc714-62b2-48ea-9a90-b62db6ff2aa7 -- jakub@dash:/code/nobrainer-models [here]

  web: https://dl.dropbox.com/s/whbeot2wriab9v2/meningioma_T1wc_128iso_v1.h5
ok

@yarikoptic, is there any chance you can help me out with this? How can I clone this repository and have git-annex know the correct remote URLs?
