neuroscout / neuroscout-paper

Neuroscout paper analysis repository

Home Page: https://neuroscout.github.io/neuroscout-paper/

License: Other

neuroscout-paper's Introduction

neuroscout

This is the repository for the neuroscout server.

Requirements: Docker and docker-compose.

Configuration

First, set up the main environment variables in .env (see: .env.example). Set DATASET_DIR, KEY_DIR, and FILE_DATA to folders on the host machine.
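A minimal `.env` sketch; the paths below are hypothetical and should point at real folders on the host:

```shell
# Hypothetical host paths -- replace with real folders on your machine.
DATASET_DIR=/home/user/neuroscout/datasets
KEY_DIR=/home/user/neuroscout/keys
FILE_DATA=/home/user/neuroscout/file-data
```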

Optionally, set up pliers API keys for feature extraction in .pliersenv (see: .pliersenv.example). See the pliers documentation for more information on API keys.

Next, set up the Flask server's environment variables by modifying neuroscout/config/example_app.py and saving as neuroscout/config/app.py.

Finally, set up the frontend's env variables by modifying neuroscout/frontend/src/config.ts.example and saving as neuroscout/frontend/src/config.ts.

For single sign-on using Google, a Google sign-in project is needed.

Initializing the backend

Build the containers and start services using the development configuration:

docker-compose -f docker-compose.yml -f docker-compose.dev.yml build
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d

The server should now be running at http://localhost/.

Next, initialize, migrate, and upgrade the database. If you have a database dump, load it using pg_restore. Otherwise, delete the migrations folder, initialize the database, and add a test user:

docker-compose exec neuroscout bash
rm -rf /migrations/migrations
python manage.py db init
python manage.py db migrate
python manage.py db upgrade
python manage.py add_user useremail password

Staging & production server

For the staging server, you can trigger a manual build as follows:

docker-compose -f docker-compose.yml -f docker-compose.build.yml build
docker-compose -f docker-compose.yml -f docker-compose.build.yml up -d

For the staging or production server, you can instead pull a pre-built image from GHCR. First, set the variable IMAGE_TAG to the appropriate image tag:

docker-compose -f docker-compose.yml -f docker-compose.image.yml build
docker-compose -f docker-compose.yml -f docker-compose.image.yml up -d

Setting up front end

The frontend dependencies are managed using yarn.

Enter the neuroscout container and install the necessary libraries:

docker-compose exec neuroscout bash
cd frontend
yarn

You can then start a development server:

yarn start

Or make a production build:

yarn build

Ingesting datasets and extracting features

You can use manage.py commands to ingest data into the database. Run the following commands inside the neuroscout container: docker-compose exec neuroscout bash

To add a BIDS dataset:

python manage.py add_task bids_directory_path task_name

For example, for dataset ds009:

python manage.py add_task /datasets/ds009 emotionalregulation

Finally, once a dataset has been added to the database, you can extract features into the database using pliers as follows:

python manage.py extract_features bids_directory_path task_name graph_json

For example:

python manage.py extract_features /datasets/ds009 emotionalregulation graph.json

Even easier is to use a preconfigured dataset config file, such as:

docker-compose exec neuroscout python manage.py ingest_from_json /neuroscout/config/ds009.json

Maintaining the Docker image and database

If you make a change to /neuroscout, you should be able to simply restart the server.

docker-compose restart neuroscout

If you need to upgrade the db after changing any models:

docker-compose exec neuroscout python manage.py db migrate
docker-compose exec neuroscout python manage.py db upgrade

To inspect the database using psql:

docker-compose run postgres psql -U postgres -h postgres

API

Once the server is up and running, you can access the API however you'd like.

The API is documented using Swagger UI at:

http://localhost/swagger-ui

Authorization

To authorize API requests, we use JSON Web Tokens via Flask-JWT. Simply POST the following to localhost:5000/auth:

{
    "username": "[email protected]",
    "password": "string"
}

You will receive an authorization token in return, such as:

{
    "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZGVudGl0eSI6MSwiaWF0IjoxNDQ0OTE3NjQwLCJuYmYiOjE0NDQ5MTc2NDAsImV4cCI6MTQ0NDkxNzk0MH0.KPmI6WSjRjlpzecPvs3q_T3cJQvAgJvaQAPtk1abC_E"
}

You can then insert this token into the header to authorize API requests:

GET /protected HTTP/1.1
Authorization: JWT eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZGVudGl0eSI6MSwiaWF0IjoxNDQ0OTE3NjQwLCJuYmYiOjE0NDQ5MTc2NDAsImV4cCI6MTQ0NDkxNzk0MH0.KPmI6WSjRjlpzecPvs3q_T3cJQvAgJvaQAPtk1abC_E

Note that in order to use any protected routes, you must confirm the email on your account. Confusingly, you can get a valid token without confirming your account, but protected routes will not function until confirmation.
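The token flow above can be sketched in Python using only the standard library. The `get_token` and `jwt_header` helpers below are illustrative, not part of the Neuroscout codebase, and `BASE_URL` assumes a local development deployment:

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # adjust to your deployment


def jwt_header(token):
    """Build the Authorization header expected by Flask-JWT."""
    return {"Authorization": "JWT " + token}


def get_token(username, password):
    """POST credentials to /auth and return the access token."""
    payload = json.dumps({"username": username, "password": password}).encode()
    req = urllib.request.Request(
        BASE_URL + "/auth", data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access_token"]


# Usage (requires a running server and a confirmed account):
# token = get_token("user@example.com", "string")
# urllib.request.urlopen(urllib.request.Request(
#     BASE_URL + "/protected", headers=jwt_header(token)))
```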

Running backend tests

To run tests, after starting services, create a test database:

docker-compose exec postgres psql -h postgres -U postgres -c "create database scout_test"

and execute:

docker-compose run -e "APP_SETTINGS=neuroscout.config.app.DockerTestConfig" --rm -w /neuroscout neuroscout python -m pytest neuroscout/tests

or run them interactively:

docker-compose exec neuroscout bash
APP_SETTINGS=neuroscout.config.app.DockerTestConfig python -m pytest neuroscout/tests/ --pdb

To run frontend tests run:

docker-compose run --rm -w /neuroscout/neuroscout/frontend neuroscout npm test

Running frontend tests

To run frontend tests, have Cypress 6.0 or greater installed locally. First, ensure neuroscout is running:

docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d

Next, set up the test environment:

docker-compose exec neuroscout bash
export APP_SETTINGS=neuroscout.config.app.DockerTestConfig
bash setup_frontend_tests.sh

In a separate window, you can run cypress:

cd neuroscout/frontend
cypress open

Once done, kill the first command, and run the following to tear down the test database:

docker-compose exec -e APP_SETTINGS=neuroscout.config.app.DockerTestConfig neuroscout python manage.py teardown_test_db

neuroscout-paper's People

Contributors: adelavega, jdkent, peerherholz, rbroc, tyarkoni

neuroscout-paper's Issues

Final checklist

  • Export to Overleaf
  • Add Acknowledgments
  • Table 1: Add links
  • Fig 3 label is on dotted line
  • Add line numbers (perhaps make 2 version at last minute)
  • Consistent use of em dash (—). It should be specifically that character and have no spaces around it.
  • The FaceNet methods section has several variables not presented (e.g. first_time_face)
  • Rebuild jupyter book and link
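For the em dash item, a small normalization pass could enforce the convention; a minimal sketch (`normalize_em_dashes` is a hypothetical helper, and the pattern may need tuning if the text uses double hyphens for other purposes):

```python
import re


def normalize_em_dashes(text):
    """Collapse spaced em dashes and double hyphens into a bare em dash."""
    return re.sub(r"\s*(?:\u2014|--)\s*", "\u2014", text)
```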

Citation related issues:

  • fMRIPrep section needs references properly cited
  • Manually check all refs
  • Markiewicz citations look odd, e.g.: C. Markiewicz et al., 2021; C. J. Markiewicz et al., 2021 - statsmodels and fitlins respectively

shorten methods

remove feature descriptions for features we do not use (e.g., BERT or AudioSet)

re-run some audioset models

re-run some audioset meta-analyses w/ new datasets (e.g., speech, music, whistling?), possibly with thresholding

NV upload failing for A25Bv

Upload for A25Bv (frequency model for LTS) fails with the following traceback (using neuroscout-upload):

Traceback (most recent call last):
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/pyns/api.py", line 114, in _make_request
    resp.raise_for_status()
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 422 Client Error: UNPROCESSABLE ENTITY for url: https://neuroscout.org/api/analyses/A25Bv/upload

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/miniconda-latest/envs/neuro/bin/neuroscout", line 11, in <module>
    load_entry_point('neuroscout-cli', 'console_scripts', 'neuroscout')()
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/neuroscout_cli/cli.py", line 65, in main
    command(deepcopy(args)).run()
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/neuroscout_cli/commands/upload.py", line 8, in run
    return super().run(upload_only=True)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/neuroscout_cli/commands/run.py", line 112, in run
    n_subjects=n_subjects)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/pyns/models/analysis.py", line 277, in upload_neurovault
    n_subjects=n_subjects, collection_id=collection_id)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/pyns/api.py", line 121, in _make_request
    raise requests.exceptions.HTTPError(error)
requests.exceptions.HTTPError: 422 Client Error: UNPROCESSABLE ENTITY for url: https://neuroscout.org/api/analyses/A25Bv/upload

Once this is solved, #6 can be merged.

Remove empty nodes from json collections

Datasets with no model (e.g. studyforrest for entropy models) are still included in the collections as empty nodes (studyforrest: {}). We should drop from all collections all nodes for datasets with no associated analysis.
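A minimal sketch of the cleanup, assuming each collection is a dict mapping dataset names to analysis nodes (`drop_empty_nodes` is a hypothetical helper, not part of the codebase):

```python
def drop_empty_nodes(collection):
    """Return a copy of a collection dict without empty dataset nodes."""
    return {dataset: node for dataset, node in collection.items() if node}


# e.g. drop_empty_nodes({"studyforrest": {}, "Budapest": {"analysis": "A1"}})
# keeps only the Budapest node.
```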

inspect LM surprisal

  • Get surprisal metrics for different models and across transcript vs. force-aligned (= no punctuation)
  • Look at correlations between models
  • Qualitative inspection of examples
  • For now, only focus on window_size = 25

run single-predictor mel models

Fit separate models with re-extracted mel features to reconstruct tonotopic maps (not necessarily relevant for the paper, but as a preliminary result for the OHBM submission and to kick-start some audio analyses).
The analysis should probably be set up as classification.

finalize results

especially:

  • shorten FFA and frequency paragraphs
  • add Lancaster norms discussion

fix double stat map (when re-uploaded w/ space- entity)

In some analyses that were re-run w/ AFNI, the original images without the space- entity (typically run w/ nilearn) were not overwritten.

Thus, they were uploaded again to the new AFNI-only collection.

Detect and correct these NV collections.
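One possible detection pass: group image names that differ only by a BIDS-style space- entity and flag groups with more than one member. `find_duplicate_maps` is a hypothetical helper, and the pattern may need adjusting to the actual NeuroVault naming:

```python
import re
from collections import defaultdict


def find_duplicate_maps(names):
    """Group image names that differ only by the presence of a space- entity."""
    groups = defaultdict(list)
    for name in names:
        # Strip a BIDS-style space- entity to build a comparison key.
        key = re.sub(r"_?space-[A-Za-z0-9]+", "", name)
        groups[key].append(name)
    # Keep only keys with more than one image: likely duplicates.
    return {key: dups for key, dups in groups.items() if len(dups) > 1}
```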

(re)run analyses for final version of paper

Single predictor models:

  • Should be all done

Frequency analyses:

  • We need to look into which models to report, but probably something incremental
  • Run on NNDB and Narratives
  • Check consistency in estimators and if inconsistent rerun

Shot-change:

  • Re-run on NNDB
  • Check consistency in estimators and if inconsistent rerun

Lancaster norms:

  • Rerun on NNDB and Narratives
  • Check consistency in estimators and if inconsistent rerun

FaceNet:

  • @adelavega extracts NNDB
  • Run NNDB
  • Check consistency in estimators and if inconsistent rerun

AudioSet:

  • Run music on NNDB and Narratives (need to be extracted)
  • Maybe explore couple of other features (selectively pick them from the ontology)
  • Check results and make decision on whether to keep them
  • If we keep them, check consistency in estimators and if inconsistent rerun

BERT:


Reading brain dataset:

  • Ignore for now, but maybe run frequency on it after everything else is done.

Include report plots and regressor plots

Once the set of models is final, we should re-run all notebooks so as to include at least some sample reports and timeline/distribution plots for regressors of interest.

write up discussion/conclusions

Points maybe worth mentioning

  • Neuroscout makes multi-dataset reproducible workflows accessible
  • Makes using novel features easy
  • Not mutually exclusive w/ experimental research
  • Caveats in the interpretation of results (e.g., features are model dependent)
  • What next
    • More datasets and features
    • Dataset release
    • Enable browsing
    • Better integration with meta-analysis workflow
    • Support other models?

refine plotting utils

Create utils for surface plots (at least for single-predictor models); we need more compact ways to visualize results.

Dataset or feature specific issues / irregularities

This is just a record of some "quirks" that are dataset or feature specific. Feel free to edit and add to the list.
We will work on making a more general version of this for public consumption.

  • StudyForrest is in German and has no ingested speech transcript
  • The Sherlock movie is present in the Sherlock and SherlockMerlin datasets, with different subjects. We're going to focus on the Sherlock dataset for the Sherlock task, and the Merlin task from the SherlockMerlin dataset.
  • SchematicNarrative has no "tokenized" BERT features because there are independent stimuli within each run.
    See: neuroscout/neuroscout#772
  • Life dataset has no faces due to the nature of the stimuli
