berkeley-stat159 / project-epsilon Goto Github PK

View Code? Open in Web Editor NEW

0.0 7.0 6.0 283.62 MB

License: BSD 3-Clause "New" or "Revised" License

Makefile 1.69% TeX 30.48% Python 67.84%

project-epsilon's Issues

Remove the folder txt_files ?

Hello @mingujo

Can I remove the folder txt_files in the code folder ?
You should have evrything in the txt_output, right ?

Thanks

In-line comments

In at least some of your code, it appears you are using the docstring style comments for in-line commenting.

As we discussed in lab, in-line comments should be denoted with # instead of a string literal. Consider perusing your code and making these changes

I don't believe I specified an exact page length either. I suspect that many of you will have something like 12 to 15 pages including figures and references. However, if you've done substantial work you may need to use more pages to clearly explain your work.

You should strive to be succinct, but clear. You should focus on explaining what you did, rather than what the authors of the original paper did or on what the analysis of fMRI data involves in general terms. You should try organize the main body of your report into a clear narrative. This may mean that part of what you did this semester turned out to be ancillary to your central analysis. If this is the case, you may consider moving the ancillary material to an appendix, which you should include at the end of your paper.

Remember that the purpose of your report is not to convince me that you performed a lot of work this semester. Rather the purpose of the report is to explain what question (questions) your team ultimately pursued and how and to what extent that question was (those question were) addressed. I will look at your code and commit history to determine how much work you did. The paper should help me understand why you did what you did and what you learned doing it.

@soazig @lizhua @ye-zhi @mingujo

Resample filtered data + Mask + download

Hello @jarrodmillman @matthew-brett @rossbar

We have been exploring the filtered data for comparison across subjects.

Thank you for the provided template for the filtered data : https://nipy.bic.berkeley.edu/rcsds/mni_icbm152_nlin_asym_09c_2mm/resample_templates.py

In the script, it seems like they apply all of the mni_icbm* files on the data.
Do we have to do that or can we just use the 2 most interesting you mentioned :
to display pattern: mni_icbm152_t1_tal_nlin_asym_09c_2mm.nii
and the mask : mni_icbm152_t1_tal_nlin_asym_09c_mask_2mm.nii
My plan is to apply the mask first and then the template to visualize the data.
=> Do you think that is sufficient ?

For unfiltered data, I was using the technique shown in class to come up with a mask (with the histogram plots). Other groups are using a file in the anatomy folder of the data called 'inplane_brain.nii.gz", Our group have the data called "highres001_brain_mask.nii.gz".
The orther files are highres001.nii.gz, highres001_brain.nii.gz, inplane.nii.gz.
=> Do you think we can use one of this file to mask the data for each subject's run's?
=> I think, *_brain.nii.gz can be used to select the voxels in the brain. I am not sure about the difference between 'inplane' and 'highresr001', or even what the highres001_brain_mask.nii.gz represents.

Also, we have a discussion in our group on whether programming the download of the .zip files for the filtered data or download them one by one.
I checked the hashes for 2 files downloaded in the 2 different ways and they are the same.
We are just a bit suspicious because downloading the files take ~10s by file while downloading the .zip takes 5hours.
=> Do you have any thoughts on that ?

Thank you !
@mingujo @ye-zhi @lizhua @timothy1191xa

General question : Is it better to display the images while running the analysis ?

Hello @rossbar , @matthew-brett , @jarrodmillman

I was wondering if there is a best practice for dealing with figures, plots and results.
We are rendering lost of plots and text files but we only use a few of them in the report.
I was wondering if this would be better to display the pictures that are in the report while the analysis is run ?

Thank you !

requirements.txt

Make sure that your requirements.txt is up-to-date with the packages you are actually using in your code. I noticed there is at least one dependency (nibabel) in your code that is not included.

Can I merge the 2 branch master /revert-7-master ?

Hello @mingujo , @timothy1191xa , @ye-zhi , @lizhua .
Can I merge the 2 branch master /revert-7-master ?

Collaborative work

Hey @mingujo,

When 2 people are working on the code, that would be nice if this is reflected on the repo as well.
I pointed out this new function for events-neural_high from last Thursday lecture and started the code today in class with you - that should have been nice to see my contribution in the history as well. Remember that one of the criteria for this project is collaborative work.

Please remove your branch timothy-test from the common repo

Hello @timothy1191xa,

Can you remove your branch timothy-test from the common repo ?
This branch should be on your personal repo is you need it.
Thank you

Please see our progess report from the branch dev

Hello @jarrodmillman, @rossbar

We got difficulties with our branches master and dev.
If it is still possible to review it, please consider our progress report from the dev branch.
Thank you

GLM and T-test

@lizhua @ye-zhi @mingujo

So the glm and t_test script were only set for subject 1, 5, 11.
Don't we need all the betas for the multi comparison ?

The t_test_plot script has a lot of unneeded lines + looping through subject 1,5,11. don't you only use the subject 1 for he report ?

Function for plots should be fine

Hello @soazig @mingujo @lizhua @ye-zhi

Matthew replied -

Don't worry too much about the absolute value for the coverage. It's hard to do tests for the plots. We'll look at what code you haven't covered, and take into account if it was hard to cover.

So can I just merge it if the friggin Travis only shows that I would lower the coverage but no error?

Do not merge to Tests when test fails

Hello @ye-zhi

You should not have merged this since it failed the Travis.
Please keep the Test branch clean and revert this merge
Seems like this can be fixed easily :
IOError: [Errno 2] No such file or directory: "../../../data/ds005/sub011/BOLD/task001_run003/bold.nii"
see here https://travis-ci.org/berkeley-stat159/project-epsilon/jobs/96216714

If you test your test with nosetests, you can avoid this situation.
thanks

Working on new data - same hashes when downloaded one by one or .tar

Hello @mingujo , @ye-zhi

You told me you managed to work with the new data. As I said yesterday, when I do data = img.get_data() I got an error.
Can you put the link of the scripts you did that ?
I must do something wrong.

Also, I checked the hashes of the files you sent me. They are the same as the files I can download one by one. So I will keep the download data one by one since thus takes only 10s/data.
I was thinking I could just unzip them in the same ds005 directory as the original data. what do you think ?

Thanks!

make analysis doesn't work

Your targets for the analysis (all-analysis, and the individual analyses themselves) don't work. If you are planning on fixing this, please push the changes soon so we can evaluate the project. If not, apply the wontfix label and we will try to run your code by hand

Final report draft feedback

project-epsilon (note: on 'dev' branch)

For both your preprocessing and linear regression, you might want to include
images that show some of the things that you explain in the paragraphs
(picture is worth a thousand words etc etc). For instance, the design matrix
for your linear regression might be interesting, and a visual representation
of the behavior stimuli over time would also be illuminating.
Same goes for section 3.3 - rather than explain what you did in words, show
the QA plots that you reproduced, the correlation vectors, etc.
For figures 4 and 5 - you might try to move the colorbars to the right of all
the images so that they are visible. Better yet, if you want the readers to
be able to compare the images between figures 4 and 5, you can set the
colorscales to be the same. This is easy to do with the vmin and vmax kwargs
to imshow.
This may or may not be possible, but you might want to focus on specific
areas of the brain rather than showing slices throughout the depth of the
entire brain. For example, if you identify regions of high activity, you can
use a boolean mask to select only that ROI.
How does the linear drift analysis compare when you use the corrected data
that Matthew put on the website?
The normalization in figure 15 seems off - you say the first component
accounts for 98% of the variability but it only goes up to a value of .003
on the y-axis of the plot. You might consider a different normalization
(I'm assuming you used the normed=True keyword in the histogram function) so
that your conclusions in the text match the results you display.

Makefile in data - No need to unzip all the files

Hello @mingujo

As I said, we don't need to unzip all the files when we download them.
We definitely don't need to unzip the BOLD data.
I think we should remove the "unzip" part of the Makefile and see if the Travis can pass the test.

Bibliography reference rename

In your project.bib in the paper/ dir, you seem to have updated the citation information without changing the citation label (i.e. it still is under the tag lindquist2008statistical).

This would get very confusing in the actual paper as that tag (lindquist...) is how you refer to the reference in the report.tex. Therefore, the tag should match the actual paper you are referencing:

A standard (though not necessary) format for BibTex citation references is

feedback

Good, clear description of the loss-aversion paper.

Great work with the initial regressor analysis; I liked to see the results of
the preliminary regressor applied to the data - represents real, tangible
progress toward an ultimate goal.

Clear presentation of the goal to experiment with the HRF model - was less clear
about where the motivation to do so came from; might want to justify the
decision to investigate this aspect a bit more. For example, there is a published
literature addressing this question to some extent. You don't need to replicate
the state of the art here, but you should reference existing work as part of
your motivation.

Seemed to have a very solid understanding of what they knew about the data, and
what they still needed to understand (PTvals for instance). Also have clearly
done literature review outside of their paper to help with the coding aspect
(PEP8).

The group seems clearly capable and invested in the project, just might want to
focus on specific tasks a bit more.

You should make sure your code is valid. For instance,

$ python eda_behav.py
  File "eda_behav.py", line 3
    %matplotlib
    ^
SyntaxError: invalid syntax

You still have a lot of work to do, but there is still time.

Function for plots

Hello, @jarrodmillman @rossbar @matthew-brett

We have a function that can generate the plot. However, we don't really know how to write a test for this function, and it will just lower the coverage in Travis CI (we usually put codes involving plots in scripts so the coverage wouldn't be affected). I was wondering if we should just rewrite the code to generate such plot.

Thanks!

Template for dictionary to select filter or normal data

Hello @lizhua

Here is the file.

noise-modeling_script.txt

Can someone put 2 of the new images in the data file so I can work on it ?

Hello guys,

When I download the data one by one, I cannot use them for analysis.
Can someone add 2 images from the new data on the data folder so that I can tests my code ?
(my download of the .tar folder still need 5 hours)

Thanks
@mingujo @ye-zhi @timothy1191xa @lizhua

Betas

Hello @soazig @lizhua @mingujo @ye-zhi

In the paper, the betas are extracted from this specific coordinates - Montreal Neurological Institute
coordinates (x, y, z): 3.6, 6.3, 3.9.

In the supplementary paper, they have coordinates such as -
B ventral striatum (3.6, 6.3, 3.9),
L inferior/middle frontal (-48.5, 24.7, 17.0),
R inferior frontal (50.2, 14.3, 7.6),
R inferior parietal (47.9, -45.6, 49.4).

Avoid pushing to common repo - make it read only

Hello @min, @liz, @timothy1191xa , @ye-zhi

I remembered and figures how to avoid to push to the common repo when we push our local branches.
You can edit the url of the upstream to make it read-only :
git remote set-url upstream git://github.com/berkeley-stat159/project-epsilon.git

Then, create pull request on github directly between your personal remote branch (e.g. soazig-tests) and the common remote branch (e.g. tests).

Thank you
Soazig

Travis - please create your test function as you go

Hello @mingujo @timothy1191xa @lizhua @ye-zhi

So last night I was able to create a branch that passed the travis. The travis doesn't pass now because we are missing the tests for our functions.
We really need to have the Travis work so please hold on your new analysis or code and go back to create your tests for the functions and clean up the master branch a bit.
Following the project alpha repo, I will reorganize our master repo and create the tests folder in code.
Please add your tests today, I spent more than 2 hours trying to fix stuff it is better to do it asap before adding any complexity to the repo.

I am creating a new script to dwnl the preprocessed data. I want to add it in the Makefile for data. @min, if I am done before you change the Makefile, I will change it directly to remove the unzip and also add the testing data (group alpha are testing their code on the data provided in class ds114.

I will create a branch Tests to create all tests
A branch Data where I will add the dwl for the preprocessed data
A branch Travis to fix stuff relative to the piazza discussion

Please watch the Test branch and push your tests asap.

Thank you
Soazig

Question regarding the linear regression on a voxel

We generated new convolution data into txt files using the method "Using onsets that do not start on a TR" provided on Day 25. We tried to use them as beta predictors but since it has too many rows (2400) (dividing TR into 10 intervals), it does not fit with our image data (240 time steps).

How should we decrease the number of convolution data so we can do the linear regression on a voxel of an image data?
If we cannot decrease, is there any method for this issue?
@jarrodmillman

Coverage went down when someone added stimuli in the functions folder w/o tests

Hello guys,
@timothy1191xa @mingujo @lizhua @ye-zhi

Can whoever added the stimuli function can add tests for it to see if we can get back to 100%?
Thanks !

How do we create a test for the download of the filtered data ?

Hello @jarrodmillman @matthew-brett @rossbar

I created a python script that generates a bash script to be used in the Makefile to be able to download the filtered data and located them in the relevant directories.
I understand how we used this checksum txt (http://openfmri.s3.amazonaws.com/tarballs/ds005_raw_checksums.txt)
to test our data from open fMRI as we check the hashes.
How do I do the test without hashes ? Do I need to generate them ?

This will be helpful to create a test for the filtered data that we download. We also want to use the data from ds114 provided in class to test some functions. I could not find the relative hashes for that.

Thank you for your answers.
Soazig
@mingujo

No need to create txt files for drifts

Hello @lizhua

After reflection, I don't think we need to create a function or txt file for the drift because it is very simple. It is better to avoid unnecessary scripts - and I would not know how to test such functions anyway. You can find the drifts formula here: http://www.jarrodmillman.com/rcsds/lectures/modeling_noise.html

An example here:
convolved = np.loadtxt('ds114_sub009_t2r1_conv.txt')[4:]
X = np.ones((n_trs, 4))
X[:, 0] = convolved
linear_drift = np.linspace(-1, 1, n_trs)
X[:, 1] = linear_drift
quadratic_drift = linear_drift ** 2
quadratic_drift -= np.mean(quadratic_drift)
X[:, 2] = quadratic_drift

My part on pca will justify the need for the 2 regressors linear and quadratic drifts, you can just include them directly in your design matrix.

Required version of the package future ?

Hello @matthew-brett @jarrodmillman

So far, some of our build/coverage problem come from the compatibility between Python 2 and 3.
I was wondering if you recommend installing the package 'future'. In that case, I don't know which version I should put in the requirements ?
Now I am running version 0.15.2
Thank you,
Soazig

Make clean in data directory does not work

Hello @jarrodmillman ,

I wanted to make sure I was able to create progress report and clean the additional files after.
When I use make clean, the additional files are not removed from my directory.
However, when I use the command line rm -f *.{aux,log,bbl,lof,lot,blg,out} directly in bash, this works.
Maybe I am missing something, any idea on how to fix this bug ?
FYI make clean is working for the progress report pdf file in the slides directory.
Thank you !
Soazig

Presentation submission

Hello @jarrodmillman,

We run onto problems for the size of our images to be included in our presentation today
This is the latest version of the slides.

https://github.com/berkeley-stat159/project-epsilon/blob/master/slides/progress.pdf

We apologize for the late notice.
Thank you,
Soazig

berkeley-stat159 / project-epsilon Goto Github PK

project-epsilon's Issues

Recommend Projects

Recommend Topics

Recommend Org

Jobs