The biobankread-bash from saphir746

Scaling, hierarchical tree parsing, general questions

Hi there! Thanks so much for the awesome package. I was in the process of writing my own phenotype parser when I found out about this; it saved me a lot of time and also gave me guidance for my own specific use cases.

I have some questions about the package:

1. Can you comment on how well the package scales? I see that it mostly uses numpy and pandas, which I assume load all the data into memory. Will this be a problem when the queried dataframe is very large (many phenotypes at a time), or when UKB adds more phenotypes and participants? Have you seen cases where the package takes a performance hit or runs out of memory?
2. Is there currently any functionality that, for a hierarchical categorical attribute, grabs all the levels below a specific level? For example, if I enter "White" for ethnic background, it would return everyone who is "White", "British", "Irish", or "Other white background".
3. Do you have a way of saving newly created complex phenotype definitions and/or filters for quick reference later and for reproducibility?
4. I see that you parse the HTML file for field-related information. Aside from the HTML being specific to a UKB data access application, is there a reason why the data dictionary CSV was not used? I'm currently using it and wonder whether you avoided it for a specific reason.
5. Are you currently working on expanding the documentation and use cases? I'd be more than happy to document and write up my use of the package as an example for other people.
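
To make the hierarchical question concrete, here is a minimal sketch of the behaviour being asked about. It is not taken from the package: the `PARENT` mapping, the `descendants` helper, and the sample records are all hypothetical, and only the "White" labels come from the example in the question.

```python
from collections import defaultdict

# Hypothetical child -> parent mapping for one hierarchical categorical field.
# The "White" branch uses the labels from the question; "Mixed" is only here
# to show a level that should be excluded.
PARENT = {
    "White": None,
    "British": "White",
    "Irish": "White",
    "Other white background": "White",
    "Mixed": None,
}


def descendants(parent_of, root):
    """Return root plus every level below it, found by walking the tree."""
    children = defaultdict(list)
    for node, parent in parent_of.items():
        if parent is not None:
            children[parent].append(node)
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(children[node])
    return found


# Made-up (participant id, label) records, for illustration only.
records = [("a", "British"), ("b", "Mixed"), ("c", "White"), ("d", "Irish")]

white_levels = set(descendants(PARENT, "White"))
selected = [pid for pid, label in records if label in white_levels]
print(selected)
```

Expanding the parent level into a flat set first keeps the final filter a simple membership test, which would carry over to a pandas `isin` call on a real dataframe.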

Thanks!
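
On the memory question, one generic way to avoid holding a whole phenotype file in memory is to stream it row by row and keep only the columns needed. The sketch below uses the standard-library `csv` module rather than pandas and is not how the package works; `stream_columns`, the column names `f1` and `f2`, and the sample values are all assumptions made for illustration.

```python
import csv
import io


def stream_columns(handle, wanted):
    """Yield one dict per row, holding only the wanted columns."""
    reader = csv.DictReader(handle)
    missing = [c for c in wanted if c not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    for row in reader:
        # Only the requested fields are kept; the rest of the row is dropped.
        yield {c: row[c] for c in wanted}


# Tiny in-memory stand-in for a large phenotype CSV (made-up values).
SAMPLE = io.StringIO("eid,f1,f2\n1,a,10\n2,b,20\n")

# Collected into a list here only to show the result; on a real file you
# would aggregate or write out each row as it arrives.
rows = list(stream_columns(SAMPLE, ["eid", "f2"]))
print(rows)
```

With pandas, the `usecols` and `chunksize` arguments of `read_csv` serve a similar purpose.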

installation instructions, release, pip package

Hello!
Looks great! You may want to provide installation instructions, make it pip-installable, and add a release on GitHub.
It already has a setup.py, so it might be that the instructions are just incomplete at the moment?
Best,
Antonio

ImportError: UKBr could not be loaded properly

Hi! Thank you so much for sharing! This is super helpful!

I am very new to git and the command line. I have tried to use BiobankRead, but I keep getting this error message:

Traceback (most recent call last):
  File "extract_variables.py", line 406, in <module>
    raise ImportError('UKBr could not be loaded properly')
ImportError: UKBr could not be loaded properly

It would be great if you could help me with it! Much appreciated!
Hannah
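
The traceback above shows only the final message, not what actually failed while loading. A minimal sketch for surfacing the underlying error is below. It assumes the failure is an import that does not succeed in the current environment; how `extract_variables.py` actually loads UKBr is not shown in the traceback, and `diagnose_import` is a hypothetical helper, not part of the package.

```python
import importlib
import traceback


def diagnose_import(module_name):
    """Return None if the module imports cleanly, else the full traceback text."""
    try:
        importlib.import_module(module_name)
    except Exception:
        # Broader than ImportError: a module can also fail while it executes.
        return traceback.format_exc()
    return None


# numpy and pandas are the dependencies mentioned in the first issue above;
# add the package's own module name once you know what it is called.
for name in ("numpy", "pandas"):
    problem = diagnose_import(name)
    print(name, "OK" if problem is None else "FAILED\n" + problem)
```

Running this with the same Python interpreter used for `extract_variables.py` should show whether a missing dependency is the cause.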
