breimanntools / aaanalysis

Python framework for interpretable protein prediction

Home Page: https://aaanalysis.readthedocs.io

License: BSD 3-Clause "New" or "Revised" License

Python 20.91% Jupyter Notebook 79.09%
explainability feature-engineering feature-selection machine-learning positive-unlabeled-learning protein-prediction intepretable-machine-learning intrepretability

aaanalysis's People

Contributors

breimanntools, freiherr5, stephanbreimann

Forkers

freiherr5

aaanalysis's Issues

load_scales fails to handle int64 columns

The provided demo code crashes because the scales returned by load_scales contain int64 columns, which the code cannot handle.
The crash occurs during CPP, but the int64 columns are created in load_scales. Converting them to Python ints (which can hold any int64 value) makes it work. I'd recommend either including this fix in your codebase or handling int64 directly. Thanks

import numpy as np
import aaanalysis as aa

aa.options["verbose"] = False
df_scales = aa.load_scales()
# Workaround: request a cast of every int64 column to Python int
df_scales = df_scales.astype({col: int for col in df_scales.select_dtypes(include='int64').columns})
# Diagnostics: collect columns whose dtype is not numeric
non_numeric_columns = df_scales.select_dtypes(exclude=[np.number]).columns.tolist()
dict_dtype = dict(df_scales.dtypes)
non_numeric_columns2 = [(col, dict_dtype[col]) for col in dict_dtype if dict_dtype[col] not in [np.number, int, float]]
df_seq = aa.load_dataset(name="DOM_GSEC", n=50)

DOM_GSEC.tsv

Documentation regarding dataset.

Hello, I was wondering whether you could provide some information regarding the DOM_GSEC dataset.

Your .tsv file contains the following entry:

Q14802 MQKVTLGLLVFLAGFPVLDANDLEDKNSPFYYDWHSLQVGGLICAGVLCAMGIIIVMSAKCKCKFGQKSGHHPGETPPLITPGSAQS 0 37 59 NSPFYYDWHS LQVGGLICAGVLCAMGIIIVMSA KCKCKFGQKS

or as an object:

entry Q14802
sequence MQKVTLGLLVFLAGFPVLDANDLEDKNSPFYYDWHSLQVGGLICAG...
label 0
tmd_start 37
tmd_stop 59
jmd_n NSPFYYDWHS
tmd LQVGGLICAGVLCAMGIIIVMSA
jmd_c KCKCKFGQKS

The problem is that, according to UniProt:

the extracellular domain (I believe this is jmd_n) is: NDLEDKNSPFYYDWHSLQ
the transmembrane domain is: VGGLICAGVLCAMGIIIVMSA
the cytoplasmic domain (I believe this is jmd_c) is: KCKCKFGQKSGHHPGETPPLITPGSAQS

and I can't reconcile the difference.
The transmembrane domain seems to correspond to the UniProt entry, but jmd_n seems to be an arbitrary substring of the extracellular domain, and jmd_c seems to be only the first third of the cytoplasmic domain. Could you explain why? Thanks!
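One reading that is consistent with the row quoted in this issue: tmd_start and tmd_stop are 1-based inclusive positions, and jmd_n and jmd_c are fixed windows of 10 residues directly flanking the TMD. The sketch below checks that reading against the quoted row only. The helper name `split_parts` and the window length of 10 are assumptions inferred from this single entry, not a confirmed description of how the dataset was built. It also shows that the TMD in the row starts two residues (LQ) before the UniProt segment quoted above.

```python
# Values copied from the DOM_GSEC row for Q14802 quoted in this issue.
sequence = (
    "MQKVTLGLLVFLAGFPVLDANDLEDKNSPFYYDWHSLQVGGLICAGVLCAMGIIIVMSA"
    "KCKCKFGQKSGHHPGETPPLITPGSAQS"
)
tmd_start, tmd_stop = 37, 59  # assumed to be 1-based, inclusive

JMD_LEN = 10  # assumption: inferred from the length of jmd_n and jmd_c in this row


def split_parts(seq, start, stop, jmd_len=JMD_LEN):
    """Hypothetical helper: slice JMD-N, TMD and JMD-C from a sequence."""
    tmd = seq[start - 1:stop]                            # residues start..stop
    jmd_n = seq[max(0, start - 1 - jmd_len):start - 1]   # window before the TMD
    jmd_c = seq[stop:stop + jmd_len]                     # window after the TMD
    return jmd_n, tmd, jmd_c


jmd_n, tmd, jmd_c = split_parts(sequence, tmd_start, tmd_stop)
print(jmd_n, tmd, jmd_c)

# 1-based start of the UniProt transmembrane segment quoted in the issue.
uniprot_tmd = "VGGLICAGVLCAMGIIIVMSA"
uniprot_tmd_start = sequence.find(uniprot_tmd) + 1
print(uniprot_tmd_start)
```

Under these assumptions the three slices reproduce the jmd_n, tmd and jmd_c columns of the row, which would explain why jmd_c covers only the first part of the cytoplasmic domain.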

Read the Docs build crash: KeyError: 'refid'

Check all references and try to reproduce locally the error seen in the Read the Docs build.
Error message:
"KeyError: 'refid'
Exception occurred:
File "/home/docs/checkouts/readthedocs.org/user_builds/aaanalysis/envs/latest/lib/python3.9/site-packages/docutils/nodes.py", line 652, in __getitem__
return self.attributes[key]
KeyError: 'refid'"