wtsi-hgi / automated-enhancer-gene-scrambler
A tool to help automate the Genome Scramble project
License: MIT License
apply_hard_filter() needs an elif branch, elif minimax == "min", with the else reserved for anything else and raising an error accordingly. You can imagine a situation in which neither min nor max is passed as an argument and the data gets erroneously manipulated by your else branch. This may not announce itself as an error until much later on, or it may never announce itself at all, which is of course bad and needs accounting for.
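A minimal sketch of the branch structure described above. The signature, the list-based data and the meaning given to "min" and "max" are assumptions for illustration only; the real apply_hard_filter() is not shown here.

```python
def apply_hard_filter(values, threshold, minimax):
    """Hypothetical stand-in for the real function.

    Assumes "max" treats threshold as an upper bound and "min" as a
    lower bound; the project's actual semantics may differ.
    """
    if minimax == "max":
        return [v for v in values if v <= threshold]
    elif minimax == "min":
        return [v for v in values if v >= threshold]
    else:
        # Anything other than "min" or "max" fails loudly here,
        # instead of silently falling into one of the filters.
        raise ValueError(f"minimax must be 'min' or 'max', got {minimax!r}")
```

With this shape, a typo such as "mx" stops the run at the call site rather than producing quietly wrong data.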
I don't think this works as intended.
This line is too convoluted; it should be broken down into separate, distinct steps.
(This is a super pedantic point that I learnt myself five minutes ago, but) the best (i.e. fastest/safest/most robust) way of checking whether x is true or false is to write if x and if not x, respectively. So, in your case, this should be rewritten as if not di.INTERFERRING_GENE_OVERLAPS, and all other true/false checks that are not written this way should be replaced.
Source: https://docs.python.org/2/library/stdtypes.html#truth-value-testing
Source: https://stackoverflow.com/questions/37103705/what-is-the-correct-way-to-check-for-false
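As a small illustration of the truth-value check, here is a sketch in which a function parameter stands in for di.INTERFERRING_GENE_OVERLAPS. The function name and return strings are hypothetical.

```python
def describe_overlaps(interfering_gene_overlaps):
    # Truth-value test: no comparison against True or False is needed.
    if not interfering_gene_overlaps:
        return "overlaps ignored"
    return "overlaps considered"
```

Note that if not x also treats None, 0 and empty containers as false, which is not the same as x == False. That is usually what you want for a flag, but worth keeping in mind.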
This function will currently just return genes if none of your if/elif conditions are satisfied. Is this deliberate? If not, an else condition needs to be included.
GitHub tracks everything, so leaving code commented out on the unlikely chance that you may want to come back to it just clutters the program unnecessarily.
There is too much going on in this line
So that it doesn't create the intermediary file
Not sure what 'pr' means in the variable names?
For example, you know what pyranges does and what it returns, so line 67, genes_pr = pr.PyRanges(gene_data), makes sense to you. But it does not to me; if the variables were named differently, it should hopefully make more sense!
Here you need to raise an exception, like raise Exception("Error:..."), instead of just printing.
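A sketch of the difference, using a hypothetical load_setting() helper and config dict that are not part of the project. A more specific exception class than Exception is used here, which is a design choice rather than a requirement.

```python
def load_setting(config, key):
    """Hypothetical helper: return config[key], or fail loudly."""
    if key not in config:
        # A print() here would let the program carry on with bad state;
        # raising stops execution at the point where the problem occurs.
        raise KeyError(f"Error: required setting {key!r} is missing")
    return config[key]
```

The caller can still catch the exception and recover if that is appropriate, which is not possible with a printed message.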
The function clean_regulatory_elements, on line 255 of data_initilisation.py, can be generalised.
It's no longer being used, so it should be deleted, and the comment in main.py should be deleted also.
Some of the text in Presentation.ipynb is incomplete, and in some cases completely wrong.
Not sure if I am reading this right: you would like to iterate through the dataframe a set number of times, so you use head() to count from the top of the dataframe, and you use ENHANCER_CONVOLUTION to define how many rows from the top you would like to include? If so, then you should rename ENHANCER_CONVOLUTION to something else, as this name does not signify what it is being used for.
I'm not sure that this function, which creates graphs, should be called in the middle of calculate_interest_score()
Functions should flow chronologically: in the order that they are called within the program. In this way, the user should be able to read the code from top to bottom, smoothly.
Constants such as these should be declared at the top of the program; see https://www.freecodecamp.org/news/clean-coding-for-beginners/ for more info.
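A minimal sketch of that layout. The constant names, their values and the top_scores() function are hypothetical and only illustrate where the declarations go.

```python
# Module-level constants, declared once at the top of the file.
SCORE_WEIGHT = 0.5  # hypothetical name and value
MAX_ROWS = 3        # hypothetical name and value


def top_scores(scores):
    """Return the MAX_ROWS highest scores, each scaled by SCORE_WEIGHT."""
    ranked = sorted(scores, reverse=True)
    return [score * SCORE_WEIGHT for score in ranked[:MAX_ROWS]]
```

Anyone tuning the program then has one place to look, rather than hunting for literals inside function bodies.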
This function does not need to be kept in comments: GitHub tracks everything, so if you really needed it (which you almost certainly won't), you could get it back.
I'm not sure this function belongs in the find_metrics.py file