christear / apaiq_release
released version of APAIQ
License: MIT License
Pre-trained model for human hg38 that you provided: https://drive.google.com/drive/folders/1KpT6Ajm5qqsGm_G-HMdwKBtmrDcDsi0E
Pre-trained regression_model for human hg38 that you provided: https://drive.google.com/drive/folders/1KkI4cc04hSb-foVjNcS253kRyj8EYaLL
Which file should be passed to --model?
Previously, I used model, not regression_model, for my human datasets.
Pre-trained model and annotation db_file that you provided: https://drive.google.com/drive/folders/1KNj-dsh5hCmKI3dyhIsi_OuU6-3mLpBW
They are prepared for the human genome hg38, right?
Thanks.
"python APAIQ/regression/evaluateRegression.v.2.py -h" showed:
usage: evaluateRegression.v.2.py [-h] --model MODEL --factor_path FACTOR_PATH
                                 [--input_file INPUT_FILE]
                                 [--input_plus INPUT_PLUS]
                                 [--input_minus INPUT_MINUS]
                                 [--pas_file PAS_FILE] [--out OUT]
                                 [--threshold THRESHOLD] [--depth DEPTH]
                                 [--window WINDOW] [--genome GENOME]

Evaluate each locus with RNAseq coverage exceed threshold and return prediction score.

optional arguments:
  -h, --help                 show this help message and exit
  --model MODEL              the model weights file
  --factor_path FACTOR_PATH  normalization file path
  --input_file INPUT_FILE    unstranded bedGraph file
  --input_plus INPUT_PLUS    plus strand bedGraph file
  --input_minus INPUT_MINUS  minus strand bedGraph file
  --pas_file PAS_FILE        pAs location file to be predicted expression level
  --out OUT                  output file path
  --threshold THRESHOLD      peak length lower than threshold will be fiter out
  --depth DEPTH              total number of mapped reads( in millions)
  --window WINDOW            input length
  --genome GENOME            assembly name of the genome. i.e. hg19, hg38, mm10
How to download the polyA database file?
Thanks.
I have spent two days installing APAIQ and am worn out. After it finally installed successfully, running it failed with this error: "concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.". The command is as follows:
python APAIQ.v.1.2.py \
  --input_file=RNAseq.depth.bedGraph \
  --out_dir=${output} \
  --fa_file=${ref} \
  --name=${group[0]} \
  --model snu398_model.ckpt \
  --t 30
Can anyone help me solve this? I would sincerely appreciate it. Thanks!
You only provided them for human hg38 #5
Thanks.
Hello developer of APAIQ,
I tried to run APAIQ using human RNA-seq data as follows:
source ~/miniconda3/etc/profile.d/conda.sh
conda activate apaiq_env
set -euo pipefail
in=results/010.bedtools.bedgraph.out
bedgraph=(`ls $in/SRR15734*.bedgraph`)
sample=(`ls $in/SRR15734*.bedgraph | perl -p -e 's{.+/(.+?).bedgraph}{$1}'`)
genome_fasta=~/mccb-umw/shared/genome/human_gencode_v34/GRCh38.primary_assembly.genome.fa
depth=(140 143 147 139)
model=bin/APAIQ_release/model/snu398_model.ckpt
out=results/027.APAIQ.out/${sample[$i]}
mkdir -p $out
apaiq --out_dir $out \
--input_file ${bedgraph[$i]} \
--fa_file $genome_fasta \
--model $model \
--t 8 \
--depth ${depth[$i]} \
--name ${sample[$i]}
But I got lots of warnings and an error message:
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.momentum
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-0.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-0.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-1.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-1.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-2.gamma
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-2.beta
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-3.gamma
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-3.beta
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-4.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-4.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-5.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-5.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-6.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-6.bias
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
Traceback (most recent call last):
  File "/home/haibo.liu-umw/miniconda3/envs/apaiq_env/bin/apaiq", line 33, in <module>
    sys.exit(load_entry_point('apaiq==1.0.3', 'console_scripts', 'apaiq')())
  File "/home/haibo.liu-umw/miniconda3/envs/apaiq_env/lib/python3.7/site-packages/apaiq-1.0.3-py3.7.egg/apaiq/APAIQ_v1.py", line 163, in main
  File "/home/haibo.liu-umw/miniconda3/envs/apaiq_env/lib/python3.7/site-packages/apaiq-1.0.3-py3.7.egg/apaiq/postScan.py", line 83, in annotatePAS
NameError: name 'sub_pas' is not defined
I think the error occurred in this line: https://github.com/christear/APAIQ_release/blob/main/src_v2/postScan.py#L83. Could you take a look?
Thank you,
Haibo
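The wording of the error above narrows down the kind of bug involved. A plain `NameError: name '...' is not defined` is what Python raises when a name is never assigned anywhere in the function and is also missing from the module's globals. A local that is assigned only on some code paths raises `UnboundLocalError` instead. The sketch below demonstrates the difference with two hypothetical functions; neither is taken from postScan.py, and only the variable name `sub_pas` is borrowed from the traceback.

```python
def never_assigned():
    # 'sub_pas' is not assigned anywhere in this function, so Python looks it
    # up as a global and raises a plain NameError when it is missing.
    return sub_pas


def conditionally_assigned(flag):
    # 'sub_pas' is a local here, but it is only bound when flag is true.
    if flag:
        sub_pas = []
    return sub_pas


def error_type(func, *args):
    """Return the name of the NameError subclass raised by func, or None."""
    try:
        func(*args)
    except NameError as exc:  # UnboundLocalError is a subclass of NameError
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print(error_type(never_assigned))
    print(error_type(conditionally_assigned, False))
    print(error_type(conditionally_assigned, True))
```

Under that reading, the reported message points toward a misspelled or missing variable in `annotatePAS` rather than a skipped branch, although only the source file can confirm it.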
Could you please give more details for input and output?
thanks.
I tried to identify polyA sites in 12 RNA-seq samples using the apaiq command, but 5 of them failed. The failed samples raised the error "Feature (2L:22538087-23513798) beyond the length of 2L size (23513712 bp)". The command is as follows:
${apaiq} \
  --input_minus=Control_1_3Signal.Unique.str1.out.bg \
  --input_plus=Control_1_3Signal.Unique.str2.out.bg \
  --out_dir=${temp} \
  --fa_file=${ref} \
  --name=Control_1_3 \
  --model ${model}/snu398_model.ckpt \
  --t 10
Then I checked the ranges in the bedGraph files (Control_1_3Signal.Unique.str1.out.bg and Control_1_3Signal.Unique.str2.out.bg) generated by STAR with the options '--outWigType bedGraph --outWigNorm RPM'. There is no feature (2L:22538087-23513798) in the bedGraph files, and no feature in them extends beyond the length of 2L (23513712 bp). I am really confused about what happened. Could the author explain? Thank you very much.
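The reported interval ends 86 bp past the end of 2L, and the bedGraph files do not contain it, so it was presumably derived from the input somewhere inside the pipeline. That is an assumption; the report does not show where the interval comes from. The sketch below only illustrates the bounds check itself: how to find intervals that overrun a chromosome and how to clamp them. The helper names `out_of_range` and `clamp` are hypothetical and not part of APAIQ.

```python
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end)


def out_of_range(intervals: Iterable[Interval],
                 chrom_sizes: Dict[str, int]) -> List[Interval]:
    """Return intervals that start below 0 or end past the chromosome length."""
    return [(c, s, e) for c, s, e in intervals
            if s < 0 or e > chrom_sizes[c]]


def clamp(interval: Interval, chrom_sizes: Dict[str, int]) -> Interval:
    """Trim an interval so that it lies within [0, chromosome length]."""
    chrom, start, end = interval
    return (chrom, max(0, start), min(end, chrom_sizes[chrom]))


# Values taken from the error message in the report above.
chrom_sizes = {"2L": 23513712}
reported = ("2L", 22538087, 23513798)

if __name__ == "__main__":
    print(out_of_range([reported], chrom_sizes))
    print(clamp(reported, chrom_sizes))
```

Running the same check on every interval in the bedGraph files, using sizes read from the reference FASTA index, would confirm whether the input is clean, as the report states.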
How many raw reads are required to support a reliable PAS, and which parameter controls this?
A PAS should be discarded if it is supported by only a few reads (the number of raw reads, not the PAS score).
The input is a bedGraph, not a BAM file, so is there no way to know the number of raw reads?
Thanks.
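If the bedGraph was normalized to reads per million (RPM), an approximate raw coverage can be recovered by multiplying each value by the number of mapped reads in millions, which is the same unit the `--depth` option is documented to take. The sketch below shows that arithmetic under the RPM assumption. The function names and the `min_reads` cutoff are hypothetical and are not APAIQ parameters. The result is per-base read coverage, not a count of distinct reads.

```python
def rpm_to_raw_coverage(rpm_value: float, mapped_reads_millions: float) -> float:
    """Convert an RPM-normalized coverage value back to approximate raw coverage."""
    return rpm_value * mapped_reads_millions


def passes_min_reads(rpm_value: float,
                     mapped_reads_millions: float,
                     min_reads: float) -> bool:
    """True if the approximate raw coverage reaches the (hypothetical) cutoff."""
    return rpm_to_raw_coverage(rpm_value, mapped_reads_millions) >= min_reads


if __name__ == "__main__":
    # Illustrative numbers only: 0.05 RPM in a library of 140 million mapped reads.
    print(rpm_to_raw_coverage(0.05, 140))
    print(passes_min_reads(0.05, 140, min_reads=5))
```

A bedGraph that already holds raw coverage needs no conversion, so the first step is to confirm how the file was generated.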