christear / apaiq_release

released version of APAIQ

License: MIT License

Python 100.00%

apaiq_release's Issues

What's the difference between model and regression_model?

The pre-trained model for human hg38 that you provided: https://drive.google.com/drive/folders/1KpT6Ajm5qqsGm_G-HMdwKBtmrDcDsi0E

The pre-trained regression_model for human hg38 that you provided: https://drive.google.com/drive/folders/1KkI4cc04hSb-foVjNcS253kRyj8EYaLL

Which file should be passed to --model?
Previously, I used model, not regression_model, for my human datasets.

Pre-trained model and annotation db_file for human hg38?

The pre-trained model and annotation db_file that you provided: https://drive.google.com/drive/folders/1KNj-dsh5hCmKI3dyhIsi_OuU6-3mLpBW

Are they prepared for the human genome hg38?

Thanks.

Quantification of PAS usage by APAIQ

How do I quantify PAS usage with APAIQ?

Quantification is very important for me; it is my purpose.

"python APAIQ/regression/evaluateRegression.v.2.py -h" showed:

usage: evaluateRegression.v.2.py [-h] --model MODEL --factor_path FACTOR_PATH
                                 [--input_file INPUT_FILE]
                                 [--input_plus INPUT_PLUS]
                                 [--input_minus INPUT_MINUS]
                                 [--pas_file PAS_FILE] [--out OUT]
                                 [--threshold THRESHOLD] [--depth DEPTH]
                                 [--window WINDOW] [--genome GENOME]

Evaluate each locus with RNAseq coverage exceed threshold and return prediction score.

optional arguments:
  -h, --help                  show this help message and exit
  --model MODEL               the model weights file
  --factor_path FACTOR_PATH   normalization file path
  --input_file INPUT_FILE     unstranded bedGraph file
  --input_plus INPUT_PLUS     plus strand bedGraph file
  --input_minus INPUT_MINUS   minus strand bedGraph file
  --pas_file PAS_FILE         pAs location file to be predicted expression level
  --out OUT                   output file path
  --threshold THRESHOLD       peak length lower than threshold will be fiter out
  --depth DEPTH               total number of mapped reads( in millions)
  --window WINDOW             input length
  --genome GENOME             assembly name of the genome. i.e. hg19, hg38, mm10

For FACTOR_PATH, should I change normalize_factor for my data? https://github.com/christear/APAIQ_release/tree/main/regression

Is PAS_FILE the output of apaiq?

Is --genome required?

Could you please write a wiki or detailed steps for apaiq? Thanks.
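
For readers with the same questions, here is a minimal sketch of how the command line could be assembled from the flags listed in the -h output above. It uses only flags that appear in that help text. In the usage line, --model and --factor_path are unbracketed (argparse treats them as required), while --genome and the others are bracketed (argparse does not enforce them); whether the script needs them internally is not stated. The helper name build_regression_command, all file names, and the assumption that --input_file is an alternative to the --input_plus/--input_minus pair are illustrative, not taken from the APAIQ documentation.

```python
def build_regression_command(model, factor_path, pas_file, out, depth,
                             input_file=None, input_plus=None, input_minus=None,
                             genome=None,
                             script="APAIQ/regression/evaluateRegression.v.2.py"):
    """Assemble an argument list using only flags shown in the -h output.

    Hypothetical helper: it does not run APAIQ, it only builds the command.
    """
    stranded = input_plus is not None and input_minus is not None
    # Assumption: unstranded and stranded inputs are mutually exclusive.
    if stranded == (input_file is not None):
        raise ValueError("give either input_file, or both input_plus and input_minus")

    cmd = ["python", script, "--model", model, "--factor_path", factor_path]
    if stranded:
        cmd += ["--input_plus", input_plus, "--input_minus", input_minus]
    else:
        cmd += ["--input_file", input_file]
    cmd += ["--pas_file", pas_file, "--out", out, "--depth", str(depth)]
    if genome is not None:
        cmd += ["--genome", genome]
    return cmd


# Placeholder file names; replace them with real paths before running anything.
example_cmd = build_regression_command(
    model="regression_model.ckpt",
    factor_path="normalize_factor",
    pas_file="sample.predicted.txt",
    out="sample.usage.txt",
    depth=140,
    input_file="sample.bedGraph",
    genome="hg38",
)
print(" ".join(example_cmd))
```

The resulting list can be passed to subprocess.run once the placeholder paths point at real files.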

--DB_file for polyA database file

How can I download the polyA database file?
Thanks.

concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

I have spent two days installing APAIQ and am exhausted. After it finally installed successfully, running it failed with "concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.". The command was:

python APAIQ.v.1.2.py \
      --input_file=RNAseq.depth.bedGraph \
      --out_dir=${output} \
      --fa_file=${ref} \
      --name=${group[0]} \
      --model snu398_model.ckpt \
      --t 30

Can anyone help solve this? I would sincerely appreciate it. Thanks!

How to get the pre-trained model and annotation db_file for mouse mm39?

You only provided them for human hg38 (#5).

Thanks.

apaiq fails to run

Hello developer of Apaiq,

I tried to run APAIQ using human RNA-seq data as follows:

source  ~/miniconda3/etc/profile.d/conda.sh
conda activate apaiq_env

set -euo pipefail

in=results/010.bedtools.bedgraph.out
bedgraph=(`ls $in/SRR15734*.bedgraph`)
sample=(`ls $in/SRR15734*.bedgraph | perl -p -e 's{.+/(.+?).bedgraph}{$1}'`)
genome_fasta=~/mccb-umw/shared/genome/human_gencode_v34/GRCh38.primary_assembly.genome.fa
depth=(140 143 147 139)

model=bin/APAIQ_release/model/snu398_model.ckpt
out=results/027.APAIQ.out/${sample[$i]}
mkdir -p $out

apaiq --out_dir  $out  \
      --input_file   ${bedgraph[$i]} \
      --fa_file     $genome_fasta \
      --model  $model \
      --t 8 \
      --depth  ${depth[$i]} \
      --name   ${sample[$i]}

But I got many warnings and an error message:
**WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.momentum
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-0.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-0.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-1.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-1.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-2.gamma
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-2.beta
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-3.gamma
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-3.beta
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-4.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-4.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-5.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-5.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-6.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'momentum' for (root).layer_with_weights-6.bias
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
Traceback (most recent call last):
  File "/home/haibo.liu-umw/miniconda3/envs/apaiq_env/bin/apaiq", line 33, in <module>
    sys.exit(load_entry_point('apaiq==1.0.3', 'console_scripts', 'apaiq')())
  File "/home/haibo.liu-umw/miniconda3/envs/apaiq_env/lib/python3.7/site-packages/apaiq-1.0.3-py3.7.egg/apaiq/APAIQ_v1.py", line 163, in main
  File "/home/haibo.liu-umw/miniconda3/envs/apaiq_env/lib/python3.7/site-packages/apaiq-1.0.3-py3.7.egg/apaiq/postScan.py", line 83, in annotatePAS
NameError: name 'sub_pas' is not defined
**

I think the error occurred at this line: https://github.com/christear/APAIQ_release/blob/main/src_v2/postScan.py#L83. Could you take a look?

Thank you,

Haibo

How to understand the output files

Could you please give more details about the input and output files?
Thanks.

Apply APAIQ to Plants

Feature (2L:22538087-23513798) beyond the length of 2L size (23513712 bp).

I tried to identify polyA sites in 12 RNA-seq samples using the apaiq command, but 5 of them failed. The failed samples raised the error "Feature (2L:22538087-23513798) beyond the length of 2L size (23513712 bp)". The command was:

${apaiq} \
      --input_minus=Control_1_3Signal.Unique.str1.out.bg \
      --input_plus=Control_1_3Signal.Unique.str2.out.bg \
      --out_dir=${temp} \
      --fa_file=${ref} \
      --name=Control_1_3 \
      --model ${model}/snu398_model.ckpt \
      --t 10

I then checked the coordinate ranges in the bedGraph files (Control_1_3Signal.Unique.str1.out.bg and Control_1_3Signal.Unique.str2.out.bg), which were generated by STAR with the options '--outWigType bedGraph --outWigNorm RPM'. There is no feature 2L:22538087-23513798 in the bedGraph files, and no feature in them extends beyond the length of 2L (23513712 bp). I am really confused about what happened. Could the author explain? Thank you very much.
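
The manual check described above can be made repeatable with a short script. The sketch below compares every bedGraph interval against chromosome lengths read from a samtools-style .fai index of the reference FASTA. The function names and the in-memory demo data are hypothetical; only the 2L length (23513712 bp) and the reported end coordinate (23513798, which is 86 bp past the chromosome end) come from the error message. If the check finds nothing in the real files, the out-of-range interval was presumably built downstream of the bedGraph input, which is an assumption rather than something confirmed by the APAIQ source.

```python
import io


def read_chrom_sizes(fai_handle):
    """Parse a samtools-style .fai index: name<TAB>length<TAB>..."""
    sizes = {}
    for line in fai_handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            sizes[fields[0]] = int(fields[1])
    return sizes


def out_of_bounds(bedgraph_handle, sizes):
    """Yield (chrom, start, end, reason) for rows that do not fit the reference."""
    for line in bedgraph_handle:
        # Skip blank lines and optional bedGraph header lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end = line.split()[:3]
        start, end = int(start), int(end)
        if chrom not in sizes:
            yield chrom, start, end, "unknown chromosome"
        elif end > sizes[chrom]:
            yield chrom, start, end, "end beyond chromosome length"


# Demo with made-up rows; real use would open the .fai and .bg files instead.
demo_fai = io.StringIO("2L\t23513712\t4\t60\t61\n")
demo_bedgraph = io.StringIO(
    "2L\t0\t100\t1.5\n"
    "2L\t23513700\t23513798\t0.2\n"
    "X\t0\t10\t1.0\n"
)
problems = list(out_of_bounds(demo_bedgraph, read_chrom_sizes(demo_fai)))
for chrom, start, end, reason in problems:
    print(f"{chrom}:{start}-{end}\t{reason}")
```

Running the same check on the samples that succeeded and on those that failed may show whether the difference lies in the input coverage near the end of 2L.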

How to filter PAS with low counts?

How many raw reads are required to support a reliable PAS, and which parameter controls this?
A PAS should be discarded if it is supported by only a few reads (the number of raw reads, not the PAS score).

The input is bedGraph, not BAM, so is there no way to know the number of raw reads?

Thanks.
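
One possible workaround, sketched below, applies when the bedGraph is RPM-normalized (for example, produced by STAR with '--outWigNorm RPM', as in another issue on this page). In that case, multiplying a coverage value by the sequencing depth in millions of mapped reads gives an approximate raw coverage. This is not an APAIQ feature: the function names, the min_reads cutoff, and the demo rows are hypothetical, and the result is only as accurate as the depth value, which should be the same read total that the normalization used.

```python
def approx_raw_coverage(rpm_value, depth_millions):
    """Convert reads-per-million coverage back to approximate raw coverage."""
    return rpm_value * depth_millions


def filter_low_coverage(rows, depth_millions, min_reads):
    """Keep (chrom, start, end, rpm) rows whose approximate raw coverage
    is at least min_reads (a user-chosen cutoff, not an APAIQ parameter)."""
    kept = []
    for chrom, start, end, rpm in rows:
        if approx_raw_coverage(rpm, depth_millions) >= min_reads:
            kept.append((chrom, start, end, rpm))
    return kept


# Made-up coverage rows for a sample with 140 million mapped reads.
demo_rows = [
    ("chr1", 1000, 1050, 0.02),  # about 2.8 reads
    ("chr1", 2000, 2050, 0.10),  # about 14 reads
]
kept_rows = filter_low_coverage(demo_rows, depth_millions=140, min_reads=5)
print(kept_rows)
```

Coverage at a position counts reads overlapping that position, so this approximates read support near a PAS rather than giving an exact count of distinct reads.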
