This script maps motifs across multiple sequences using Biopython's motifs package.
File Inputs:
- alignment (FASTA)
- TFBS Position Frequency Matrix
File Outputs:
- .csv file listing the TFBSs found, if any, at each position in the alignment.
Output data frame includes:
- position
- score
- sequence entry
- raw_position (from each sequence entry)
- strand (the strand on which the motif was found)
- motif_found (sequence motif at each position)
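The difference between position and raw_position can be illustrated with a small sketch. It assumes gaps are written as "-" and that both coordinates are 0-based; the helper names raw_position and alignment_column are hypothetical and are not taken from the script itself.

```python
# Hypothetical helpers relating alignment columns to ungapped (raw) positions.
# Assumptions: gaps are "-" and both coordinate systems are 0-based.

def raw_position(aligned_seq, column, gap="-"):
    # Count the non-gap residues that occur before the given alignment column.
    if not 0 <= column < len(aligned_seq):
        raise IndexError("column is outside the alignment")
    return sum(1 for ch in aligned_seq[:column] if ch != gap)


def alignment_column(aligned_seq, raw_pos, gap="-"):
    # Walk the aligned sequence until the raw_pos-th residue is reached.
    seen = 0
    for col, ch in enumerate(aligned_seq):
        if ch == gap:
            continue
        if seen == raw_pos:
            return col
        seen += 1
    raise IndexError("raw position is outside the sequence")


entry = "AC--GT-A"  # one aligned sequence entry containing gaps
col = alignment_column(entry, 2)   # column of the third residue (G)
back = raw_position(entry, col)    # round trip back to the raw position
```

Here the residue G sits at raw position 2 in the ungapped sequence but at column 4 of the alignment, which is why the two values are reported separately.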
The output file is saved in the directory from which the script was run.
Example output file: map_motif-alignment.fa-motif.fm.csv
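The example name suggests that the output file name is assembled from the script name and the two input file names. A minimal sketch of that pattern follows; the function output_name is hypothetical and the script may build the name differently.

```python
import os

# Hypothetical helper: rebuilds the naming pattern seen in the example,
# map_motif-<alignment file>-<motif file>.csv, in the current directory.

def output_name(alignment_path, motif_path, prefix="map_motif"):
    # Only the file names are used, so inputs may live in other directories.
    alignment_name = os.path.basename(alignment_path)
    motif_name = os.path.basename(motif_path)
    return "{}-{}-{}.csv".format(prefix, alignment_name, motif_name)


name = output_name("data/alignment.fa", "motifs/motif.fm")
full_path = os.path.join(os.getcwd(), name)  # saved where the script is run
```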
Arguments:
- alignment fasta file
- TFBS Position Frequency Matrix
- optional -threshold score cutoff (outputs only scores greater than the specified threshold)
Example usage: python map_motif.py alignment.fa motif.fm 3.2
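The argument handling described above could be sketched with argparse as follows. This assumes the threshold is given as an optional third positional value, as in the example command; the names build_parser and passes_threshold are hypothetical, and the script's real parsing may differ.

```python
import argparse

# Hypothetical sketch of the command line described above.
# Assumption: the threshold is an optional third positional argument.

def build_parser():
    parser = argparse.ArgumentParser(prog="map_motif.py")
    parser.add_argument("alignment", help="alignment FASTA file")
    parser.add_argument("motif", help="TFBS Position Frequency Matrix file")
    parser.add_argument("threshold", nargs="?", type=float, default=None,
                        help="optional score cutoff")
    return parser


def passes_threshold(score, threshold):
    # Keep only scores strictly greater than the threshold, if one was given.
    return threshold is None or score > threshold


args = build_parser().parse_args(["alignment.fa", "motif.fm", "3.2"])
no_cutoff = build_parser().parse_args(["alignment.fa", "motif.fm"])
```

With no threshold supplied, this sketch keeps every hit, which is one possible reading of "optional"; the script may instead apply a default cutoff.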