This workflow takes an existing set of bins from a metagenomic analysis, recruits unbinned contigs back into bins, removes contigs that contaminate bins, and eventually merges or splits bins using a combination of de novo and reference-based methods.
Starting point:
- Raw assembly file(s)
- Bin set
- Bin identification file
  - This is a tab-delimited file with two columns: bin_name and contig_name
  - The bin_name should be in the format: "Bin.[number]"
  - The contig_name should be in the format: ">[contig_name]"
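
The bin identification file format described above can be checked before running anything else. The following is a minimal sketch, not part of the workflow itself: the function name `parse_bin_file` is hypothetical, and it assumes that an optional header line reading `bin_name<TAB>contig_name` may be present and should be skipped.

```python
import re

# "Bin.[number]" as described above, e.g. "Bin.12"
BIN_NAME_RE = re.compile(r"^Bin\.\d+$")


def parse_bin_file(lines):
    """Validate bin identification lines and group contigs by bin.

    `lines` is any iterable of strings (an open file handle works).
    Returns a dict mapping bin_name -> list of contig_name values.
    Raises ValueError on the first malformed line.
    """
    bins = {}
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue  # tolerate blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected 2 tab-delimited columns")
        bin_name, contig_name = fields
        # Assumption: a header row may or may not be present; skip it if it is.
        if lineno == 1 and (bin_name, contig_name) == ("bin_name", "contig_name"):
            continue
        if not BIN_NAME_RE.match(bin_name):
            raise ValueError(f"line {lineno}: bin_name {bin_name!r} is not 'Bin.[number]'")
        if not contig_name.startswith(">") or len(contig_name) == 1:
            raise ValueError(f"line {lineno}: contig_name {contig_name!r} is not '>[contig_name]'")
        bins.setdefault(bin_name, []).append(contig_name)
    return bins
```

To use it on a file, pass an open handle, for example `with open(path) as fh: bins = parse_bin_file(fh)`, where `path` points at your own bin identification file.
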
a. Shared taxonomy
b. ANI patterns across samples
c. Non-overlapping gene content