Preliminary Results

The results below are preliminary; they will stay here until the preprint is released.

Algorithm outline

At a high level, myloasm uses a string graph approach. Briefly, myloasm:

  1. aligns reads-to-reads, splits chimeric reads, and estimates coverage per read
  2. uses polymorphic, strain-specific k-mers (we call them SNPmers) to find true sequence divergence between reads
  3. obtains a high-resolution overlap graph and finds walks (contigs) with consistent coverage using an annealing-inspired optimization approach
  4. aligns reads-to-contigs and polishes using partial order alignment (SPOA)
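To make step 2 more concrete, here is a minimal sketch of how polymorphic k-mers could be found and used to compare two reads. It is not myloasm's implementation. It assumes, purely for illustration, that a SNPmer is a group of k-mers sharing the same flanks but differing at the middle base; the function names (`find_snpmers`, `snpmer_alleles`, `compare_reads`), the parameters `k` and `min_count`, and the toy reads are all hypothetical. Reverse complements and read positions are ignored to keep the example short.

```python
from collections import defaultdict


def split_kmers(read, k):
    """Yield ((left_flank, right_flank), middle_base) for every k-mer in the read."""
    half = k // 2
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        yield (kmer[:half], kmer[half + 1:]), kmer[half]


def find_snpmers(reads, k=5, min_count=2):
    """Return {flanks: {middle_base: count}} for flanks seen with at least two
    different middle bases, each supported by at least min_count occurrences."""
    if k % 2 == 0:
        raise ValueError("k must be odd so that there is a single middle base")
    counts = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for flanks, middle in split_kmers(read, k):
            counts[flanks][middle] += 1
    snpmers = {}
    for flanks, middles in counts.items():
        # Requiring min_count support is a crude guard against sequencing errors.
        supported = {base: n for base, n in middles.items() if n >= min_count}
        if len(supported) >= 2:
            snpmers[flanks] = supported
    return snpmers


def snpmer_alleles(read, snpmers, k=5):
    """Map each SNPmer present in the read to the middle base the read carries."""
    alleles = {}
    for flanks, middle in split_kmers(read, k):
        if flanks in snpmers and middle in snpmers[flanks]:
            alleles[flanks] = middle
    return alleles


def compare_reads(read_a, read_b, snpmers, k=5):
    """Return (matches, mismatches) over the SNPmers shared by the two reads."""
    alleles_a = snpmer_alleles(read_a, snpmers, k)
    alleles_b = snpmer_alleles(read_b, snpmers, k)
    shared = alleles_a.keys() & alleles_b.keys()
    matches = sum(alleles_a[f] == alleles_b[f] for f in shared)
    return matches, len(shared) - matches


# Toy reads from two hypothetical strains that differ at a single base.
strain_a = "AACCGGTTAGC"
strain_b = "AACCGTTTAGC"
reads = [strain_a, strain_a, strain_b, strain_b]

snpmers = find_snpmers(reads)
same_strain = compare_reads(strain_a, strain_a, snpmers)
cross_strain = compare_reads(strain_a, strain_b, snpmers)
```

In this toy input, the two reads from the same strain agree on every shared SNPmer, while the cross-strain pair disagrees at the one polymorphic site. A mismatch at a well-supported SNPmer suggests real divergence between the reads rather than sequencing error.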

Results

Fig. 1. Metagenome-assembled genome (MAG) recovery for Nanopore R10.4 simplex + PacBio HiFi sequencing datasets. Alignment + binning was done with minimap2 and SemiBin2; results are evaluated with CheckM2.

Fig. 2. Completeness + contamination results from CheckM2 on individual contigs for the different assemblers. Dashed line indicates circular complete MAGs (> 90% complete, < 5% contaminated and circular).
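The thresholds in the Fig. 2 caption can be written as a small predicate. This is only a sketch of the stated criteria (> 90% complete, < 5% contaminated and circular); the function name, the record layout, and the example values are made up for illustration and are not CheckM2 output.

```python
def is_circular_complete_mag(completeness, contamination, circular):
    """Apply the caption's criteria: circular, > 90% complete, < 5% contaminated.

    completeness and contamination are percentages (0-100).
    """
    return bool(circular) and completeness > 90.0 and contamination < 5.0


# Hypothetical per-contig records: (name, completeness %, contamination %, circular)
contigs = [
    ("contig_1", 98.7, 0.4, True),
    ("contig_2", 95.2, 1.1, False),  # complete enough, but not circular
    ("contig_3", 88.0, 0.2, True),   # circular, but below 90% completeness
    ("contig_4", 99.1, 6.3, True),   # circular, but too contaminated
]

passing = [name for name, comp, cont, circ in contigs
           if is_circular_complete_mag(comp, cont, circ)]
```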

Dataset accessions

Dataset (ONT R10.4)        Accession      Dataset (HiFi)                        Accession
Oral 1 (Kiguchi et al.)    DRR582205      Hot Spring (Kato et al.)              DRR290133
Oral 2 (Kiguchi et al.)    DRR582179      Anaerobic Digester (Benoit et al.)    ERR10905741
Gut 1 (Minich et al.)      SRR29980972    Chicken Gut (Zhang et al.)            SRR19683891
Gut 2 (Minich et al.)      SRR29980959    Human Gut (Gehrig et al.)             SRR15489018
Gut 3 (Minich et al.)      SRR29980980    P. Seawater (Priest et al.)           ERR4920901
-                          -              S. Seawater (Sidhu et al.)            ERR9769281