Computational workflows and tools

We develop computational workflows and tools for analyzing multi-omics, amplicon, and genomic data. Click the badges below to learn more about each workflow and tool.

Viruses

HMPV IRMA module GitHub

A module for human metapneumovirus (HMPV) to be used with IRMA. The consensus sequences are derived from all whole genomes available on GenBank on October 18, 2024.

RSV amplicon sequencing GitHub Paper

A protocol for RSV-A and RSV-B sequencing that consists of two pools of overlapping amplicons. It is designed to work with current ARTIC SARS-CoV-2 protocols.

COVID-19 Sequence Assembly GitHub

An R-based workflow for assembling SARS-CoV-2 genome sequences from Nanopore or Illumina sequence data.

Sequencing bioinformatic pipelines GitHub

Scripts and reference files needed to generate consensus sequences for influenza, SARS-CoV-2, and RSV.

RVTN Within Host Evolution of SARS-CoV-2 GitHub Paper

This repository contains code and small intermediate data files associated with the manuscript “In depth sequencing of a serially sampled household cohort reveals the within-host dynamics of Omicron SARS-CoV-2 and rare selection of novel spike variants”.

RSV Evolution Neutralization Project GitHub Paper

This repository contains the data for Simonich et al. (2025), a study on RSV evolution from the Bloom lab.

model-coinfection GitHub Paper

Code for simulations published in “The fitness consequences of coinfection and reassortment among segmented viruses depend upon viral genetic structure”.

Bacteria

QCD GitHub

Snakemake workflow for quality control and contamination detection of microbial Illumina whole-genome sequencing (WGS) data.

nanoQC GitHub

Snakemake workflow for quality control and assembly of Nanopore (long-read) sequencing data.

Nanosake GitHub

Snakemake workflow for hybrid assembly using cleaned Illumina short reads and Nanopore long reads.

SNPkit GitHub

Snakemake workflow for microbial variant calling.

pubQCD GitHub

Snakemake workflow for quality control of datasets downloaded from public repositories such as NCBI and SRA.

Data-Flow-SOP GitHub

Collection of standard operating procedures (SOPs) for processing short-read, long-read, and hybrid bacterial sequencing data.

Phylokit GitHub

Snakemake workflow for recombination detection and phylogenetic tree reconstruction from a sequence alignment.

Fungi

funQCD GitHub

Snakemake pipeline for de novo assembly and annotation of C. auris short-read sequencing data.

nanofunQC GitHub

Snakemake pipeline for de novo assembly and annotation of C. auris long-read sequencing data.

nanofunsake GitHub

Snakemake pipeline for de novo hybrid assembly and annotation, using a combination of C. auris long-read and short-read data.

cauris-data-flow GitHub

Set of scripts to standardize the output of multiple runs of the above C. auris assembly pipelines.

cauris_dotplot GitHub

Tool to generate large numbers of synteny dotplots, to aid in visualizing structural variation across assemblies.

Technology & Data Core

phyloAMR GitHub

An R package that leverages ancestral state reconstruction and other phylogenetically-informed algorithms to characterize the evolutionary history of genome-influenced traits and investigate phenotype-genotype associations.

corHMM GitHub

New functionality, added to this phylogenetics package as part of MIDGE, that incorporates uncertainty into joint ancestral state reconstruction analysis.