Bundibugyo ebolavirus analysis from the 2026 outbreak

This page presents a functional and phylogenetic analysis of the 2026 Bundibugyo ebolavirus genome sequences.

We applied Valthos's Cypress-1 model to analyze genome sequences from infected patients associated with the 2026 Bundibugyo ebolavirus outbreak.

Our key findings are:

  • Viral function scoring predicts that several amino acid variants in the 2026 sequences confer a substantial fitness advantage, which may facilitate spread. None of these variants is novel; each was observed previously in either the 2007 Uganda outbreak or the 2012 DRC outbreak.
  • Host compatibility scoring does not indicate that the 2026 outbreak sequences are substantially more human-adapted than the 2007 Uganda or 2012 DRC reference sequences.

Our approach:

  • Cypress-1 scoring: We use Cypress-1, our specialized protein language model adapted to biodefense workflows, to evaluate how the mutations in the outbreak sequences affect both viral function and host compatibility.
  • Reference comparison: To contextualize the model's scores, we compare the 2026 outbreak sequences against curated Bundibugyo ebolavirus references: the official reference sequence (RefSeq; from the 2007 Uganda outbreak) and a high-quality sequence from a 2012 DRC isolate.
  • Phylogenetics: We map the 2026 sequences onto a genome-wide Bundibugyo ebolavirus phylogeny. The three 2026 sequences form a clade distinct from the previously sampled 2007 Uganda and 2012 DRC clades, likely representing an independent spillover event.
  • Note: Cypress-1 scores are experimental and should be interpreted with caution while validation continues.
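
To make the reference comparison concrete, here is a minimal sketch of one way to list amino acid substitutions relative to a reference and check whether each has been seen in earlier sequences. It is not the pipeline used on this page and does not involve Cypress-1. The function names, the sequence labels, and the ten-residue toy sequences are all hypothetical placeholders, and the sketch assumes the protein sequences are already aligned to the same length.

```python
from typing import Dict, List, NamedTuple, Tuple


class Substitution(NamedTuple):
    position: int  # 1-based column in the alignment
    ref: str       # residue in the reference sequence
    alt: str       # residue in the query sequence

    def __str__(self) -> str:
        # Conventional short form, e.g. "A4S"
        return f"{self.ref}{self.position}{self.alt}"


def call_substitutions(reference: str, query: str) -> List[Substitution]:
    """List positions where an aligned query differs from the reference."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "X"}  # gaps and ambiguous residues are not called
    return [
        Substitution(i, r, q)
        for i, (r, q) in enumerate(zip(reference, query), start=1)
        if r != q and r not in skip and q not in skip
    ]


def split_by_novelty(
    reference: str, query: str, earlier: Dict[str, str]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Separate query substitutions into previously seen and novel.

    `earlier` maps a label to an earlier sequence aligned to `reference`.
    Returns (seen, novel): `seen` maps each substitution to the labels of
    the earlier sequences carrying the same residue at that position.
    """
    seen: Dict[str, List[str]] = {}
    novel: List[str] = []
    for sub in call_substitutions(reference, query):
        carriers = [
            label for label, seq in earlier.items()
            if seq[sub.position - 1] == sub.alt
        ]
        if carriers:
            seen[str(sub)] = carriers
        else:
            novel.append(str(sub))
    return seen, novel


# Toy, made-up sequences for illustration only (not real viral proteins).
toy_reference = "MKTAYLQDNR"
toy_earlier = {"earlier_a": "MKTSYLQDNR", "earlier_b": "MKTAYLQENR"}
toy_query = "MKTSYLQENK"

seen, novel = split_by_novelty(toy_reference, toy_query, toy_earlier)
print("previously seen:", seen)
print("novel:", novel)
```

With these toy inputs, A4S and D8E are reported as previously seen and R10K as novel. A real comparison would start from a proper alignment of the protein sequences and would need to handle insertions and deletions, which this sketch ignores.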

The 2026 outbreak sequences were accessed through Pathoplexus. Accession numbers and links to the original sequences are included below. We thank the submitting groups for sharing these data.