abstract |
The invention provides methods for identifying a microorganism by aligning sequence reads to a graph, such as a directed acyclic graph (DAG), that contains condensed sequence information of a conserved region from multiple known microorganisms. The DAG can be constructed by obtaining sequence information of known reference microorganisms. The DAG also includes the identities of the known microorganisms that correspond to particular paths. Sequence reads obtained from an unknown sample can thus be aligned to paths in the DAG using an alignment algorithm, and the identity of a microorganism in the sample can be determined based on the path in the DAG to which the sequence reads align best. |
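
The idea of path-labeled identification can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the source: the toy graph (`NODES`, `EDGES`), the organism labels in `PATHS`, the scoring parameters, and the helper names. As a simplification, the sketch spells out each labeled path as a linear sequence and scores reads against it with a plain Smith-Waterman local alignment, rather than aligning directly against the graph structure.

```python
from typing import Dict, List, Tuple

# Hypothetical toy graph: node id -> sequence fragment of a conserved region.
NODES: Dict[str, str] = {
    "n1": "ACGT",  # shared prefix
    "n2": "TTGA",  # variant carried by one organism
    "n3": "CCGA",  # variant carried by another organism
    "n4": "GGCA",  # shared suffix
}

# Directed edges; the graph has no cycles.
EDGES: Dict[str, List[str]] = {
    "n1": ["n2", "n3"],
    "n2": ["n4"],
    "n3": ["n4"],
    "n4": [],
}

# Identity annotation: the path each known organism follows (labels are made up).
PATHS: Dict[str, List[str]] = {
    "organism_A": ["n1", "n2", "n4"],
    "organism_B": ["n1", "n3", "n4"],
}


def path_sequence(path: List[str]) -> str:
    """Concatenate node sequences along a path, checking that each edge exists."""
    for src, dst in zip(path, path[1:]):
        if dst not in EDGES[src]:
            raise ValueError(f"no edge {src} -> {dst}")
    return "".join(NODES[node] for node in path)


def local_score(read: str, ref: str, match: int = 2,
                mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score of read against ref."""
    prev = [0] * (len(ref) + 1)
    best = 0
    for i in range(1, len(read) + 1):
        cur = [0]
        for j in range(1, len(ref) + 1):
            diag = prev[j - 1] + (match if read[i - 1] == ref[j - 1] else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


def identify(reads: List[str]) -> Tuple[str, Dict[str, int]]:
    """Sum each read's score per labeled path and return the best-scoring label."""
    totals = {
        name: sum(local_score(read, path_sequence(path)) for read in reads)
        for name, path in PATHS.items()
    }
    return max(totals, key=totals.get), totals


if __name__ == "__main__":
    best, totals = identify(["GTCCGAGG", "CCGAGGCA"])
    print(best, totals)
```

Both example reads span the `n3` branch, so the path labeled `organism_B` accumulates the higher total. Enumerating paths is only workable for a toy graph like this one; the number of paths grows quickly with the number of branch points, which is why aligning against the graph itself is preferable at scale.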