SAM files

The SAM format (see Wikipedia article) is widely used in genomics. The official specification is in these documents:

  SAM v1 format specification (PDF)
  SAM v1 tags specification (PDF)

To convert SAM to human-readable (BLAST-like) alignments, use the URMAP sam2aln command.

SAM is complicated and not well standardized, especially for paired reads, and each read mapper generates its own idiosyncratic version. The main variations are: the order in which paired alignments appear (some programs always write R1 first, while others change the order depending on whether R1 or R2 aligns to the plus strand of the reference); which flags are used (in particular which flags, if any, indicate which alignment is R1 and which is R2); and which optional tags are included. For example, the MD:Z string used for variant calling is supported by BWA and bowtie2 but not by minimap2, SNAP or FSVA.
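To see which of these variations a given SAM file uses, it can help to decode the FLAG field and list the optional tags of a few records. The following is a minimal Python sketch, not part of URMAP. The bit values are those defined in the SAM v1 specification; the names decode_flag and summarize_record are hypothetical helpers introduced here for illustration.

```python
# Flag bits as defined in the SAM v1 specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",    # R1
    0x80: "second_in_pair",   # R2
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def summarize_record(line):
    """Summarize one tab-separated SAM alignment line (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    tags = fields[11:]  # optional TAG:TYPE:VALUE fields follow the 11 mandatory ones
    return {
        "qname": fields[0],
        "flags": decode_flag(int(fields[1])),
        "has_md": any(tag.startswith("MD:Z:") for tag in tags),
    }
```

Running summarize_record on the first few alignment lines from two different mappers should show whether the R1/R2 bits (0x40, 0x80) are set and whether an MD:Z tag is present.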

My goal with URMAP is to generate SAM files which are compatible with as many analysis software packages as possible. If you run into a compatibility problem, please let me know and I'll do my best to fix it quickly.

URMAP SAM files (subject to change)
The order of reads in the SAM file matches the order in the FASTQ input only when -threads 1 is used.
With paired reads, a given pair always appears consecutively (R1 before R2) regardless of the number of threads.
The MD:Z tag is not supported (let me know if you need it).
Secondary alignments are never included.
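
The properties above can be checked on a paired-read SAM file with a short script. This is a rough Python sketch under several assumptions that are not stated on this page: every read belongs to a pair, each read has exactly one record, R1 and R2 are marked with the standard 0x40 and 0x80 flag bits, and mates share a query name apart from an optional /1 or /2 suffix. The function name check_pairs is hypothetical.

```python
import re


def check_pairs(sam_lines):
    """Return a list of problems found; an empty list means all checks passed.

    Assumes (not confirmed by the URMAP documentation) that R1/R2 are marked
    with flag bits 0x40/0x80 and that each read has exactly one record.
    """
    records = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        records.append((fields[0], int(fields[1])))

    problems = []
    for qname, flag in records:
        if flag & 0x100:  # 0x100 marks a secondary alignment
            problems.append(f"secondary alignment: {qname}")

    if len(records) % 2 != 0:
        problems.append("odd number of records")

    def base(name):
        # Strip an optional /1 or /2 mate suffix from the query name.
        return re.sub(r"/[12]$", "", name)

    # Walk the records two at a time: each pair should be R1 followed by R2.
    for i in range(0, len(records) - 1, 2):
        (name1, flag1), (name2, flag2) = records[i], records[i + 1]
        if base(name1) != base(name2):
            problems.append(f"mates not consecutive: {name1}, {name2}")
        elif not (flag1 & 0x40 and flag2 & 0x80):
            problems.append(f"pair not in R1, R2 order: {name1}")
    return problems
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example check_pairs(open(path)) with path pointing at a SAM file of your own.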