SUM is a new alignment file format supported by URMAP. It is much more compact than SAM; file sizes are typically 95 to 97% smaller for the same alignments. SUM is designed to enable generating alignments in any desired format, e.g. SAM, if the reference genome sequence is provided.
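To put the 95 to 97% figure in concrete terms, here is a minimal sketch of the arithmetic. The 10 GB SAM file size and the function name `estimated_sum_size` are illustrative assumptions, not measurements or part of URMAP.

```python
def estimated_sum_size(sam_size: float, reduction: float) -> float:
    """Estimate the SUM file size from a SAM file size and a fractional reduction."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return sam_size * (1.0 - reduction)


sam_gb = 10.0  # hypothetical SAM file size in GB (assumption for illustration)
low = estimated_sum_size(sam_gb, 0.97)   # 97% smaller
high = estimated_sum_size(sam_gb, 0.95)  # 95% smaller
print(f"Estimated SUM size: {low:.1f} to {high:.1f} GB")
```

Under these assumptions, a 10 GB SAM file would correspond to a SUM file of roughly 0.3 to 0.5 GB.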
To convert SUM to human-readable (BLAST-like) alignments, use the sum2aln command.
To convert SUM to SAM, use the sum2sam command. If SUM is converted to SAM using the reference genome alone, the SAM records have placeholder read labels and quality scores are not included (the QUAL field is set to '*'). If an analysis requires SAM files with original read labels and/or quality scores, they can be generated if the original FASTQ files are provided.
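Before running an analysis that depends on quality scores, it may be worth checking whether a SAM file actually contains them. The sketch below is not part of URMAP; it relies only on the standard SAM layout, in which QUAL is the eleventh tab-separated field and '*' means that qualities are not stored. The function names and the example records are hypothetical.

```python
def sam_records(lines):
    """Yield alignment records, skipping header lines (which start with '@') and blank lines."""
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue
        yield line


def has_quality_scores(sam_line: str) -> bool:
    """Return True if the QUAL field (11th column) holds qualities rather than '*'."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment record needs at least 11 tab-separated fields")
    return fields[10] != "*"


# Hypothetical records: the first has no qualities, the second has them.
example = [
    "@HD\tVN:1.6\n",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\t*\n",
    "read2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
flags = [has_quality_scores(record) for record in sam_records(example)]
print(flags)  # [False, True]
```

To check a real file, the same functions can be applied to an open file handle, e.g. `any(has_quality_scores(r) for r in sam_records(open(path)))`, where `path` is the SAM file to inspect.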
The SUM format is currently a proof of concept and should be considered subject to change.
There is currently no detailed description of the format.
If you are interested, please let me know and I'll write something up for you.