Presence/absence is especially dubious with amplicon sequencing, so presence/absence metrics such as richness and unweighted UniFrac should generally be avoided. Small counts are less reliable than high counts because they can be entirely spurious, arising from cross-talk or spurious OTUs. It is therefore generally better to use metrics that put lower weight on low-frequency OTUs, such as Shannon entropy or the Simpson index. With Simpson, the value may be dominated by a single high-frequency OTU, which is suspect because the high abundance may be due to bias (e.g., because that species has 10 copies of the 16S gene while the next few have only one or two). Conversely, the absence of a high-abundance OTU may also be an artifact (e.g., because the sample is in fact dominated by a single species, but that species has two primer mismatches and so is strongly suppressed by PCR). I generally recommend Shannon entropy as a broad indicator of changes in alpha diversity, though differences can be hard to interpret even without biases. A difference in entropy can be due to a change in the number of OTUs, in the shape of the frequency distribution, or both. The differences can be better understood by visualizing the frequency distributions of each group (there will be new features supporting this in usearch v11).
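The effect of spurious low counts on these metrics can be illustrated with a minimal Python sketch. The function names and the example counts below are hypothetical and are not taken from usearch. The Simpson index is assumed here to be the sum of squared frequencies; some tools report one minus that value or its reciprocal instead.

```python
import math

def frequencies(counts):
    """Convert raw OTU counts to relative frequencies, skipping zero counts."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]

def richness(counts):
    """Presence/absence metric: the number of OTUs with a non-zero count."""
    return sum(1 for c in counts if c > 0)

def shannon_entropy(counts, base=math.e):
    """Shannon entropy, H = -sum(p * log(p)); natural log by default."""
    return -sum(p * math.log(p, base) for p in frequencies(counts))

def simpson(counts):
    """Simpson index, assumed here to be sum(p^2)."""
    return sum(p * p for p in frequencies(counts))

# Hypothetical sample with three well-supported OTUs.
clean = [500, 300, 200]
# The same sample with five singleton OTUs added, standing in for
# cross-talk or spurious OTUs.
noisy = clean + [1] * 5

for name, metric in [("richness", richness),
                     ("shannon", shannon_entropy),
                     ("simpson", simpson)]:
    before, after = metric(clean), metric(noisy)
    change = 100.0 * abs(after - before) / before
    print(f"{name:9s} clean={before:.4f} noisy={after:.4f} change={change:.1f}%")
```

In this toy case the five singletons more than double richness, while Shannon entropy and Simpson shift by only a few percent, which is the sense in which those metrics put lower weight on low-frequency OTUs.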
The UniFrac authors' stated justification for its tree-based design is that OTUs with higher sequence similarity should be treated as more similar when comparing samples because they tend to fill similar ecological niches. I don't know whether this is a valid argument from the point of view of microbiology; if anyone can point me to evidence either way, please let me know. With OTUs generated by QIIME, which are generally very noisy, it is essential to use UniFrac to suppress differences that are due to noise. However, with the much more accurate OTUs generated by UPARSE or UNOISE, I believe that UniFrac is generally a bad choice of beta diversity metric because it has very low sensitivity to differences in OTUs with high sequence similarity. Such differences are surely biologically significant even if the OTUs fill similar niches. My recommendation is generally to use weighted Jaccard unless you have a good reason to use something else.
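As a rough illustration of why an abundance-weighted metric is less sensitive to spurious low-count OTUs than a presence/absence one, here is a minimal Python sketch. It assumes the common definition of weighted Jaccard distance, one minus the sum of per-OTU minimum frequencies divided by the sum of per-OTU maximum frequencies; the exact formula used by a given tool may differ. The function names and sample counts are hypothetical.

```python
def to_frequencies(sample):
    """Convert an {OTU name: count} dict to relative frequencies."""
    total = sum(sample.values())
    return {otu: n / total for otu, n in sample.items() if n > 0}

def binary_jaccard_distance(a, b):
    """Presence/absence Jaccard distance: 1 - |A and B| / |A or B|."""
    otus_a = {otu for otu, n in a.items() if n > 0}
    otus_b = {otu for otu, n in b.items() if n > 0}
    union = otus_a | otus_b
    if not union:
        return 0.0
    return 1.0 - len(otus_a & otus_b) / len(union)

def weighted_jaccard_distance(a, b):
    """Weighted Jaccard distance on frequencies: 1 - sum(min) / sum(max)."""
    fa, fb = to_frequencies(a), to_frequencies(b)
    otus = set(fa) | set(fb)
    shared = sum(min(fa.get(o, 0.0), fb.get(o, 0.0)) for o in otus)
    combined = sum(max(fa.get(o, 0.0), fb.get(o, 0.0)) for o in otus)
    if combined == 0:
        return 0.0
    return 1.0 - shared / combined

# Two hypothetical samples with identical abundant OTUs; sample_b also has
# two singleton OTUs standing in for cross-talk or spurious OTUs.
sample_a = {"Otu1": 500, "Otu2": 300, "Otu3": 200}
sample_b = {"Otu1": 500, "Otu2": 300, "Otu3": 200, "Otu4": 1, "Otu5": 1}

print("binary Jaccard distance:  ", binary_jaccard_distance(sample_a, sample_b))
print("weighted Jaccard distance:", weighted_jaccard_distance(sample_a, sample_b))
```

With these counts the presence/absence distance is large even though the samples differ only by two singletons, while the weighted distance stays close to zero. Unlike UniFrac, neither function uses a tree, so two OTUs with high sequence similarity are still counted as distinct.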