Denoising Amplicon Sequence Data Using USEARCH and UNOISE3

During the Introduction to Metagenomics Summer Workshop we discussed denoising amplicon sequence data to infer amplicon sequence variants and worked through Ben Callahan’s DADA2 tutorial. During that session, I mentioned several other approaches and algorithms for denoising or clustering amplicon sequence data, including UNOISE3, Deblur and mothur. I also said I would try to post example workflows for some of these other approaches to highlight both the similarities and the differences. It looks like I am just now getting around to it.

Downloading Amplicon Sequence Runs from the NCBI SRA

A collaborator recently asked if I could help pull down a few thousand sequence files from the NCBI Sequence Read Archive (SRA) for a secondary analysis. This short post is primarily to help me (and hopefully others) remember how to do this given a set of SRR IDs of interest. While I came across several great resources on downloading SRA files with the SRA Toolkit, I wanted to keep just the basics, and some example code, in case this type of request comes across my desk again.