Biobot deposits raw sequence data (i.e., FASTQ files) from submitted wastewater samples in NCBI’s Sequence Read Archive (SRA), making the anonymized data publicly available for research.
FASTQ files (text files containing sequences and their quality scores) are free to download, but interpreting them will require a bioinformatician.
Sample locations are reported only down to the state of origin.
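For readers who have not opened a FASTQ file before, the sketch below shows roughly what one contains and how a bioinformatician might read it. It is only an illustration, not part of Biobot's pipeline. It assumes the common four-line record layout (header, sequence, "+" separator, quality string) and Phred+33 quality encoding. The read names and sequences in SAMPLE_FASTQ are made up.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str        # header line without the leading "@"
    sequence: str       # nucleotide calls
    quality: List[int]  # one Phred score per base


def parse_fastq(handle: TextIO, phred_offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ text stream.

    Assumes Phred+33 quality encoding and unwrapped sequence lines.
    """
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:
            return  # end of file (or first blank line)
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        scores = [ord(char) - phred_offset for char in quality]
        yield FastqRecord(header[1:], sequence, scores)


# Made-up example data: two short reads.
SAMPLE_FASTQ = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!##\n"

if __name__ == "__main__":
    for record in parse_fastq(io.StringIO(SAMPLE_FASTQ)):
        print(record.read_id, record.sequence, record.quality)
```

For real downloads, established libraries and tools are a safer choice than a hand-written parser. The sketch is only meant to show the file's structure.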
To access the raw sequencing data corresponding to one of your samples, follow the steps below.
1. Find the NWSS sample ID for your sample of interest
a. The NWSS sample ID for a Biobot sample is four digits, a period, and one or two letters, e.g. “1234.AB”
b. This can be found in the footer of your PDF report, labeled as “Internal Kit ID” (in the screenshot example below, the Internal Kit ID would be 3658.J)
c. This ID can also be found in DCIPHER
d. Reach out to email@example.com if you are having trouble
2. Go to the NCBI BioSample search at https://www.ncbi.nlm.nih.gov/biosample/ and enter the sample ID in the search bar
a. This will take you to the page corresponding to your sample. Select “SRA” towards the bottom of the page.
b. Not every sample is sequenced; we generally sequence one sample per week per location. If you search for a non-sequenced sample, you may see no result or an irrelevant result (e.g., a non-Biobot sample with a similar name).
c. Selecting “SRA” will take you to the SRA metadata page for the sample. The “Run” accession is listed under the “Run” column in the table towards the bottom of the page.
3. Once you have the “Run” accession, this guide from NCBI describes multiple ways to download the corresponding raw sequencing data.
Here, we show how to download sequencing data for one sample at a time through the SRA website.
a. Select the “Run” accession mentioned above. This will take you to the Sequence Read Archive page for your sample. Select the “Reads” tab.
b. Select “Filtered Download”
c. Select your format of choice for download, then “Download” your data.
4. If you instead want to browse all samples/sequencing data, you can go to our BioProject: https://www.ncbi.nlm.nih.gov/bioproject/839090
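If you have many sample IDs to check, steps 1 and 2 can be roughly scripted. The sketch below is a starting point under stated assumptions, not an official Biobot or NCBI tool. It checks that an ID matches the format described in step 1, then builds a BioSample search request for NCBI's public E-utilities "esearch" endpoint. The function names are hypothetical, accepting both upper- and lower-case letters is an assumption, and the search term is simply the bare sample ID. As with the website search, a hit may belong to an unrelated sample with a similar name, so check results by hand.

```python
import json
import re
import urllib.parse
import urllib.request
from typing import List

# Four digits, a period, then one or two letters (e.g. "1234.AB").
# Accepting lower-case letters is an assumption made for convenience.
SAMPLE_ID_PATTERN = re.compile(r"[0-9]{4}\.[A-Za-z]{1,2}")

# NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def is_valid_sample_id(sample_id: str) -> bool:
    """Return True if the whole string matches the sample ID format."""
    return SAMPLE_ID_PATTERN.fullmatch(sample_id) is not None


def build_biosample_query(sample_id: str) -> str:
    """Build an esearch URL that searches the BioSample database for the ID."""
    if not is_valid_sample_id(sample_id):
        raise ValueError(f"Unexpected sample ID format: {sample_id!r}")
    params = {"db": "biosample", "term": sample_id, "retmode": "json"}
    return ESEARCH_URL + "?" + urllib.parse.urlencode(params)


def extract_uids(esearch_response: dict) -> List[str]:
    """Pull the list of matching record UIDs out of an esearch JSON reply."""
    return esearch_response.get("esearchresult", {}).get("idlist", [])


def search_biosample(sample_id: str) -> List[str]:
    """Run the search over the network and return matching BioSample UIDs."""
    url = build_biosample_query(sample_id)
    with urllib.request.urlopen(url, timeout=30) as response:
        return extract_uids(json.load(response))


if __name__ == "__main__":
    # Prints the request URL only; call search_biosample() to query NCBI.
    print(build_biosample_query("3658.J"))
```

An empty result list corresponds to the "no result" case in step 2b, which usually means the sample was not sequenced.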