SciServer makes a number of datasets available online as Public Data Volumes. These data volumes can be mapped into new containers created through SciServer Compute. See the instructions on How to create a new container to learn how to make these datasets available.
Johns Hopkins Turbulence Databases
The Turbulence environment on SciServer provides direct access to datasets archived and maintained in the Johns Hopkins Turbulence Databases (JHTDB). The system contains space-time data of turbulent flows, taken from the output of high-resolution direct numerical simulations of the Navier-Stokes equations.
Recount2
Recount2 provides processed and summarized expression data for over 70,000 human RNA-seq samples from the Sequence Read Archive (SRA), The Cancer Genome Atlas (TCGA), and the Genotype-Tissue Expression (GTEx) project (https://doi.org/10.1038/nbt.3838). The associated Bioconductor package provides a convenient API for querying, downloading, and analyzing the data. Each processed study consists of metadata and phenotype data, the expression levels of genes and their underlying exons and splice junctions, and the corresponding genomic annotation. By handling several preprocessing steps and combining many datasets into one easily accessible resource, Recount2 makes finding and analyzing RNA-seq data considerably more straightforward.
Sloan Digital Sky Survey Data Archive Server (SDSS DAS)
This volume contains all the raw and processed file-based data from Data Release 7 (DR7) of the Sloan Digital Sky Survey (SDSS). The raw and pipeline-processed imaging and spectroscopic data products are available here, mostly in binary FITS format. The data on the SDSS DAS volume can be accessed from SciServer Compute using standard Python file access tools. A copy of this data is also accessible via the SDSS DAS website, and the catalog version of this data is available from the SDSS DR7 SkyServer.
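As an illustration of file-based access, the following is a minimal sketch that reads the primary header of a FITS file using only the Python standard library. It relies on the FITS layout of 2880-byte blocks made of 80-character header cards. The function names read_primary_header and parse_card_value are hypothetical helpers introduced for this example, and the location of a file depends on where the SDSS DAS volume is mounted in your container, which is not specified here.

```python
import gzip

BLOCK_SIZE = 2880  # FITS files are organized in 2880-byte blocks
CARD_SIZE = 80     # each header card is 80 ASCII characters


def parse_card_value(field):
    """Return the value part of a header card, without the trailing comment."""
    field = field.strip()
    if field.startswith("'"):
        # String value: ends at the next single quote that is not doubled.
        end = 1
        while end < len(field):
            if field[end] == "'":
                if field[end + 1:end + 2] == "'":
                    end += 2  # a doubled quote is an escaped quote
                    continue
                break
            end += 1
        return field[1:end].replace("''", "'").rstrip()
    # Numeric or logical value: anything after "/" is a comment.
    return field.split("/", 1)[0].strip()


def read_primary_header(path):
    """Read the primary FITS header into a dict of keyword -> value string.

    Values are returned as strings; no type conversion is attempted.
    Files whose names end in ".gz" are opened with gzip.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    header = {}
    with opener(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if len(block) < BLOCK_SIZE:
                raise ValueError(f"{path}: no END card found in header")
            for start in range(0, BLOCK_SIZE, CARD_SIZE):
                card = block[start:start + CARD_SIZE].decode("ascii", errors="replace")
                keyword = card[:8].strip()
                if keyword == "END":
                    return header
                # Value cards have "= " in columns 9-10; COMMENT, HISTORY,
                # and blank cards do not and are skipped.
                if card[8:10] == "= ":
                    header[keyword] = parse_card_value(card[10:])
```

Calling read_primary_header on a file from the mounted volume returns keywords such as SIMPLE, BITPIX, and NAXIS as strings. For reading the image or table data itself, a dedicated FITS library such as astropy.io.fits is likely the more practical choice, assuming it is installed in the container image.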