Astronomy Datasets – SciServer

A map of the Universe from the Sloan Digital Sky Survey

SciServer is the official science platform for the Sloan Digital Sky Survey (SDSS) catalog data. SDSS has been mapping the universe in unprecedented detail for more than 25 years. SciServer hosts every catalog data set served by the SDSS Catalog Archive Server (CAS), from Data Release 1 (DR1, released in 2001) through the latest data release (DR19). Each DR is cumulative (it includes all previous DRs) and provides access to nearly half a billion images and millions of spectra, along with many derived datasets, such as pre-computed cross-matches with other popular catalogs and value-added catalogs that give more information about certain classes of objects.

SciServer also hosts a local copy of all the SDSS raw, file-based data that is served by the Science Archive Server (SAS) in FITS format. This allows users to do science that requires both the catalog and raw data on the same server with minimal data movement.

In addition to SDSS, SciServer includes the full public data releases from a number of other astronomy projects, covering wavelengths from radio through infrared and optical to ultraviolet and high-energy bands.

Each dataset is listed in an expandable section below. Expand a section to learn what the dataset contains and how to access it.

SDSS Datasets

> Sloan Digital Sky Survey (SDSS) Catalog Data +

What it is: Science-ready catalogs derived from the SDSS raw (FITS) data by the SDSS pipelines and ingested into a database management system for optimized access in the SciServer Science Platform via specialized query tools in interactive and batch mode. The stalwart SkyServer and CasJobs portals are now augmented with Jupyter notebook access via SciServer Compute, providing fast access and server-side analysis tools for measurements of millions of stars, galaxies, and quasars – such as magnitudes, redshifts, classifications, and chemical abundances.

How to access: You can still access the data directly with SkyServer and CasJobs, but now you can also import the SciServer libraries into a SciServer Compute notebook and use the SciServer.CasJobs commands to query the DR19 context. See the Welcome example notebook in the Getting Started Data Volume for an example.
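The notebook access described above can be sketched as follows. This is a minimal sketch, assuming the SciServer Python package is importable (as it should be in a SciServer Compute notebook) and that CasJobs.executeQuery accepts the sql and context arguments in the form used in the examples on this page. The helper name query_context, its execute parameter, and the sample query on PhotoObjAll are illustrative assumptions, not part of the SciServer library.

```python
def query_context(sql, context, execute=None):
    """Run `sql` against a CasJobs context and return whatever the executor returns.

    `execute` is a hypothetical hook that lets the helper be tested without
    SciServer; by default SciServer.CasJobs.executeQuery is used.
    """
    if execute is None:
        # Imported lazily so the helper can be defined outside SciServer Compute.
        from SciServer import CasJobs
        execute = CasJobs.executeQuery
    return execute(sql, context=context)


# Illustrative query: ten rows from the SDSS photometric object table.
SDSS_SAMPLE_SQL = "SELECT TOP 10 objID, ra, dec FROM PhotoObjAll"

# In a SciServer Compute notebook one would then call, for example:
# result = query_context(SDSS_SAMPLE_SQL, context="DR19")
```

Passing the executor as an argument keeps the query text and context name separate from the library call, so the same helper can be pointed at any of the other contexts listed on this page.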

> SDSS Science Archive Server (SAS) Data +

What it is: This volume contains the public data from the Science Archive Server (SAS) of the Sloan Digital Sky Survey (SDSS). It contains data products from multiple surveys across all data releases, from DR8 to the most recent DR19.

How to access: Create a Compute container and mount the SDSS SAS Data Volume.
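Once a data volume is mounted, its contents appear as ordinary files inside the container. The sketch below walks a directory tree lazily and yields the FITS files it finds, using only the Python standard library. The function name iter_fits_files, the list of suffixes, and the mount path in the usage comment are assumptions for illustration; check the actual mount point shown for your container.

```python
import os
from itertools import islice
from pathlib import Path

# File-name endings treated as FITS files (an assumption; extend as needed).
FITS_SUFFIXES = (".fits", ".fit", ".fits.gz", ".fits.bz2")


def _walk_fits(root):
    # os.walk is lazy, so a caller can stop early on a very large volume.
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            if name.lower().endswith(FITS_SUFFIXES):
                yield Path(dirpath) / name


def iter_fits_files(root):
    """Return an iterator over FITS files found below `root`."""
    root = Path(root)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory (is the volume mounted?)")
    return _walk_fits(root)


# Usage with a hypothetical mount path: show only the first five matches.
# for path in islice(iter_fits_files("/home/idies/workspace/sdss_sas"), 5):
#     print(path)
```

Because the archive is large, taking a small slice of the iterator is usually preferable to building a full list of paths.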

> SDSS-IV eBOSS Spectra +

What it is: Spectra observed by the SDSS, stored as FITS files.

How to access: Create a Compute container and mount the SDSS Spectra Data Volume.

For more information on how to use the spectra, see the SDSS Spectra dataset page.

> SDSS-IV MaNGA Integral Field Unit (IFU) Spectra +

What it is: The reduced products of MaNGA data, with the latest from Data Release 17 (DR17). The pipeline-processed spectroscopic data products available in this image consist of 3-d data cubes, row-stacked spectra from the Data Reduction Pipeline, and 2-d analysis maps and 3-d model cubes from the Data Analysis Pipeline. The complete volume of MaNGA data can be found on the SDSS Science Archive Server (SAS).

How to access: Create a Compute container and mount the Manga Data Volume.

> SDSS Associated Data +

Example HI spectra (right) of three MaNGA galaxies (optical images at left).

What it is: The SDSS Associated Data Data Volume provides easy access to useful datasets from the Sloan Digital Sky Survey that are not part of the official SDSS data releases (the latest of which is now Data Release 19). Currently this data volume includes one dataset: HI-MaNGA, described below. We will continue to add new datasets, including future SDSS value-added catalogs.

How to access: Create a Compute container and mount the SDSS Associated Data Data Volume.

HI-MaNGA: HI followup observations of MaNGA target galaxies

The HI-MaNGA dataset consists of followup observations of MaNGA galaxies in the HI (21 cm) wavelength, using the Green Bank Telescope. The observations were designed to address scientific questions related to stellar evolution and gas accretion in various types of galaxies. The final dataset will include most galaxies in the MaNGA catalog with z < 0.05.

For more information about the HI-MaNGA dataset, see its description page on the SDSS website.

> APOGEE FIRE Mock Catalogs +

What it is: The ApogeeFire database context contains a large catalog of mock stars in mock Milky Way-like galaxies, created using the FIRE2 framework (Wetzel et al. 2016). Mock data for each mock star includes radial velocity, proper motion, chemistry (10 chemical elements are tracked in the simulation), parallax, and photometry in the Gaia bands.

How to access: Import the SciServer libraries into a SciServer Compute notebook and use the SciServer.CasJobs commands to query the ApogeeFire context. See the apogee_fire example notebook in the Astronomy folder inside the Getting Started Data Volume for an example.

For more information on what data the catalogs contain, see the APOGEE Fire Mock Catalogs dataset page.

> SDSS-I/-II Data Archive Server

What it is: This volume contains all the raw and processed file-based data from Data Release 7 (DR7) of the Sloan Digital Sky Survey (SDSS). The raw and pipeline-processed imaging and spectroscopic data products are available here, mostly in binary FITS format.

How to access: The data on the SDSS-DAS volume can be accessed via SciServer Compute using standard Python file-access tools. A copy of this data is also accessible via the SDSS DAS website, and the catalog version of this data is available from the SDSS DR7 SkyServer.
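As an illustration of plain file access, the sketch below reads the primary header of an uncompressed FITS file using only the standard library. It relies on the FITS layout of 80-character header cards packed into 2880-byte blocks and terminated by an END card. The function names are hypothetical, and the value parsing is deliberately simplified (for example, it does not handle CONTINUE cards or Fortran-style D exponents); in practice a library such as astropy.io.fits is the usual choice.

```python
CARD_SIZE = 80      # bytes per header card
BLOCK_SIZE = 2880   # bytes per FITS block (36 cards)


def _parse_value(text):
    """Convert the value field of a header card to a Python object."""
    text = text.strip()
    if text.startswith("'"):
        # Quoted string: a doubled quote ('') stands for a literal quote.
        chars, i = [], 1
        while i < len(text):
            if text[i] == "'":
                if i + 1 < len(text) and text[i + 1] == "'":
                    chars.append("'")
                    i += 2
                    continue
                break
            chars.append(text[i])
            i += 1
        return "".join(chars).rstrip()
    value = text.split("/", 1)[0].strip()  # drop the trailing comment
    if value == "T":
        return True
    if value == "F":
        return False
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value  # left as text if it is not a recognised number


def read_primary_header(path):
    """Return the primary header of a FITS file as a dict of keyword -> value."""
    header = {}
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if len(block) < BLOCK_SIZE:
                raise ValueError("file ended before an END card was found")
            for start in range(0, BLOCK_SIZE, CARD_SIZE):
                card = block[start:start + CARD_SIZE].decode("ascii")
                keyword = card[:8].strip()
                if keyword == "END":
                    return header
                if card[8:10] != "= ":
                    continue  # COMMENT, HISTORY, and blank cards carry no value
                header[keyword] = _parse_value(card[10:])
```

Reading only the header blocks avoids loading the image or table data, which is useful when scanning many files on a mounted volume.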

Other Astronomy Datasets

SciServer includes many more astronomy datasets. It also includes an XMATCH database, allowing you to cross-match astronomical sources across multiple sky surveys.

> XMATCH: Cross-matching astronomical datasets +

What it is: The XMATCH database contains data from more than 50 astronomical surveys, along with SQLxMatch code to run 2-dimensional spatial cross-matches between these catalogs, or against catalogs of point sources that you upload into your MyDB.

How to access: You can run the SQLxMatch code and query the XMATCH tables from a Jupyter Notebook running in a SciServer Compute container. An example notebook is available in the shared Getting Started data volume, in the Astronomy/Cross-Match of Source Catalogs folder. You can also download the example notebook directly from the SQLxMatch GitHub repository. Once you have the example notebook in your persistent folder, you can adapt it to meet your needs.

Documentation on the SQLxMatch code is available in the README of its GitHub repository.
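To make the idea of a 2-dimensional spatial cross-match concrete, the sketch below pairs each source in one small catalog with its nearest neighbor in another, within a given angular radius. It is not the SQLxMatch code and does not use its interface: it is a brute-force pure-Python illustration, and the function names and the (id, ra, dec) tuple layout are assumptions made for this example.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding just above the valid asin range.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cross_match(cat_a, cat_b, radius_arcsec):
    """Match each (id, ra, dec) in cat_a to its nearest source in cat_b.

    Returns a list of (id_a, id_b, separation_arcsec) for matches within
    radius_arcsec. Brute force, O(len(cat_a) * len(cat_b)): fine for a
    demonstration, far too slow for survey-scale catalogs.
    """
    matches = []
    for id_a, ra_a, dec_a in cat_a:
        best = None
        for id_b, ra_b, dec_b in cat_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b) * 3600.0
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (id_b, sep)
        if best is not None:
            matches.append((id_a, best[0], best[1]))
    return matches
```

For catalogs with millions of rows, a server-side cross-match that uses spatial indexing is the practical route, which is the role the XMATCH database plays here.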

> Faint Images of the Radio Sky at Twenty cm (FIRST) +

What it is: Faint Images of the Radio Sky at Twenty cm (FIRST) is a survey at radio wavelengths (20 cm) that covers more than 10,000 square degrees of sky.

How to access: To query this catalog using CasJobs, select FIRST from the Context menu just above the query window. To access it using SciServer Compute, specify its name as the context in the appropriate place in your SciServer.CasJobs.executeQuery(sql, context) or SciServer.CasJobs.submitQuery(sql, context) commands: for example, as

SciServer.CasJobs.executeQuery("select top 10 * from PhotoObjAll", context="FIRST")

> Two-Micron All-Sky Survey (2MASS) +

What it is: The Two Micron All-Sky Survey (2MASS) is an all-sky survey at infrared wavelengths. SciServer offers access to the Point Source Catalog of the 2MASS All-Sky Data Release, enabling studies of populations of stars and other resolved objects in the Milky Way.

How to access: To query this catalog using CasJobs, select 2MASS from the Context menu just above the query window. To access it using SciServer Compute, specify its name as the context in the appropriate place in your SciServer.CasJobs.executeQuery(sql, context) or SciServer.CasJobs.submitQuery(sql, context) commands: for example, as

SciServer.CasJobs.executeQuery("select top 10 * from PhotoObjAll", context="2MASS")

> Gaia +

What it is: Gaia is a European Space Agency mission to find distances and properties of more than one billion stars in our Milky Way Galaxy. SciServer hosts the complete catalog data for Gaia Early Data Release 3 (Gaia EDR3).

How to access: To query the Gaia EDR3 catalog using CasJobs, select GaiaEDR3 from the Context menu just above the query window. To access Gaia EDR3 data using SciServer Compute, specify its name as the context in the appropriate place in your SciServer.CasJobs.executeQuery(sql, context) or SciServer.CasJobs.submitQuery(sql, context) commands: for example, as

SciServer.CasJobs.executeQuery("select top 10 * from gaia_source", context="GaiaEDR3")

> SkyMapper Southern Survey (SMSS) +

What it is: The SkyMapper Southern Survey (SMSS) is a 6-band optical survey conducted with the Australian National University’s 1.3m SkyMapper Telescope at Siding Spring Observatory in Australia. The telescope has a 32-CCD mosaic camera, with 268 million pixels, covering 2.4° x 2.4°. The SMSS filter set (Bessell et al. 2011) comprises u, v, g, r, i, and z, with differences from the SDSS and LSST/VRO bandpasses that facilitate novel scientific applications.

SMSS published its Fourth Data Release (DR4) in February 2024, covering from the South Celestial Pole to Declinations of +16°, with some fields observed up to +28°. Approximately 700 million unique sources have been observed from 15 billion photometric data points measured from over 400,000 images acquired between March 2014 and September 2021.

The typical 10-sigma depths for each field range between 18.5 and 20.5 mag, depending on the filter, but certain sky regions include longer exposures that reach as deep as 22 mag in some filters.

For more information about SMSS and DR4, see the SkyMapper website or consult the SMSS DR4 paper (Onken et al. 2024).

How to access: SciServer hosts the master, images, ccds, and mosaic tables from SkyMapper DR4. To query these tables using CasJobs, select SkymapperDR4 from the Context menu just above the query window. To access it using SciServer Compute, specify its name as the context in the appropriate place in your SciServer.CasJobs.executeQuery(sql, context) or SciServer.CasJobs.submitQuery(sql, context) commands: for example, as:

SciServer.CasJobs.executeQuery("select top 10 * from master", context="SkymapperDR4")

> Minor Planet Center orbital elements (MPCORB) +

What it is: The Minor Planet Center Orbit (MPCORB) database contains orbital parameters for more than 700,000 asteroids from the International Astronomical Union’s Minor Planet Center. The dataset available through SciServer is a snapshot of the database as of August 17, 2017.

How to access: To query this catalog using CasJobs, select MPCORB from the Context menu just above the query window. To access it using SciServer Compute, specify its name as the context in the appropriate place in your SciServer.CasJobs.executeQuery(sql, context) or SciServer.CasJobs.submitQuery(sql, context) commands: for example, as

SciServer.CasJobs.executeQuery("select top 10 * from mpcorb", context="MPCORB")
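Because every CasJobs context is queried the same way, a short loop can preview several catalogs at once. The sketch below is a hedged illustration: the context and table pairs are taken from the examples on this page, while the names PREVIEW_TABLES, preview_sql, and preview_all are hypothetical. The executor is passed in by the caller, on the assumption that it accepts a query string and a context keyword in the form CasJobs.executeQuery is shown using above.

```python
# (context, table) pairs copied from the examples on this page.
PREVIEW_TABLES = {
    "GaiaEDR3": "gaia_source",
    "SkymapperDR4": "master",
    "MPCORB": "mpcorb",
}


def preview_sql(table, n=10):
    """Build a 'select top n' preview query for one table."""
    if isinstance(n, bool) or not isinstance(n, int) or n <= 0:
        raise ValueError("n must be a positive integer")
    if not table.isidentifier():
        # Reject anything that is not a bare table name.
        raise ValueError(f"unexpected table name: {table!r}")
    return f"select top {n} * from {table}"


def preview_all(execute, tables=PREVIEW_TABLES, n=10):
    """Run a preview query in each context and return {context: result}."""
    return {
        context: execute(preview_sql(table, n), context=context)
        for context, table in tables.items()
    }


# In a SciServer Compute notebook (assumed usage):
# from SciServer import CasJobs
# previews = preview_all(CasJobs.executeQuery)
```

Keeping the context-to-table mapping in one dictionary makes it easy to add further contexts once their table names are known.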

> Galactic Archaeology with HERMES (GALAH) DR4

What it is: The GALAH survey measured stellar parameters and abundances of more than 900,000 stars in the Milky Way. The latest public release of GALAH is Data Release 4 (DR4).

How to access: Create a SciServer Compute container and mount the GALAHDR4 Data Volume.

For more information about what measurements are available, see the GALAH DR4 dataset page.

> LAMOST DR8v2 +

What it is: The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) is a Chinese national scientific research facility operated by the National Astronomical Observatories, Chinese Academy of Sciences.

It is a special reflecting Schmidt telescope with 4000 fibers in a field of view of 20 square degrees. The LAMOST survey provides flux- and wavelength-calibrated, sky-subtracted spectra in the wavelength range of 3700-9000 Ångstroms for many types of astronomical objects. In October 2018, LAMOST started its second-stage survey program, which contains both low- and medium-resolution spectroscopic surveys. The medium-resolution spectroscopic survey in turn includes two surveys: the time-domain and the non-time-domain surveys.

The eighth LAMOST data release (LAMOST DR8) includes observations until June 2020. The Low-Resolution spectroscopic survey (LRS) General Catalog contains 10,633,515 spectra, of which 10,336,752 are stars, 224,702 are galaxies, and 72,061 are QSOs; 9,563,115 of these spectra have a g-band or i-band SNR larger than 10. The Medium-Resolution spectroscopic survey (MRS) General Catalog contains 5,975,982 spectra: 1,465,789 from the non-time-domain survey and 4,510,193 from the time-domain survey. (Note: This catalog contains a total of 22,141,635 entries, which differs from the two counts above. For the non-time-domain survey, each coadded spectrum is counted as one of the 1,465,789 spectra, while for the time-domain survey, every single exposure of the B band plus R band is counted as one of the 4,510,193 spectra.)

SciServer hosts version 2 of the DR8 release (DR8v2). In addition to the LRS and MRS General Catalogs, LAMOST DR8 provides many other catalogs, including stellar parameter catalogs, the Observed Plate Information Catalog, the Input Catalog and others. Refer to the data product description pages for the Low-Resolution Spectroscopic Survey (LRS) and Medium-Resolution Spectroscopic Survey (MRS) for more information on this data release.
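The counts quoted above can be checked with a few lines of arithmetic. The sketch below confirms that the per-class LRS counts and the per-survey MRS counts each add up to the stated catalog totals, and computes each part's share of its total. All numbers come from the text above; the variable and function names are illustrative.

```python
# Counts quoted in the LAMOST DR8 description above.
LRS_TOTAL = 10_633_515
LRS_BY_CLASS = {"star": 10_336_752, "galaxy": 224_702, "qso": 72_061}

MRS_TOTAL = 5_975_982
MRS_BY_SURVEY = {"non_time_domain": 1_465_789, "time_domain": 4_510_193}


def breakdown_matches(total, parts):
    """True if the parts add up exactly to the stated total."""
    return sum(parts.values()) == total


def fractions(total, parts):
    """Share of the total contributed by each part."""
    return {name: count / total for name, count in parts.items()}


lrs_consistent = breakdown_matches(LRS_TOTAL, LRS_BY_CLASS)
mrs_consistent = breakdown_matches(MRS_TOTAL, MRS_BY_SURVEY)

print("LRS breakdown consistent:", lrs_consistent)
print("MRS breakdown consistent:", mrs_consistent)
for name, share in fractions(LRS_TOTAL, LRS_BY_CLASS).items():
    print(f"LRS {name}: {share:.2%}")
```

Both checks pass for the numbers as quoted, which confirms that the 22,141,635-entry figure in the note reflects a different counting convention rather than an error in the totals.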

How to access: Import the SciServer libraries into a SciServer Compute notebook and use the SciServer.CasJobs commands to query the LAMOSTDR8v2 context.

> Two-Degree Field (2DF) Galaxy Redshift Survey +

What it is: The Two-degree-Field (2DF) Galaxy Redshift Survey is a wide-area spectroscopic survey at visible wavelengths, covering roughly 1,500 square degrees of sky, with the goal of understanding the large-scale structure traced by galaxies.

How to access: To query this catalog using CasJobs, select 2DF from the Context menu just above the query window. To access it using SciServer Compute, specify its name as the context in the appropriate place in your SciServer.CasJobs.executeQuery(sql, context) or SciServer.CasJobs.submitQuery(sql, context) commands: for example, as

SciServer.CasJobs.executeQuery("select top 10 * from PhotoObjAll", context="2DF")

> Galaxy Evolution Explorer (GALEX) +

What it is: The Galaxy Evolution Explorer (GALEX) is an ultraviolet space telescope that operated from 2003 to 2012. During that time, it observed hundreds of thousands of galaxies, helping to determine distances and star formation rates throughout the universe. SciServer offers access to all GALEX releases up to and including its final complete dataset, GALEX Release 6 (GR6). The GALEX data releases are referred to in SciServer as GALEXGR6, GALEXGR5, etc.

How to access: To query one of the GALEX databases using CasJobs, select its name from the Context menu just above the query window. To access one of the GALEX databases using SciServer Compute, specify its name as the context in the appropriate place in your SciServer.CasJobs.executeQuery(sql, context) or SciServer.CasJobs.submitQuery(sql, context) commands: for example, as

SciServer.CasJobs.executeQuery("select top 10 ra, dec from acsData", context="GalexGR6")

> High-Energy Astrophysics (HEASARC) +

What it is: The HEASARC data volume contains a copy of all of the public data hosted at the High-Energy Astrophysics Science Archive Research Center (HEASARC). For information about the available missions and how to use specific datasets, see the HEASARC website or contact the HEASARC helpdesk through the Feedback link at the bottom of that site.

How to access: The HEASARC data volume also includes a software area for miscellaneous additional things such as interactive cookbooks that are under development. Some startup instructions can be found on the HEASARC SciServer documentation page. The software environment to analyze these data can be found in the Compute Image called HEASARCv6.28.
