Pencil sketches of wings on a sheet of old paper: Leonardo da Vinci’s designs for a flying machine

SciServer addresses some of the most important challenges of modern science with a variety of innovative tools and approaches.

Petascale Data Management

Scientific advances have always relied on collecting, analyzing, and interpreting data, but today the pace of data acquisition threatens to overwhelm the scientific method. In many fields, the volume of collected data doubles every year, meaning that each year we gather roughly as much data about our world as in all previous years combined. Scientific datasets now routinely measure in the terabytes or petabytes and require new methods to store and process them.

SciServer faces these daunting challenges by offering scalable database space to science data providers. SciServer’s databases are hosted on machines with large storage capacity and fast I/O, and are heavily indexed for better query performance. But SciServer’s greatest contribution is to individual researchers: a set of easy-to-use tools for performing complex searches of big datasets, and personalized database space to store and analyze results. These new tools will revolutionize the way scientists make discoveries from 21st-century data.

Open Numerical Laboratories

Many modern research programs require detailed numerical simulations. These simulations are often so complex and time-consuming that they can only be run on the largest supercomputers. With only a few supercomputers to go around, many researchers cannot run the software they desperately need for their science. Even when supercomputer time is available, researchers still need to search and analyze the results of their simulations efficiently to get the most knowledge out of them.

SciServer solves these problems by offering the ability to run analysis codes directly on our servers, keeping the computation close to the underlying data. This approach will democratize access to supercomputing resources, and will enable an incredible variety of new science.

Science for All

SciServer will open up access to all these big data resources to researchers worldwide. But big data is not the only kind of data in science; the most important discoveries often take place in the “long tail” of small datasets collected by thousands of researchers around the world. Furthermore, datasets large and small are used every day in K-12 and college classes by the next generation of scientists and science-literate citizens. These “long tail” datasets come in very different file formats, and educational users have very different needs from practicing researchers.

SciServer will give both long-tail researchers and educational users access to the same tools as researchers who work with big data. The result will be a robust, scalable system used by researchers and the public alike. SciServer tools will be a regular part of the toolbox for 21st-century professional and citizen scientists, and will be at the forefront of an amazing new era of scientific discovery.
