How to use SciServer
This page provides links, help guides, and other resources for the SciServer suite of tools.
Creating an account
- From the SciServer home page, click the “Login to SciServer” button
- A new page opens containing the SciServer Login Portal
- Click the Create a new account link
- At the Registration page, enter the following:
- User name
- Password (twice)
- Check terms of service
- Click Create Account
- You will receive an email, at the address you specified, containing a validation code. Check your email and copy the code.
- Enter the validation code and click Complete account creation.
- Return to the Login page and log in
Logging in
- From the SciServer home page, click the “Login to SciServer” button
- A new page opens containing the SciServer Login Portal
- Enter your username and password
- Click Sign in
- You will be redirected to the SciServer Dashboard:
Resetting a forgotten password
- From the SciServer home page, click the “Login to SciServer” button
- A new page opens containing the SciServer Login Portal
- Click the Forgot your password? link to go to the Reset Password page:
- Enter either your username or your email address, then press Submit.
- This takes you to a page where you enter your new password twice:
- A verification code is also sent to your registered email address, and you must enter this code as well
- Press Submit and your password will be reset.
Changing your password
- Go to the SciServer Dashboard:
- Access the Menu in the top right-hand corner:
- Select Change Password, which displays the Change Password dialog box:
- Enter your new password (twice) and press the Update Password button
- This changes your password and displays a message that you will be redirected to the Login page
- Log in again with your new password to access your account
Small icons on most of the Dashboard pages link to specific sections of this How to Use SciServer document.
The Homepage for the Dashboard looks like this:
Menu and Functions
The Dashboard has a main “Menu Bar” across the top that is visible in all views of the Dashboard, providing access to the following core features:
- Brings the user back to the Home Page
- Navigates the User to a Tab to manage Files and Folders
- Navigates the User to a Tab to manage user defined Groups and sharing
- Displays a drop-down menu allowing the User to launch any of the supporting SciServer Applications (Home, Compute, Compute Jobs, CasJobs, SciDrive, SkyServer, SkyQuery)
- Navigates the User to a tab displaying a tabular summary of the history of their activity with SciServer functions and applications
- A Menu dropdown that displays additional options for the user (Access Profile, Access Help, Change Password, Sign Out)
In particular the User can access:
- Compute Jobs
- Activity Logs
Each of these shortcuts also displays information about recent activity on, or notifications about, each function.
Note that “Compute Jobs” is in both shortcut lists.
SciServer provides two ways to view your recent activities:
- The Dashboard displays the most recent activity related to File Management, Group Management, and Job Execution
- A separate “Activity View”, accessible from the main Menu in the right-hand corner, provides a detailed table of all logged activities within SciServer, with filtering and sorting.
Dashboard Activity Summary
- On the main dashboard, a summary of Jobs, Groups and Files is provided as shown below
- In particular, invitations to new Groups are shown; selecting one takes the User to the Groups tab in SciServer
SciServer provides a feature called ‘Groups’ that lets users share their resources privately with their collaborators. The resources that users can share are file folders, databases, volume containers, and Docker images. SciServer also provides a Group view, which lists all the resources shared among group members.
SciServer Groups allow you to create lists of Users and to share resources, such as file Folders, with all of them at once. You can manage a team of people in a project by creating a group containing the relevant users and allowing everyone to work from the same shared folder. Importantly, you can also make sure that no one outside the group can access the shared folder, so you can keep things private.
- Select the Groups tab in SciServer
- This will show the Groups View
- The left hand box will show you all the groups that you are a member of. As you select each one, the middle set of boxes will show you which resources have been shared with that group, and the right hand box will show you all the members in the group.
- To Create a New Group, click the “+” button on the left-hand “Groups” box:
- This will open the “New Group” dialog box:
- Enter a name for the Group and an optional Description, then press the “Create” button. This will show your new Group in the “Groups” box, and will show you as the only member in the box on the right-hand side:
- This is your new group!
- You can invite other SciServer users only to groups that you own, or to groups on which someone else has given you the grant privilege
- In the groups view, go to the “members” box on the right hand side:
- Press the green “+” button to show the Add Members dialog box
- Select Users from the left hand panel. You can multi-select by holding down the “Ctrl” key and clicking on users. You can also search for users by typing characters into the Search box.
- Press the Invite User button:
- You will need to choose “Admin” or “Member”:
- Admin: this user will be able to invite other users to your group
- Member: this user will not be able to invite other users to your group
- This will add the users to your group with the status “Invited”
- Press the OK button in the lower right-hand corner to dismiss the dialog box
- NOTE: This will invite the users. They will need to accept the invitation before they become members of the group:
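The invitation rules above can be summarized as a small permission check. This is an illustrative sketch only, not part of SciServer's API: the `Role` enum and the `can_invite` function are hypothetical names, and the "grant privilege" is assumed here to correspond to the Admin role.

```python
from enum import Enum


class Role(Enum):
    """Hypothetical labels for a user's relationship to a group."""
    OWNER = "owner"
    ADMIN = "admin"
    MEMBER = "member"


def can_invite(role: Role) -> bool:
    """Return True if a user with this role may invite others to the group.

    Assumption: owners and Admins can invite; plain Members cannot.
    """
    return role in (Role.OWNER, Role.ADMIN)


if __name__ == "__main__":
    for role in Role:
        print(f"{role.value}: can invite = {can_invite(role)}")
```

Keeping the rule in a single function makes it easy to see that choosing “Member” in the invite dialog is the more restrictive option.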
NOTE: the File Management features provided in the SciServer UI are different to the services provided by SciDrive, which is an older storage system that is still supported for legacy applications.
Permanent or Temporary Storage?
Users can create a number of top-level “User Volumes” under which new folders and files may be added. User Volumes can be created in one of two different storage pools: a permanent pool called “Storage” and a short-lived pool called “Temporary”. Folders and files in User Volumes under “Storage” are backed up and permanent, but there is a quota limit of 10 GB. Folders and files in User Volumes under “Temporary” are not backed up and will be deleted after a particular time period, but no limit or quota is imposed on how much data can be stored (because it will be deleted).
Most users should store their data and files in a “Storage” User Volume. The “Temporary” User Volumes are meant to be used as intermediate storage for SciServer Compute calculations.
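The choice between the two pools can be written down as a simple rule of thumb. The sketch below is only an illustration of the guidance above: `choose_pool` is a hypothetical helper, the quota is taken as 10 decimal gigabytes (an assumption), and it does not account for space you have already used.

```python
# Quota for the "Storage" pool as stated in the text; decimal GB is an assumption.
STORAGE_QUOTA_BYTES = 10 * 10**9


def choose_pool(size_bytes: int, intermediate: bool) -> str:
    """Suggest "Storage" or "Temporary" for a dataset (hypothetical helper).

    intermediate=True means the data is scratch output of a Compute
    calculation that does not need to be kept.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if intermediate:
        return "Temporary"
    if size_bytes > STORAGE_QUOTA_BYTES:
        raise ValueError("data to keep exceeds the 10 GB Storage quota")
    return "Storage"


if __name__ == "__main__":
    print(choose_pool(2 * 10**9, intermediate=False))
    print(choose_pool(50 * 10**9, intermediate=True))
```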
- Log in to the SciServer dashboard and select the “Files” tab:
- This will show the Files View at the top level of “User Volumes”:
- There are a number of User Volume operations available in this view
- Hover over each row to view available operations
- Click on the ellipsis button to view available operations:
- Share: If the user has the appropriate permissions, they can share a User Volume with other users, or groups of users. A User Volume created by the user is always shareable by them.
- Delete: If the user has the appropriate permissions, they can delete a User Volume. A user can always delete a User Volume they create.
- Edit: The User Volume name can be changed, and a description provided for it.
- Different icons refer to different levels of sharing:
- Selecting a Folder in the column on the left-hand side will open that folder up, one level at a time. It does not provide a tree view.
- There are a number of file and folder operations available in this view
- Hover on each to view available operations
- Click on the ellipsis button to view available operations
- Download: A file can be downloaded (but a Folder cannot)
- Perform multiple operations on files or folders, or on User Volumes (currently only for the Delete operation):
- Check one or multiple checkboxes
- A menu of the available operations appears at the top
- Click on any menu item to perform an operation
- When running SciServer Compute, Jupyter presents the same filesystem in a more traditional hierarchical form, with full path access that supports working with files in a Linux console:
- The operations on files and folders available in this view are provided by Jupyter.
- There is only one top level of User Volumes; all lower levels are normal Folders
- A User Volume can be shared with other users and groups, but a normal folder cannot. This is very important.
- A User Volume can be selectively “mounted” in a Compute Container, and made accessible to a Jupyter Notebook. Folders at lower levels under a User Volume cannot be.
- A User Volume cannot be moved or copied.
- On the SciServer “Files” tab, click the “Home” button to get back to the User Volume view:
- Press the “Create User Volume” button:
- In the dialog that pops up, enter a name and optional description:
- Select a “Root Volume,” which means decide whether to create the new User Volume in a permanent and backed-up Storage pool, or to create it in a Temporary pool, knowing that it will be deleted after a certain period of time (defined by the SciServer Data Storage Policy).
- Press the Create User Volume button and the new User Volume will be created.
- Files cannot be uploaded directly into the top-level list of User Volumes.
- Files can be uploaded to any Folder inside an existing User Volume, including in that User Volume’s top-level directory
- Navigate into a User Volume and you will see an “Upload” button:
- Press the Upload button to display a dialog box:
- Navigate to a file, select it, then press the “Open” button
- This will then show the file you just uploaded:
To move or copy files between folders in SciServer, use the Files tab in the SciServer Dashboard:
- Select the Files tab
- Navigate to the files that need to be copied. Hover the mouse over a file to show its icons, one of which is Copy and one of which is Move. To move or copy multiple files, select them all using the checkboxes, then select the “Copy” or “Move” button that appears at the top:
- When you press the “Copy” or “Move” button (in either of the above cases), this dialog will appear:
- Navigate to where you want to copy or move the files, and then press the “Paste” button:
NOTE: You must have write permissions on the destination folder in order to move/copy a file or folder there!
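Because Jupyter in SciServer Compute exposes the same filesystem with ordinary paths, the same copy or move can also be scripted from a notebook. The following is a minimal sketch using only the Python standard library; `copy_or_move` is a hypothetical helper, and the demo runs in a throwaway temporary directory because the actual paths of your User Volumes are not given here.

```python
import os
import shutil
import tempfile
from pathlib import Path


def copy_or_move(source: Path, dest_dir: Path, move: bool = False) -> Path:
    """Copy (or move) one file into dest_dir and return its new path."""
    if not dest_dir.is_dir():
        raise NotADirectoryError(f"{dest_dir} is not a folder")
    # Mirrors the note above: the destination folder must be writable.
    if not os.access(dest_dir, os.W_OK):
        raise PermissionError(f"no write permission on {dest_dir}")
    target = dest_dir / source.name
    if move:
        shutil.move(str(source), str(target))
    else:
        shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    # Demo in a temporary directory; real SciServer paths will differ.
    with tempfile.TemporaryDirectory() as root:
        src_dir = Path(root, "input")
        dst_dir = Path(root, "output")
        src_dir.mkdir()
        dst_dir.mkdir()
        sample = src_dir / "data.csv"
        sample.write_text("a,b\n1,2\n")
        copied = copy_or_move(sample, dst_dir)
        print("copied to", copied)
```

Checking write access before the operation gives a clearer error than letting the copy fail partway through.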
SciServer Compute is an application that allows users to easily create and run Jupyter Notebooks containing code and instructions to analyze and process SciServer-hosted data sets. SciServer provides a rich API to access all aspects of SciServer's resources, including databases, filesystems, user and group management, and even Compute Jobs.
There are a couple of steps involved, which SciServer makes easy:
- Create a new “Container” to run the Jupyter Notebook in. A container defines the “environment” for the user, and is configurable
- Open the container, which will start Jupyter, and create, save and execute Notebooks.
- The full capabilities of Jupyter (which is a third party application) are available to the user and will not be covered here.
An important step in setting up a Container environment is specifying what external file systems will be accessible to the Jupyter environment.
A Container in SciServer Compute is a defined environment within which Jupyter Notebooks can be run. It is technically a Docker Container (Docker is the technology used), and provides a way to isolate the user and their code from the rest of the SciServer system, and from other users.
A Container in SciServer Compute is a “long-lived” resource, and as such there are some resource management issues you need to know about:
- Each User can create up to 3 containers at any given time. If you need another one, you need to delete one first.
- Containers have a lifecycle, and can be “running” or “stopped”. SciServer keeps Containers running for a certain period of time, even if the User is not actively working with them, so that they open quickly when the User returns.
- Running Containers consume system resources such as memory, so after 24 hours (TBC) the Container will be stopped. This has two effects:
- If code was running, it will be terminated
- It will take a bit longer to start up the Container next time it is accessed
- SciServer could, and sometimes does, delete containers that have not been used for a long time. This frees up storage resources.
- Although data (files, folders) can be stored “in” the container, you should never do this for any data that you need to keep. Always store data files in the Storage or Temporary storage pools, which are external to the container. Data in these storage pools remains accessible even when all containers are closed or deleted, and the same data is accessible from any container that includes those Volume Containers in its environment.
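One way to guard against accidentally saving results inside the container itself is to check, before writing, that the output path lies under a mounted volume. This is a sketch under stated assumptions: `is_under` is a hypothetical helper, and `volume_root` is a placeholder, since where volumes appear inside a container is not described here.

```python
import tempfile
from pathlib import Path


def is_under(path: Path, volume_root: Path) -> bool:
    """Return True if path resolves to a location inside volume_root."""
    try:
        path.resolve().relative_to(volume_root.resolve())
        return True
    except ValueError:
        # relative_to raises ValueError when path is outside volume_root.
        return False


if __name__ == "__main__":
    # Demo with a temporary directory standing in for a mounted volume.
    with tempfile.TemporaryDirectory() as root:
        volume_root = Path(root, "my_volume")  # placeholder location
        volume_root.mkdir()
        inside = volume_root / "results" / "out.csv"
        outside = Path(root, "scratch", "out.csv")
        print(is_under(inside, volume_root), is_under(outside, volume_root))
```

Resolving both paths first means that `..` segments and symbolic links cannot make a container-local path look as if it were inside the volume.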
Creating a new Compute Container is easy! However, there are a few parameters required to define the compute environment that you need.
- Access the Compute Application by clicking on the “Compute” icon on the SciServer Dashboard:
- OR from the “Apps Menu”
- This will take you to the “Compute dashboard”
- You will see a large text box explaining how to use the Storage capabilities to ensure that you do not lose data if you accidentally store it in the Container itself.
- Press the “Create Container” button, which opens a dialog box:
- The following needs to be entered:
- Container Name: you choose this
- Domain: This is a drop-down that you should almost always leave at the default value, “Interactive Docker Compute Domain”.
- Image: This defines a “software environment” for the Jupyter notebooks that you want to run. The images contain libraries tailored to different needs. For the most part you will choose an image that supports the language you are interested in (Python, R, MATLAB, etc.), but there are “specialty” science-domain-specific images that you may have access to if the creator of those images has shared them with you. Additional information on “Images” can be found at the Compute Images help page.
- User Volumes: This is a list of all User Volumes that you have access to, either because you own them or because they have been shared with you. When you select some of these, on container creation these folders will be “mounted” and will be accessible as if they were local files. This makes file access and management much simpler. NOTE: they will be mounted with the same access controls as you would have in the SciServer File UI, i.e. “readonly” or “readwrite”.
- Data Volumes: These are a series of special Data Volumes that are either shared publicly with all users, or for which you have been given special access privileges to see. Again, selecting these Volumes will mount them, and make them appear “local” in the Container. These will always be mounted “readonly”. Additional information on “Data Volumes” can be found at the Available Datasets page.
- Once created, it is not possible to add additional User Volumes or Data Volumes, so you should be sure to get this right.
- Pressing the “Create” button will create a new Container and show it in the table with the name you provided.
- Clicking on the link in the column titled “name” will launch Jupyter
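The fields in the Create Container dialog can be thought of as a fixed record that is filled in once. The sketch below models that idea with a frozen dataclass. It is not SciServer code: `ContainerSettings`, `can_create`, and the example image and volume names are all hypothetical, while the default domain and the limit of 3 containers come from this guide.

```python
from dataclasses import FrozenInstanceError, dataclass

MAX_CONTAINERS = 3  # per-user limit stated earlier in this guide
DEFAULT_DOMAIN = "Interactive Docker Compute Domain"


@dataclass(frozen=True)
class ContainerSettings:
    """Hypothetical record of the fields in the Create Container dialog."""
    name: str
    image: str
    domain: str = DEFAULT_DOMAIN
    user_volumes: tuple = ()   # mounted readonly or readwrite, as in the File UI
    data_volumes: tuple = ()   # always mounted readonly


def can_create(existing_count: int) -> bool:
    """Return True if the user is still below the container limit."""
    return existing_count < MAX_CONTAINERS


if __name__ == "__main__":
    settings = ContainerSettings(
        name="analysis",
        image="example-python-image",       # placeholder image name
        user_volumes=("my_volume",),        # placeholder volume name
    )
    try:
        # Frozen: volumes cannot be added after creation.
        settings.user_volumes = ("my_volume", "another_volume")
    except FrozenInstanceError:
        print("settings are fixed once created")
    print(can_create(existing_count=2), can_create(existing_count=3))
```

Making the record immutable mirrors the rule that volumes cannot be added to an existing Container; to change them you would create a new Container instead.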
- Start one of your Compute Containers to Launch Jupyter:
- A new window will open showing your filesystem, which is accessible through Compute.
- Navigate to the directory in which you want to create your notebook.
- Select New from the top right corner of the interface to open the dropdown menu
- Select the language that you would like the new notebook to use
- The notebook will open in a new window
Compute Jobs allows a user to run a Jupyter Notebook or a standard script in offline batch mode. Exactly the same capabilities are provided as for Interactive Compute:
- Compute Images and software environment
- Mounting external volume folders
Executing a Job puts it in a queue, and it will be run when resources are available on the server cluster.
You might create a Job for the following reasons:
- Executing your notebook may take a long time and you want to set it running and do something else without worrying about browser sessions timing out etc.
- You may develop your code interactively to make sure the algorithm works, using a small amount of data to test it out. But you really want to run your code against a full dataset which will require massive resources for memory and CPU, as well as execution time.
- You are provided with far more resources (CPU and memory) to execute a Job than you are in an Interactive Session.
SciServer allows you to define two “types” of job:
- Specify a script to execute, or a command line command
- Specify an existing Jupyter Notebook that you have previously developed
The second of these is the more useful, in that you can develop your Jupyter Notebook interactively and then “submit” the exact same notebook as a Job.
Creating a new Job is easy! We explain how to create a notebook based job, but creating a script based job is very similar.
- Go to the Compute Jobs Page:
- Click “Run Existing Notebook”
- On the ‘Compute Domain’ Tab:
- Choose the Compute Domain; in most cases there is currently only one option
- Optionally enter a ‘Job Alias’ to easily identify your Job later
- On the ‘Compute Image’ Tab:
- Pick the ‘Image’ you need to use.
- Each image contains different tools and programming language support.
- (Compute Images are described in more detail at the Compute Images help page)
- On the ‘Data Volumes’ tab:
- Select all the data volumes with appropriate permissions needed for this job.
- On the ‘User Volumes’ Tab:
- Select all the Folder systems that you would like to be made accessible to your Compute notebook
- For Folders that you own, or that have been shared with you and you were given the appropriate permissions, you can select whether a given folder is read only or writable. Folders that you do not own will be readonly by default.
- On the ‘Notebook’ tab:
- Navigate to the Notebook you wish to use as the basis for your Job, and select it
- Enter any additional parameters that the Notebook can read in to affect how the code is executed
- Choose a directory where the output results will go
- By default these will go to the “jobs” directory, within which subdirectories will be created and your results written.
- Alternatively you can choose a specific directory to output results. The directory you choose will be a ‘root’ within which subdirectories will be created and your results written to.
- When everything has been entered you can press ‘Create Job’, and the Job will be submitted, and displayed in a Jobs Table view:
- The Table will be refreshed every few seconds, telling you the status of the Job.
- While the Job is still running there will be a red “X” button, and pressing this will Cancel the job.
- Pressing the down triangle on the right-hand side will expand the view and show more information about the Job. This is what you see for a completed Job:
- This gives status information about the Job, the path to the location of the results, and links to the results output
- The results output will go to the location specified in the Job Definition
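The Jobs Table refreshes itself by checking the Job status every few seconds, and the same polling pattern is useful in your own scripts. The sketch below shows it in generic form. The status lookup is passed in as a function because no SciServer call is assumed here, and the status names in `FINISHED_STATES` are placeholders rather than SciServer's actual labels.

```python
import time
from typing import Callable

# Placeholder status names; the labels SciServer displays may differ.
FINISHED_STATES = {"SUCCESS", "ERROR", "CANCELED"}


def wait_for_job(get_status: Callable[[], str],
                 interval_s: float = 5.0,
                 timeout_s: float = 3600.0) -> str:
    """Poll get_status() until it reports a finished state or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in FINISHED_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s} s")
        time.sleep(interval_s)


if __name__ == "__main__":
    # Stand-in for a real status lookup: replays a scripted sequence.
    scripted = iter(["QUEUED", "RUNNING", "RUNNING", "SUCCESS"])
    print(wait_for_job(lambda: next(scripted), interval_s=0.01))
```

A timeout keeps the loop from waiting forever if a Job stays queued longer than expected.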
- In the Jobs Table, expand the job of interest:
- This gives status information about the Job, the path to the location of the results, and three hyperlinks:
- Browse Working Directory will take you to the Dashboard Files tab and show you the output files as well as the original Python Notebooks.
- Download Standard Output and Download Standard Error will allow you to download these two text files as appropriate. They will be downloaded according to your Browser settings.
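Since each Job writes its results into a subdirectory created under the chosen output directory, it can be handy to locate the newest one from a notebook. This is a minimal sketch that assumes nothing about how the subdirectories are named and simply picks the most recently modified one. `newest_subdirectory` is a hypothetical helper, and the demo uses a temporary directory in place of a real results location.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def newest_subdirectory(results_root: Path) -> Optional[Path]:
    """Return the most recently modified subdirectory of results_root, if any."""
    subdirs = [p for p in results_root.iterdir() if p.is_dir()]
    if not subdirs:
        return None
    return max(subdirs, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # Demo: two fake result folders with explicit modification times.
    with tempfile.TemporaryDirectory() as root:
        older = Path(root, "run_a")
        newer = Path(root, "run_b")
        older.mkdir()
        newer.mkdir()
        os.utime(older, (1_000_000, 1_000_000))
        os.utime(newer, (2_000_000, 2_000_000))
        print(newest_subdirectory(Path(root)).name)
```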