This guide is a companion to the BioHPC introductory training session, which is required for all new users. Essential information about BioHPC and how to use our services can be found here. Use it as a quick reference, and in conjunction with the other detailed guides to find your way around our systems. Click the links below to quickly jump to the information you need. Any questions or suggestions can be emailed to email@example.com or submitted using the 'Comment on this page' link in the top right of this site.
Using BioHPC Storage
Using the BioHPC Compute Cluster
Who and what is BioHPC?
BioHPC is the high-performance computing group at UT Southwestern. We provide the hardware, services and assistance necessary for UT Southwestern researchers to tackle large computational problems within their research work. BioHPC differs from many HPC centers because of the diverse range of users we support. We aim to offer easy access to our systems using a wide range of cloud services on the web. Users can benefit from our large storage systems and powerful compute cluster without needing Linux and HPC expertise.
BioHPC’s core hardware currently consists of a 172-node compute cluster, and over 3.5 PB of high-performance storage. Follow the links to the systems page on this site to find out more. Most users will become familiar with the Lamella cloud storage gateway, and our compute cluster Nucleus.
A team of 7 staff manages BioHPC systems, collaborates on research projects, and provides general support to users. Led by the BioHPC director, Liqiang Wang, we have a range of expertise covering bioinformatics work, mathematical simulation, software development and hardware support. We can be contacted via firstname.lastname@example.org.
What services does BioHPC provide?
How do I register for an account?
Our initial account registration process is automated. Fill out the online registration form, and watch for an email to confirm your registration details. Once your account is registered you’ll need to attend a compulsory new user introduction session. These take place on the first Wednesday of each month, at 10:30 am in seminar room NL6.125.
When will my account be activated?
Your account will be fully activated for access to our systems once you have attended the introductory training session. Please make sure you sign in when you take the session, so that we know you have received the required training. Accounts are generally activated in the afternoon following introductory training. At very busy times, with a large number of user registrations, you may need to wait a few hours before receiving your activation email.
In exceptional circumstances, if we have the agreement of your department, the training requirement can be waived for early activation at the request of your PI. In this case we still strongly recommend you attend the training session as soon as possible, and review all the contents of this document before starting to use BioHPC.
How do I contact BioHPC?
You can contact BioHPC staff:
By email to email@example.com
By telephone at extension 84833
In person at our office in the NL building, room NL5.136
We strongly prefer email to firstname.lastname@example.org, as this is tracked using a ticket system, ensuring a prompt response from the member of staff best equipped to answer your question.
Introduction to the BioHPC portal
The BioHPC user portal at https://portal.biohpc.swmed.edu is the central location on the web to find information about, and gain access to all of our services. If you use BioHPC heavily you might want to set the portal as your home page, or bookmark it and visit it regularly.
The home page of the portal highlights news and upcoming training sessions. You can directly view any open support tickets, and access our cloud services. The status of the cluster and the current job queue are shown for quick reference. The content of the portal is broken into various sections in the top menu bar:
Training Calendar and Materials
BioHPC holds 4 training sessions each month, on the 1st, 2nd and 3rd Wednesday at 10am in seminar room NL6.125. A drop-in session, Coffee with BioHPC, is held on the 4th Wednesday at 10am in the same room. We have a calendar of introductory, intermediate and advanced training which operates over the year:
1st Wednesday - Introductory training for new users (repeated every month)
2nd Wednesday - Recommended intermediate topics (sessions repeat every 3 months)
3rd Wednesday - Advanced topics
4th Wednesday - Coffee with the BioHPC team
Please browse our training calendar to find sessions of interest to you. At minimum we recommend that all users attend the cloud storage session, and that computational users attend the SLURM job scheduler session.
After each training session slides and any other material (source code examples etc.) are placed on the portal.
Portal Guides and FAQs
We’re working hard to improve the tutorial and reference material available on our website. Guides (like this one) will be added to the Guides / FAQs section of the portal as quickly as possible. When a new training session is delivered we’ll try to offer a guide as a companion and reference for the training session.
We’re interested in providing guides for other topics important to users, and improving our existing guides based on user feedback. Hit the ‘Comment on this page’ link, or email email@example.com with any suggestions.
What storage is allocated for my account?
Every BioHPC user receives multiple allocations of storage, with different amounts at different locations. The major storage locations on our cluster are known by the names home2, project and work, reflecting their paths in the filesystem on our cluster. Standard allocations for each user are:
home2 - A 50GB quota for a home directory at /home2/<username>. This is a small area that can be used to store private configuration files, scripts and programs that you have installed yourself. It should not be used for storing and analyzing large datasets.
project - The /project directory is the main storage area on BioHPC. Space is allocated for each lab group, typically at least 5TB initially. Additional space is available at the request of a lab PI, depending on arrangements with your department. Your lab’s project space can be found at the path /project/<department>/<lab>. Some labs choose to have a folder for each lab member. Other labs choose to have folders for each project. All labs have a ‘shared’ folder inside their project space that can be accessed by anyone in the lab. Another ‘shared’ folder at /project/<department>/shared is accessible by all members of a department.
work - The /work directory is an additional storage area on different hardware than project space. It may offer better performance for some workloads. Space for each user is 5TB. It should not be used for long-term storage, and inactive data should be moved to project space. Each user has a work directory at /work/<department>/<username>. A department shared directory can be found at /work/<department>/shared.
Lamella - 100GB of private cloud storage, accessible on campus only.
External cloud - 50GB of external cloud storage with a web interface, which can be shared with researchers outside the UT Southwestern campus.
Is my data backed up?
We currently back up data on BioHPC as follows:
About Lamella – the BioHPC cloud storage gateway
Lamella (https://lamella.biohpc.swmed.edu) is our cloud storage gateway. Through lamella.biohpc.swmed.edu you can access your files via a web browser, mount your BioHPC space to your Windows PC, Mac or Linux machine, and transfer files via FTP. Lamella is the gateway system for any non-BioHPC machine to access files stored on BioHPC.
Working with storage on the web – using the lamella web interface
A full cloud-storage guide is available separately.
The lamella web interface is available at https://lamella.biohpc.swmed.edu or via the links on the BioHPC portal. Log in to lamella using your BioHPC username and password. Lamella uses ownCloud, a service that provides a similar experience to web sites such as Dropbox and Google Drive. You can download the ownCloud client at https://portal.biohpc.swmed.edu/content/software/?.
When you first log in to lamella you will be in the files view. This shows your files in lamella cloud storage, a separate 100GB allocation only available via the web or ownCloud client. To access your BioHPC project and work space you must mount them into the lamella web interface.
To mount project and work storage within the lamella web system choose the ‘Personal’ option from the user menu at the top right of the screen.
Scroll down to the ‘External Storage’ section. You can then mount your main BioHPC space by adding storage definitions as shown below. If successful you will see a green circle on the left of the storage definition, and you will find your storage via the files section of the web interface.
Your home directory and BioHPC file exchange (cloud.biohpc.swmed.edu) space are mounted by default. If you want to access your project or work space then you must add them here. Type the desired folder name, pick BioHPC Lysosome for the External storage option, and pick either the 'Log-in credentials, save in session' option or the 'Username and password' option for authentication.
Project Directory: Enter project in the Share box and the directory inside project (excluding the first /project) you want to access in the Remote subfolder box. E.g. to access your personal project space at /project/department/lab/s999999 you would enter department/lab/s999999 into the Remote subfolder box. To access your lab shared space you would enter department/lab/shared.
Work Directory: Enter work in the Share box and the directory inside work (excluding the first /work) you want to access in the Remote subfolder box. E.g. to access your personal work space at /work/department/s999999 you would enter department/s999999 into the Remote subfolder box. To access your department shared space you would enter department/shared.
If you choose 'Log-in credentials, save in session' you won’t be able to share the folders under this directory with other people. If you pick the 'Username and password' option and manually type in your username and password, you will be able to share files and folders under this directory with other people. Make sure you actually type your username and password: they might have been filled in automatically, but they won’t work unless you modify them so that the red frame around the text box disappears.
(** Detailed path of your directories may vary, please refer to the activation notice email we sent to you after training.)
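The mapping from a full BioHPC path to the Share and Remote subfolder values described above can be sketched as a small shell function. This is only an illustration: the function name lamella_mount_fields is hypothetical, and the path used is the placeholder path from this guide, so substitute the paths from your own activation email.

```shell
# Hypothetical helper: split a full BioHPC path into the two values
# typed into the External Storage form (Share and Remote subfolder).
lamella_mount_fields() {
    path="$1"
    case "$path" in
        /project/*) share=project; sub="${path#/project/}" ;;
        /work/*)    share=work;    sub="${path#/work/}" ;;
        *) echo "unsupported path: $path" >&2; return 1 ;;
    esac
    printf 'Share: %s\n' "$share"
    printf 'Remote subfolder: %s\n' "$sub"
}

# Placeholder path taken from the guide text, not a real directory.
lamella_mount_fields /project/department/lab/s999999
```

Running the function on /work/department/shared would give the values for a department shared work directory in the same way.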
Mounting storage on your Windows PC or Mac
You can mount your home2, project and work space on your PC or Mac to access them directly, just like a local hard disk. This uses Samba shares, often known as ‘Network Drives’ on Windows, or ‘SMB shares’ on Mac.
IMPORTANT - If you use symlinks on Linux you should be aware that they behave differently when you mount your storage to Windows or Mac. Because Windows does not have the concept of symlinks, the server follows any symlink present on Linux and provides the actual file over the drive mount, not the link. This means that if you delete a symlink (to a file or folder) from Windows/Mac drive mount it may delete the actual files, not just the link itself.
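One cautious habit, before deleting anything through a mounted drive, is to list the symlinks in that directory from the Linux side so you know which entries are links. The sketch below uses the standard find command on a temporary directory that stands in for your own storage; the file names are made up for the demonstration.

```shell
# Temporary directory standing in for a real project or work folder.
demo_dir=$(mktemp -d)
echo "data" > "$demo_dir/real_file.txt"
ln -s "$demo_dir/real_file.txt" "$demo_dir/link_to_file"

# List only the symbolic links under the directory.
# On a Windows or Mac drive mount these would appear as ordinary files.
links=$(find "$demo_dir" -type l)
echo "$links"
```

On BioHPC you would replace the temporary directory with the path you intend to clean up.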
On Windows in the ‘Computer’ file browser you need to click the ‘Map Network Drive’ button on the toolbar.
Pick a drive letter which you want to map your storage as. Enter one of the following addresses to mount home2, project or work space. To mount home2 space you will replace <username> with your BioHPC username.
If you login to your PC with a username and password other than your BioHPC account then check the ‘Connect using different credentials’ box. Click ‘Finish’ and you’ll be prompted for a username and password. If the computer is not shared with others you might want to select the option to ‘Remember my credentials’ to avoid being prompted for your password each time you connect.
If your connection is successful the BioHPC space you connected to will open in an explorer window. It will also appear in ‘Computer’ as a drive. You can work with files on the mounted drive in the same way as if they were on a local hard disk. Note, however, that you must be on the campus network or connected to the UTSW VPN to obtain access.
To mount your BioHPC storage to your Mac open a finder window and then choose ‘Connect to Server’ from the ‘Go’ menu at the top of your screen. Enter one of the server addresses listed and click the ‘Connect’ button. To mount home2 space you will replace <username> with your BioHPC username.
You’ll be prompted to enter your BioHPC username and password, and have the option of saving the password to your keychain if the computer is not shared with others. Click ‘Connect’ and the BioHPC space you mounted will open in a new finder window. You can work directly with files in this space like you would on your local computer.
After a connection is made to lamella from OSX, you’ll find lamella.biohpc.swmed.edu listed in the sidebar of finder windows. For easier access to individual shares you can turn on desktop icons for the mounted drives:
Open a finder window and choose Finder->Preferences from the menu bar. Check the ‘Connected servers’ checkbox for ‘Show these items on the desktop’.
Transferring data using FTP
Using FTP for data transfer to/from BioHPC storage might be convenient if you have a very large amount of data to move or are working on the command line. FTP can be faster than Windows or Mac mounted shares, but you cannot directly work on files – you must download and upload between your computer and BioHPC.
To connect using FTP we recommend the ‘Filezilla’ client, which can be downloaded via the Software section of the portal.
Using your FTP client you will need to connect to lamella.biohpc.swmed.edu.
Use your regular BioHPC username and password for the FTP connection.
* Previous host lysosome.biohpc.swmed.edu continues to work from computers on the campus 10Gb network only. New users should always use lamella.biohpc.swmed.edu
Introduction to the compute cluster
Our compute cluster is called Nucleus, and has 148 nodes right now. It’s a heterogeneous cluster where the nodes have different specifications. At present there’s a mix of 128, 256 and 384GB nodes, plus 8 nodes with GPU cards. The cluster is running Red Hat Enterprise Linux 6, and uses the SLURM job scheduling software. Not by accident, this is the same basic setup as the TACC Stampede supercomputer in Austin.
To run programs on nucleus you must interact with the job scheduler, SLURM, and understand how to use software modules. The job scheduler allocates time on the cluster to users, queueing their jobs and running them when free time is available on a compute node. Jobs can be submitted to the scheduler manually via the command line, or more easily using the online submission tool on our web portal. Special visualization jobs can also be submitted via the portal, which allow you to connect to a graphical desktop from your local workstation, Windows PC or Mac. See below for instructions.
The modules system for software
A lot of different software is required by the groups who are members of BioHPC, and different users might need different versions etc. We use a system of ‘modules’ to provide a wide range of packages. On the cluster or clients and workstations you can load modules of software that you need. If you need additional software, or updated versions of existing software then please email us. If you are trying things out, and know how to, you can also install software into your home directory for your sole use.
Here’s an example of using software modules where we want to run the 'cufflinks' RNA-Seq analysis tool. From a command line on the compute cluster or a workstation we can run module list to see the software modules currently in use in our session. If we try to run the cufflinks command it fails, because the relevant module is not loaded. To run software the system must know the path of the program, and often the location of libraries and configuration. Each module provides this information for a particular software package. We can search for a cufflinks module with module avail cufflinks, and then load it into our session with module load cufflinks, or module load cufflinks/2.1.1 if we want a specific version. Now the output of module list shows the cufflinks module, as well as boost, a library which cufflinks depends on. We can now run the cufflinks command directly to use the software.
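To make the idea of a module "providing the path" more concrete, the sketch below imitates the effect by hand. It is an analogy rather than how the module command is implemented: demo_tool is a made-up program created in a temporary directory, and prepending that directory to PATH plays the role of loading a module.

```shell
# Stand-in for a software package installed outside the default PATH.
pkg_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from demo_tool"\n' > "$pkg_dir/demo_tool"
chmod +x "$pkg_dir/demo_tool"

# Before "loading": the shell cannot find the program by name.
if command -v demo_tool >/dev/null 2>&1; then
    before=found
else
    before=missing
fi

# "Loading": put the package directory at the front of PATH.
PATH="$pkg_dir:$PATH"
export PATH

# After "loading": the program runs by name.
after=$(demo_tool)
echo "before: $before, after: $after"
```

A real module typically sets more than PATH, for example library and configuration locations, which is why loading the module is preferable to editing PATH yourself.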
To use BioHPC software modules effectively familiarize yourself with the following commands, which load, unload, list, search for, and display help about modules. Remember that you can contact firstname.lastname@example.org if you are unsure about a module, which version of a module to use, or if you need additional software setup.
module list
Show loaded modules
module avail
Show available modules
module load <module name>
Load a module, setting paths and environment in the current session
module unload <module name>
Unload a module, removing it from the current session.
module help <module name>
Help notes for a module
module help
Help for the module command
Submitting a compute job via the portal
Once you have transferred data to BioHPC storage, the easiest way to submit a compute job is to use the Web Job Submission tool, which can be accessed from the BioHPC portal Cloud Services menu. This tool allows you to set up a job using a simple web form, automatically creating and submitting the job script to the SLURM scheduler on the cluster. Work through the form, filling in the fields according to the information below:
|Job Name||This is a name for your job, which will be visible in the output of the squeue command, and on the job list shown on the BioHPC portal. Use a short but descriptive name without spaces or special characters to identify your job.|
|Modules||When running a job you must load any modules that provide software packages you require. If you are going to run a matlab script you must load a matlab module. Click the Select Modules button to access the module list. You can select any combination of modules with checkboxes. Note, however that some combinations don't make sense (e.g. 2 versions of the same package) and may cause an error on submission.|
|STDOUT file||Any output your commands would normally print to the screen will be directed to this file when your job is run on the cluster. You will find the file within your home directory, under the portal_jobs subdirectory. You can use the code '%j' to include the numeric job ID in the filename.|
|STDERR file||Any errors your commands would normally print to the screen will be directed to this file when your job is run on the cluster. You will find the file within your home directory, under the portal_jobs subdirectory. You can use the code '%j' to include the numeric job ID in the filename.|
|Partition/Queue||The nucleus cluster contains nodes with different amounts of RAM, and some with GPU cards. The cluster is organized into partitions separating these nodes. You can choose either a specific RAM partition, the GPU partition if you need a GPU card, or you can use the super partition. super is an aggregate partition containing all 128, 256 and 384GB nodes which can be used when it's not important that your job runs on any single specific type of node.|
|Number of Nodes||The number of nodes required for your job. Programs do not automatically run on more than one node - they must use a parallel framework such as MPI to spread work over multiple nodes. Please review the SLURM training before attempting to run jobs across multiple nodes.|
|Memory Limit (GB)||Specifies the amount of RAM your job needs. The options will depend on the partition selected. Choose the lowest amount required in order that your job can be allocated on the widest range of nodes, reducing wait times.|
|Email me||The SLURM scheduler can send you an email when your job starts running, finishes etc. You can turn these emails off if you wish.|
|Time Limit||Try to estimate the amount of time your job needs, add a margin of safety and enter that time here. The scheduler relies on job time limits to efficiently fit smaller jobs in between larger ones. Jobs with shorter time limits will generally be scheduled more quickly. Beware - this is a hard limit. If your job takes longer than the limit entered it will be killed by the scheduler.|
The actual commands to run, such as 'matlab -nodisplay < hello.m' to run a matlab script, are entered in this section. You can have one or more command groups, each containing one or more commands. All of the commands in a group will be run at the same time, in parallel. The groups themselves run sequentially. Everything in the first group must finish before the second group begins. By default there is a single command in the first group, hostname. This simply prints the name of the compute node your job runs on. You can replace it with a real command, or add another command group for your own commands.
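The command group behaviour described above (parallel within a group, sequential between groups) can be approximated in plain shell with background jobs and wait. This is a sketch of the pattern only, using echo as a stand-in for real commands; the script the portal generates may be written differently.

```shell
# Log file used only to record the order in which things ran.
log=$(mktemp)

# Group 1: both commands are started in the background, so they run
# at the same time.
( echo "group 1 command a" >> "$log" ) &
( echo "group 1 command b" >> "$log" ) &

# Block until every command in group 1 has finished.
wait

# Group 2: starts only after group 1 is complete.
echo "group 2" >> "$log"

cat "$log"
```

The two group 1 lines may appear in either order, but the group 2 line is always last.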
Below the web form you will see the SLURM sbatch script that is being created from your choices. It is updated whenever you make a change in the form. The script contains comments beginning with #, and parameters for the scheduler beginning with #SBATCH. Setting up jobs using the web form, and then reviewing the script that is created, is a good way to learn the basics of SLURM job scripts. Note that you can even edit the script before you submit the job, but a script that has been edited cannot be further modified using the web form.
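As a rough idea of what such a script looks like, the sketch below writes a small example job script to a temporary location. The #SBATCH options used are standard sbatch options, but the values are assumptions chosen for illustration (the job name, file names, one-hour limit and email setting are made up), and the script generated by the portal may differ in detail.

```shell
# Where to write the example script (a temporary location for the demo).
job_script="${TMPDIR:-/tmp}/example_job.sh"

cat > "$job_script" <<'EOF'
#!/bin/bash
# Job Name field (hypothetical name)
#SBATCH --job-name=hello_matlab
# Partition/Queue field
#SBATCH --partition=super
# Number of Nodes field
#SBATCH --nodes=1
# Time Limit field: 1 hour, enforced as a hard limit
#SBATCH --time=0-01:00:00
# STDOUT and STDERR files; %j is replaced by the numeric job ID
#SBATCH --output=hello_matlab_%j.out
#SBATCH --error=hello_matlab_%j.err
# Email me field (assumed setting: mail when the job ends)
#SBATCH --mail-type=END

# Modules field: load the software the commands below need
module load matlab

# Job command
matlab -nodisplay < hello.m
EOF

echo "wrote $job_script"
```

On the cluster a script like this would be submitted with sbatch followed by the script path.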
When you are happy with the settings and commands for your job you can submit it using the button at the bottom of the page, below the script. If all settings are okay the portal will report that the job has been submitted and supply a Job ID. If there is a problem you may receive an error message from sbatch, which the portal will display. You can email email@example.com with any questions or problems you have submitting jobs.
BioHPC allows interactive use of cluster nodes, for visualization, debugging and running interactive software, through the portal's Web Visualization service. Using this facility you can run a Linux desktop session on a cluster node (webGUI), on a GPU node with 3D OpenGL acceleration (webGPU), or start a powerful Windows 7 virtual machine with 3D acceleration (webWinDCV). To start a session and connect to it use the Web Visualization link in the Cloud Services menu of the portal site:
The page that is displayed lists the connection information for any running visualization sessions. At the bottom is a form allowing you to start a new session. Choose the type of session you need and click the submit button to queue the visualization job on the cluster. All visualization jobs are limited to 20 hours. They run on cluster nodes and are managed by the SLURM scheduler, so it can take some time for them to start when the cluster queue is busy. Once a session has started you will see VNC/DCV connection details for your session, and a link to connect directly from your web browser. The screenshot below shows matlab running in a WebGPU session with a connection made from the web browser:
The web browser connections are convenient but a smoother response is possible by connecting using a VNC client, particularly if you are using a wired network connection on campus. Connection details are displayed for each running session. If you are not using a BioHPC client or workstation you can download the TurboVNC client using the links provided. TurboVNC is the recommended VNC client for best performance and compatibility with our sessions. To enlarge the TurboVNC display, increase the resolution of the session through the System->Preferences->Display menu.
When using a WebGPU session to run 3D visualization software you need to start programs with the vglrun command. Type vglrun in front of the command line you would usually use to start your software, e.g. vglrun paraview. This is necessary to ensure that the 3D images can be passed back to your VNC connection from the GPU card on the cluster GPU node.
The Windows Virtual Machine webWinDCV sessions perform best when a specific DCV client is used to connect. This offers far better performance for 3D graphics than a standard VNC connection. A download link is provided for you on the Web Visualization page.
Command line access via the portal and SSH
If you are comfortable using the Linux command line, or want to learn, you can log in to our systems using the Secure Shell (SSH) to the nucleus.biohpc.swmed.edu cluster head node. The head node of the cluster allows users to log in, manipulate and edit files, compile code and submit jobs. It should not be used to run analyses, as this will affect the response for others who are using the system.
The easiest way to log in to the cluster is to use the Cloud Services - Nucleus Web Terminal of the portal website. This provides a command line interface inside your web browser. You will need to enter your password when the connection is made. Note that if you close your browser, or browse to a different page, your connection will close.
To log in using a stand-alone SSH client, please connect to nucleus.biohpc.swmed.edu using your BioHPC username and password.
If you are using a Mac or Linux computer, you can use the ssh command in a terminal window - ssh <username>@nucleus.biohpc.swmed.edu
If you are using a Windows PC you will need to download an SSH client program. We recommend PuTTY, which is available via a link on the portal Software section.
Useful Linux Commands
The following commands are useful when working with Linux on BioHPC. See also the material from our Linux command line & scripting training session.
Show home directory and project directory quota and usage
panfs_quota -G /work
Show work directory quota and usage
du -sh <directory>
Show size of a specific directory and its contents
squeue
Show cluster job information
sinfo
Show cluster node status
sbatch <script file>
Submit a cluster batch job using a script file
cat <file>
Display a file on the screen
less <file>
Displays a file so that you can scroll up and down. ‘q’ or ctrl-c quits
vi / vim
Powerful text editors, with a cryptic set of commands! See http://www.webmonkey.com/2010/02/vi_tutorial_for_beginners
nano
Simpler, easier to use! See http://mintaka.sdsu.edu/reu/nano.html
Using the SLURM Job Scheduler
Earlier in this document, we described how to submit a job using the web job submission service on the BioHPC portal. You can also work with the cluster's SLURM job scheduler from the command line. If you are familiar with another job scheduler (e.g. PBS, SGE), SLURM is similar but with different names for the common commands.
A comprehensive training session on using the SLURM job scheduler is given every 3 months. Please check the BioHPC training calendar and the materials from past sessions for more information.
squeue - view the job queue - From any BioHPC system you can run the squeue command to display a list of jobs currently being managed by the scheduler. The job status is shown as a 1-2 letter code, and times are in Days-Hours:Mins:Seconds format.
A more complete output, including the full status and the time limit for each job, can be obtained using the -l option to squeue.
The list of jobs on the cluster can be long at times. To see only your own jobs use the -u <username> option, e.g. squeue -u dtrudgian
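If you want to compare or add up the times that squeue reports, it can help to convert the Days-Hours:Mins:Seconds form into plain seconds. The function below is a hypothetical helper, not part of SLURM, and it assumes the time is written as D-HH:MM:SS, HH:MM:SS or MM:SS.

```shell
# Hypothetical helper: convert a squeue-style time to seconds.
# The body runs in a subshell (note the parentheses) so that changing
# IFS and the positional parameters does not affect the caller.
slurm_time_to_seconds() (
    t="$1"
    days=0
    # Split off the optional "days-" prefix.
    case "$t" in
        *-*) days="${t%%-*}"; t="${t#*-}" ;;
    esac
    # Split the remainder on ":".
    IFS=':'
    set -- $t
    case "$#" in
        3) h="$1"; m="$2"; s="$3" ;;
        2) h=0;    m="$1"; s="$2" ;;
        *) exit 1 ;;   # unexpected format
    esac
    # Drop one leading zero so that fields such as "08" are not
    # rejected as invalid octal numbers by shell arithmetic.
    h="${h#0}"; m="${m#0}"; s="${s#0}"
    echo "$(( days * 86400 + ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))"
)

slurm_time_to_seconds 1-02:03:04   # prints 93784
```

The same function could be applied to each time in squeue output, for example to spot jobs that are approaching their time limit.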
sbatch - submit a job - If you are comfortable writing SLURM job scripts for your jobs you can submit them to the cluster using the sbatch command. In the sample below the script myjob.sh was created using the vi editor, then submitted using sbatch myjob.sh. The output of the sbatch command is a numeric job ID if successful, or any error message if there are problems with your job script. Once you have submitted a job you can see it in the cluster queue with the squeue command. In the example, the job is waiting to run with the reason given as (RESOURCES) as the scheduler is waiting for a node to become available:
When a node became available the job was executed successfully, and output messages were written into the file specified in the job script. In this case the file was named job_26948.out and we can view its content with the cat command, e.g. cat job_26948.out.
scancel - cancelling a submitted job - If you make a mistake when submitting a job, please cancel it so that it does not occupy time on the cluster. The scancel command takes a job ID as its only argument. In the example below we use the command scancel 26953 to stop our job running. We can check that it was cancelled correctly by examining the output of squeue - the job is no longer in the cluster queue. If you don't remember the job ID for a job you need to cancel, check the output of squeue -u <username>, which will list the details of all of your current jobs.
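When scripting around sbatch and scancel it is handy to capture the job ID from the text that sbatch prints. The sketch below works on example text rather than a live submission, so it can be tried anywhere. The function name is hypothetical, and it assumes sbatch prints either a bare number or a message ending in the number, such as "Submitted batch job 26953".

```shell
# Hypothetical helper: pull the numeric job ID out of sbatch output.
job_id_from_sbatch_output() {
    id="${1##* }"                  # keep the text after the last space
    case "$id" in
        ''|*[!0-9]*) return 1 ;;   # not a plain number
        *) printf '%s\n' "$id" ;;
    esac
}

# Example text standing in for real sbatch output.
job_id=$(job_id_from_sbatch_output "Submitted batch job 26953")

# Print the commands you would then run on the cluster.
echo "check the queue: squeue -u <username>"
echo "cancel the job:  scancel $job_id"
```

On the cluster, the example text would be replaced by the captured output of an actual sbatch command.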
Now that you have worked through this introduction to BioHPC you should experiment with our systems! Make sure you can successfully login to the portal, use the web based job submission and connect to a web visualization session.
We offer training sessions on our cloud storage system, the SLURM scheduler, and our clients and workstations in a 3-month rotation on the 2nd Wednesday of each month. Advanced topics are covered on the 3rd Wednesday of each month with a drop-in coffee session held on the 4th Wednesday. Please check the training calendar and make a note of any sessions that are applicable to your work.
If you have any questions, comments, or suggestions please contact us via email@example.com or use the 'Comment on this page' link above the menu bar.
Last updated Dec 8th, 2016, YC.