A Guide to Running AMBER at SDSC
(TERAGRID IA64 CLUSTER)



This page provides an end user guide to running the AMBER molecular dynamics simulation software on the various High Performance Computing (HPC) resources available at San Diego Supercomputer Center (SDSC).

Running on SDSC's Teragrid IA64 Cluster

SDSC's TeraGrid cluster system is composed of 262 IBM Itanium 2 (1.5GHz) nodes with two processors per node. Each node runs SuSE Linux, and the nodes are interconnected with Myricom's Myrinet network. The system has a peak performance of 3.1 Teraflops, a total memory of 1 Terabyte, and a total of 50 Terabytes of GPFS disk accessed through the fiber optic SAN network.

For the latest news on allocations, queues, resources, etc., please see the Teragrid Cluster Documentation.

If you have any specific questions relating to running AMBER at SDSC please contact consulting@sdsc.edu. General questions concerning AMBER should be directed to the AMBER mailing list (amber@scripps.edu).

Amber 9 Installation / Available Codes
The recommended version of AMBER to run on the Teragrid cluster is AMBER 9.

AMBER 9 is installed in /usr/local/apps/amber9

In here you will find an exe directory containing the executables and a dat directory containing the force field files. The main files that you will need to use on the Teragrid cluster are as follows:
 

Executable (alias) : Description

pmemd.MPI (pmemd) : Recommended executable for running Molecular Dynamics simulations in parallel on Teragrid architecture. Supports Particle Mesh Ewald (PME) and Generalized Born (GB) simulations.

pmemd.1cpu : Serial version of pmemd.MPI executable, use for single cpu runs.

sander.MPI (sander) : Dynamics engine similar to PMEMD but supports many more options. If you plan to run a simulation type that is not supported by pmemd then you should use this executable (e.g. QM/MM). Note: Parallel scaling will not be as efficient as pmemd. Test the performance for your chosen simulation before submitting long jobs.

sander.1cpu : Serial version of sander.MPI executable, use for single cpu runs.

sander.PIMD.MPI (sander.PIMD) : Supports Path Integral MD simulations and Nudged Elastic Band Simulations. Parallel version.

sander.PIMD.1cpu : Serial version of sander.PIMD.MPI executable, use for single cpu runs.

sander.LES.MPI (sander.LES) : Supports Locally Enhanced Sampling (LES) MD. Parallel version.

sander.LES.1cpu : Serial version of sander.LES.MPI executable, use for single cpu runs.

Other executables are present (e.g. nmode for normal mode analysis) but do not support parallel execution.

Amber 9 Performance and Scaling
The best performance and scaling will typically be obtained by using the PMEMD executable, so if your simulation falls within what PMEMD supports you should use it. The scaling behaviour depends strongly on the type and size of your job. Implicit solvent GB simulations typically scale better than explicit solvent PME simulations but often have far fewer residues, which limits the maximum number of cpus: for GB you need at least 1.01x as many residues as processors, and for PME at least 4.0x as many. For both GB and PME simulations you will generally find that the scaling improves as the number of atoms increases. This is a function of the underlying theory.
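As a worked example of the residue limits above, the following sketch estimates the largest cpu count a system can use. It assumes the ratios mean "at least 1.01 (GB) or 4.0 (PME) residues per processor"; the function name max_cpus and the method labels gb and pme are illustrative only and are not part of AMBER.

```shell
#!/bin/sh
# Sketch: largest cpu count allowed by the residue-to-processor ratios
# quoted in this guide (assumed to mean residues >= ratio * processors).
# Usage: max_cpus RESIDUES METHOD     (METHOD is "gb" or "pme")
max_cpus() {
    residues=$1
    case "$2" in
        # GB: 1.01 residues per processor, done in integer arithmetic
        gb)  echo $(( residues * 100 / 101 )) ;;
        # PME: 4.0 residues per processor
        pme) echo $(( residues / 4 )) ;;
        *)   echo "unknown method: $2" >&2; return 1 ;;
    esac
}
```

For instance, max_cpus 400 pme prints 100, so a 400 residue PME system could not use more than 100 cpus. This is only an upper bound; the cpu count that actually performs best is usually lower.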

The graph below shows the expected scaling for three PME simulations (Cellulose [408K atoms], JAC [23.5K atoms] and FactorIX [91K atoms]) and for a medium size GB simulation (gb_mb [2.49K atoms]):
 

[Figure: benchmark scaling plots, Ps/day and Speedup]

As you can see from the graph, all simulations have a region where the scaling is acceptable, after which it tails off. Caution: Going to very large numbers of cpus can often result in your job taking longer. The exact scaling you see will depend on the size and type of job you are running, so before burning too much cpu time you should test the scaling with the simulation you plan to run. Typically the optimum point on the Teragrid cluster is between 64 and 128 cpus, but if your simulation is small you may need to use fewer cpus.
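When testing the scaling of your own simulation, it can help to turn the measured throughput into a parallel efficiency. The sketch below assumes you have the ps/day figure from two short test runs on different cpu counts; the function name parallel_efficiency is illustrative, and the numbers in the usage note are made up rather than measured on the Teragrid cluster.

```shell
#!/bin/sh
# Sketch: parallel efficiency (in percent) of a run relative to a
# smaller baseline run, from the throughput of each run in ps/day.
# Usage: parallel_efficiency BASE_CPUS BASE_RATE CPUS RATE
parallel_efficiency() {
    awk -v c0="$1" -v r0="$2" -v c="$3" -v r="$4" \
        'BEGIN { printf "%.1f\n", (r / r0) / (c / c0) * 100 }'
}
```

For example, if a hypothetical job ran at 100 ps/day on 8 cpus and 150 ps/day on 16 cpus, parallel_efficiency 8 100 16 150 prints 75.0. An efficiency that keeps falling as cpus are added suggests you have passed the optimum point for that simulation.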

Required Environment Options

The Amber 9 installation in /usr/local/apps/amber9 was compiled with the IFORT v9.0.033 compiler and was linked against MKL8.0 and MPICH-GM-1.2.6. Since the libraries are dynamically accessed at runtime you must add the following to the .soft file in your home directory. If you don't have a .soft file you should create one (note the leading dot in the file name).

.soft
+intel-c-9.0.032-f-9.0.033
+mpich-gm-1.2.6-intel9032
+intel-mkl80
@teragrid

You also need to edit your .cshrc file and add: setenv AMBERHOME /usr/local/apps/amber9

To check that everything is set up correctly, log in to one of the teragrid login nodes and execute the following commands. You should see the response shown on the line after each command. If you get something different then your environment is not set up correctly and you should contact SDSC consulting.

>which ifort
 /usr/local/apps/intel/compiler9.0.032/bin/ifort
>which mpirun
 /usr/local/apps/mpich-gm-1.2.6-intel9032/bin/mpirun
>echo $AMBERHOME
 /usr/local/apps/amber9
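The checks above can also be scripted. The following is a minimal sketch only: the function names are illustrative, it uses the POSIX command -v in place of which, and it treats a path with and without a trailing slash as the same. The expected paths are the ones listed in this guide.

```shell
#!/bin/sh
# same_path EXPECTED ACTUAL: succeed if the two paths match,
# ignoring a single trailing slash on either one.
same_path() {
    [ "${1%/}" = "${2%/}" ]
}

# report LABEL EXPECTED ACTUAL: print OK or MISMATCH for one setting.
report() {
    if same_path "$2" "$3"; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (expected $2, got ${3:-nothing})"
        return 1
    fi
}

# check_amber_env: compare the current environment with the values
# this guide says you should see on a teragrid login node.
check_amber_env() {
    status=0
    report ifort /usr/local/apps/intel/compiler9.0.032/bin/ifort \
        "$(command -v ifort)" || status=1
    report mpirun /usr/local/apps/mpich-gm-1.2.6-intel9032/bin/mpirun \
        "$(command -v mpirun)" || status=1
    report AMBERHOME /usr/local/apps/amber9 "${AMBERHOME:-}" || status=1
    return $status
}
```

Calling check_amber_env after sourcing these functions prints one line per setting and returns a non-zero status if any of them differ.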

Example Job Submission Scripts
You are now ready to run the AMBER software via the queuing system. All you require are your mdin files, inpcrd/restart files and prmtop files. Note, you should copy these files to either /gpfs/mydir or /gpfs-wan/mydir and read and write everything there. When your job is done you can then copy the output files over to your local machine using scp.

The following is an example job submission script for a PMEMD run. The \'s act as line continuation characters; if you prefer, you can put all of the options on a single line:

pmemd_teragrid_8cpu.x (script, followed by an explanation of its options)
#SDSC Teragrid PBS Script
#PBS -j oe
#PBS -l nodes=4:ppn=2
#PBS -l walltime=0:30:00
#PBS -q dque
#PBS -V
#PBS -M myemail@myaddress.edu
#PBS -A accountcode
#PBS -N run_pmemd_8

cd /gpfs/mydir/amber_job
mpirun -v -machinefile $PBS_NODEFILE -np 8 \
                     /usr/local/apps/amber9/exe/pmemd \
                     -i mdinfile \
                     -o mdoutfile \
                     -c inpcrdfile \
                     -r restrtfile \
                     -p prmtopfile \
                     -x mdcrdfile
 
-j oe : Merge standard output and standard error into a single file

-l nodes=4:ppn=2 : Teragrid has 2 cpus per node so ppn should always be 2. Set nodes to how many nodes you want to use. E.g. for 128 cpus use nodes=64:ppn=2

-l walltime=0:30:00 : Set this to slightly longer than you think your job will take to run. The maximum is 18:00:00 (18 hours). Smaller values can get your jobs run sooner due to backfill opportunities.

-A accountcode : Make sure you replace accountcode with the account you want to be charged.

-N run_pmemd_8 : Identifier name for the job, choose something sensible.

cd /gpfs/mydir/amber_job : Change this to be the directory on gpfs where your input files are and where your output files should be written.

mpirun ..... -np 8 : Make sure you change the number here (8) to match the number of cpus you are requesting (nodes*ppn).
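One way to keep the -np value in step with the nodes and ppn request is to count the entries in the node file instead of typing the number by hand. This is a sketch under an assumption that this guide does not confirm for the Teragrid cluster: that PBS writes one line per requested cpu to $PBS_NODEFILE, so nodes=4:ppn=2 would give 8 lines. The function name count_cpus is illustrative.

```shell
#!/bin/sh
# count_cpus FILE: number of non-empty lines in a PBS node file.
# Assumes one line per requested cpu (an assumption, see above).
count_cpus() {
    grep -c . "$1"
}

# Inside a job script this could replace the hard-coded "-np 8":
#   NP=$(count_cpus "$PBS_NODEFILE")
#   mpirun -v -machinefile $PBS_NODEFILE -np $NP \
#          /usr/local/apps/amber9/exe/pmemd -i mdinfile -o mdoutfile \
#          -c inpcrdfile -r restrtfile -p prmtopfile -x mdcrdfile
```

It is worth printing the value of NP in a short test job first, to confirm that it matches nodes*ppn on this system.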

You can submit this job to the queue using qsub.

KNOWN LIMITATIONS
The targeted MD test cases segfault in parallel. If you are running targeted MD simulations you should test your system in serial and parallel before submitting large jobs.
