Serial (OpenMP only)
Using the serial version limits you to running this job on one node (12 cores on Oakley). Requesting more than 12 threads on Oakley will be counterproductive.
- Go into the work directory containing the job you want to run.
$ cd /path/to/warp/job
- Create the PBS batch script job.pbs in your work directory, as shown below.
#PBS -N any_name_for_your_job
#PBS -l walltime=01:00:00
#PBS -l nodes=1:ppn=12
#PBS -j oe
#PBS -q @oak-batch.osc.edu
#PBS -S /bin/bash
module load intel/14.0.0.080
cd $PBS_O_WORKDIR
~/bin/warp3d.omp 12 < input.wrp > output_file
Modify walltime=hh:mm:ss with an overestimate of the time needed to run the job. The 12 is the number of threads warp3d.omp will use (in this case all the cores on a single node on Oakley). Also, don't forget to change input.wrp to the correct input file read by Warp3D.
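For example, to request four hours while still using all 12 cores and reading a hypothetical input file my_model.wrp (the file names here are placeholders, not from the script above), those lines would become:
#PBS -l walltime=04:00:00
~/bin/warp3d.omp 12 < my_model.wrp > my_model_output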
- Run the batch job
$ qsub job.pbs
- Check on the status of the batch job
$ qstat -u <username>
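To see the full details of a single job (resources requested, state, queue), you can also pass the job ID printed by qsub to qstat:
$ qstat -f <jobid>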
Parallel (OpenMP + MPI)
Before running in parallel you will need to decide how many MPI ranks the job will use. This is usually the number of domains in the geometry.
The rank count in turn constrains the number of threads you can run per MPI rank on Oakley (12 cores per node); the combination must satisfy:
((ranks * threads) % 12 == 0) && (12 % threads == 0)
- The first statement gives you an integer number of total nodes.
- The second statement gives an integer number of MPI processes per node.
- Both statements need to be true for your job to run.
For example, if there are 4 domains in the geometry file (input.wrp) then you will be using 4 MPI ranks.
MPI Ranks = 4
- 3 threads per rank (1 node, 4 processes per node)
- 6 threads per rank (2 nodes, 2 processes per node) = example below
- 12 threads per rank (4 nodes, 1 process per node)
It is best to choose a number of threads per rank that mirrors the size of the domains (larger domains generally warrant more threads per rank).
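As a sanity check, here is a small shell sketch (the variable names are illustrative and not part of any batch script here) that applies the two conditions above for Oakley's 12-core nodes and prints the resulting node count and MPI processes per node:
#!/bin/bash
RANKS=4           # MPI ranks = number of domains in the geometry
THREADS=6         # OpenMP threads per rank
CORES_PER_NODE=12
if (( (RANKS * THREADS) % CORES_PER_NODE == 0 )) && (( CORES_PER_NODE % THREADS == 0 )); then
    NODES=$(( RANKS * THREADS / CORES_PER_NODE ))
    NPERNODE=$(( CORES_PER_NODE / THREADS ))
    echo "OK: request nodes=${NODES}:ppn=${CORES_PER_NODE} and run ${NPERNODE} MPI processes per node"
else
    echo "Invalid: ${RANKS} ranks x ${THREADS} threads does not pack evenly onto 12-core nodes"
fi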
Example:
- Go into the work directory containing the job you want to run.
$ cd /path/to/warp/job
- Create the PBS batch script job.pbs in your work directory, as shown below.
#PBS -N any_name_for_your_job
#PBS -l walltime=01:00:00
#PBS -l nodes=2:ppn=12
#PBS -j oe
#PBS -q @oak-batch.osc.edu
#PBS -S /bin/bash
module load intel/14.0.0.080
cd $PBS_O_WORKDIR
# Uses 4 domains (MPI ranks = 4) with 6 threads per rank
export OMP_NUM_THREADS=6                     # OpenMP threads per MPI rank
export NUM_PERHOST=$((12/$OMP_NUM_THREADS))  # MPI processes per node = 12 cores / threads per rank
# MVAPICH2 CPU affinity needs to be disabled for MPI+OpenMP runs so that all
# threads of a process are not pinned to the same CPU/core
export MV2_ENABLE_AFFINITY=0
echo "*input from 'input.wrp'" | mpiexec -npernode $NUM_PERHOST ~/bin/warp3d.mpi_omp > output_file
Note: mpiexec requires you to feed in a small command on stdin instead of a large file (see the Issues section below).
Issues
- The stdin limitation noted above: Doug has mentioned he won't fix it, as it is easier to just work around. The workaround is shown in the batch script above.
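For reference, a minimal sketch of that workaround, using the same placeholder file names as the batch script above: rather than redirecting the full input file into mpiexec, pipe a one-line *input command so Warp3D opens the file itself.
# Per the note above, redirecting the large input file on stdin does not work:
#   mpiexec -npernode $NUM_PERHOST ~/bin/warp3d.mpi_omp < input.wrp > output_file
# Workaround: echo a small *input command instead; Warp3D then reads input.wrp itself
echo "*input from 'input.wrp'" | mpiexec -npernode $NUM_PERHOST ~/bin/warp3d.mpi_omp > output_file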