* [Serial Run Instructions](#serial)
* [Parallel Run Instructions](#parallel)

### <a name="serial"></a>Serial (OpenMP only)

Using the serial version limits you to running the job on a single node (12 cores per node on Oakley). Requesting more than 12 threads on Oakley will be counterproductive.

1. Go into your work directory with the job you want to run.

```bash
$ cd /path/to/warp/job
```
2. Create the PBS batch script `job.pbs` in your work directory as shown below.

```bash
#PBS -N any_name_for_your_job
#PBS -l walltime=01:00:00
#PBS -l nodes=1:ppn=12
#PBS -j oe
#PBS -q @oak-batch.osc.edu
#PBS -S /bin/bash

module load intel/14.0.0.080

cd $PBS_O_WORKDIR

# 12 = number of OpenMP threads (all 12 cores on one Oakley node)
~/bin/warp3d.omp 12 < input.wrp > output_file
```

Modify `walltime=hh:mm:ss` with an overestimate of the time needed to run the job. The `12` is the number of threads `warp3d.omp` will use (in this case all the cores on a single Oakley node). Also change `input.wrp` to the correct input file for WARP3D to read.
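
For instance, here is a sketch of the only lines you would edit for a hypothetical job expected to take about three hours and driven by an input file named `plate.wrp` (both the time and the file name are made up for illustration):

```bash
#PBS -l walltime=04:00:00                        # overestimate of the ~3 hour run

~/bin/warp3d.omp 12 < plate.wrp > plate_output
```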
3. Run the batch job

```bash
$ qsub job.pbs
```
4. Check on the status of the batch job

```bash
$ qstat -u <username>
```
### <a name="parallel"></a>Parallel (OpenMP + MPI)

Before running in parallel you need to decide how many MPI ranks the job will use. This is usually equal to the number of domains in the geometry.

This in turn constrains the number of threads you can use per MPI rank on Oakley (12 cores per node); both of the conditions below must hold:

```c
((ranks * threads) % 12 == 0) && (12 % threads == 0)
```
* The first condition ensures the total core count maps to a whole number of nodes.
* The second condition ensures a whole number of MPI processes per node.
* **Both** conditions need to be true for your job to run (a small sanity check is sketched after this list).
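
As a quick way to test a candidate layout, here is a minimal bash sketch (a hypothetical helper, not part of WARP3D or the OSC environment; the script name and variable names are made up) that takes the number of ranks and threads per rank and prints the resulting node layout when both conditions hold:

```bash
#!/bin/bash
# Usage: bash check_layout.sh <ranks> <threads>   (hypothetical helper)
ranks=$1
threads=$2
cores_per_node=12   # Oakley

# Condition 1: ranks * threads must fill a whole number of nodes
if (( (ranks * threads) % cores_per_node != 0 )); then
    echo "ranks * threads must be a multiple of $cores_per_node"; exit 1
fi

# Condition 2: threads must divide the cores on a node evenly
if (( cores_per_node % threads != 0 )); then
    echo "threads must divide $cores_per_node evenly"; exit 1
fi

echo "nodes      = $(( ranks * threads / cores_per_node ))"
echo "ppn        = $cores_per_node"
echo "npernode   = $(( cores_per_node / threads ))"    # MPI processes per node
```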
For example, if there are 4 **domains** in the geometry file (`input.wrp`) then you will use 4 MPI ranks.

MPI Ranks = 4

* 3 threads per rank (1 node, 4 processes per node)
* 6 threads per rank (2 nodes, 2 processes per node) (used in the example below)
* 12 threads per rank (4 nodes, 1 process per node)

It is best to choose a number of threads per rank that reflects the size of the domains.
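
Running the hypothetical checker sketched above with 4 ranks and 6 threads per rank reproduces the second option, and gives the `nodes=2:ppn=12` request and the 2-processes-per-node value used in the batch script below:

```bash
$ bash check_layout.sh 4 6
nodes      = 2
ppn        = 12
npernode   = 2
```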
Example:

1. Go into your work directory with the job you want to run.

```bash
$ cd /path/to/warp/job
```
2. Create the PBS batch script `job.pbs` in your work directory as shown below.

```bash
#PBS -N any_name_for_your_job
#PBS -l walltime=01:00:00
#PBS -l nodes=2:ppn=12
#PBS -j oe
#PBS -q @oak-batch.osc.edu
#PBS -S /bin/bash

module load intel/14.0.0.080

cd $PBS_O_WORKDIR

# Uses 4 domains (MPI ranks = 4) with 6 threads per rank
export OMP_NUM_THREADS=6

# MPI processes (ranks) per node = 12 cores / threads per rank
export NUM_PERHOST=$((12/$OMP_NUM_THREADS))

# Needs to be disabled for MPI+OpenMP so that all threads of a process
# are not scheduled to run on the same CPU/core
export MV2_ENABLE_AFFINITY=0

echo "*input from 'input.wrp'" | mpiexec -npernode $NUM_PERHOST ~/bin/warp3d.mpi_omp > output_file
```

Note: `mpiexec` requires you to feed in a small command from `stdin` instead of the large input file itself ([see below](#issue-bigfile)).
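
The point of the `echo ... | mpiexec` line is that only a tiny one-line command goes through `stdin`; WARP3D then opens `input.wrp` itself. A hypothetical equivalent (the helper file name is made up here) writes that one-line command to a small file and redirects it, rather than piping:

```bash
# Same effect as the echo | mpiexec pipeline above: the tiny helper file,
# not the large model file, is what mpiexec reads from stdin.
echo "*input from 'input.wrp'" > run_warp.inp
mpiexec -npernode $NUM_PERHOST ~/bin/warp3d.mpi_omp < run_warp.inp > output_file
```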
3. Run the batch job as in the serial instructions above.

4. Check on the status of the batch job as in the serial instructions above.

### Issues

1. <a name="issue-bigfile"></a>`mpiexec` has a file size limit of exactly 377 KiB from `stdin`.
    * Doug has mentioned he won't fix this, as it is easier to just work around it. The workaround is shown in the batch script above.
\ No newline at end of file |