|
|
## sacct
|
|
|
```shell
-a # --allusers: show jobs for all users
-b # --brief: short listing (job ID, state, exit code)
-g # --group: filter by group
-N # --nodelist: filter by node(s)
-s # --state: filter by job state, e.g. PD=pending R=running CD=completed
```
|
|
|
example: show all PENDING jobs for stat-grad
|
|
|
```shell
sacct -a -g stat-grad -s PD
```
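The same flags pair well with sacct's machine-readable output for quick summaries. As a sketch, the sample lines below stand in for real `sacct -a --parsable2 --noheader --format=User,State` output; on a live cluster, pipe sacct straight into the awk instead:

```shell
# Count jobs per state from pipe-delimited User|State rows
# (sample data in place of a real sacct query).
printf '%s\n' "alice|PENDING" "bob|RUNNING" "carol|PENDING" |
  awk -F'|' '{count[$2]++} END {for (s in count) print s, count[s]}' | sort
```

This prints one line per state with its job count.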
|
|
|
|
|
|
## sacctmgr
|
|
|
```shell
list users   # show a list of users and their default accounts
list account # show a list of accounts/groups
```
|
|
|
for a list of all users with their default accounts
|
|
|
```shell
sacctmgr list users format=user%-20,DefaultAccount%-30
```
|
|
|
for a single user
|
|
|
```shell
sacctmgr list user $(whoami) format=user%-20,DefaultAccount%-30
```
|
|
|
list all groups
|
|
|
```shell
sacctmgr list account format=account%-40,description%-40
```
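For scripting, sacctmgr's `--parsable2 --noheader` flags give pipe-delimited rows instead of padded columns. The sample rows below imitate `sacctmgr list account --parsable2 --noheader` output (Account|Descr|Org) rather than querying a live cluster:

```shell
# Keep only the account-name column from parsable sacctmgr output
# (sample data stands in for the real command).
printf '%s\n' "stat-grad|stat graduate students|stat" \
              "stat-users|stat faculty|stat" |
  cut -d'|' -f1
```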
|
|
|
show users in a given group (account)
|
|
|
```shell
sacctmgr show account domain-users -s format=user%-20,account%-30
```
|
|
|
Adding users
|
|
|
```shell
sacctmgr add user name.n account=group-name # add -i to answer yes to prompts (immediate)
```
|
|
|
removing users
|
|
|
```shell
sacctmgr remove user name.n account=group-name
```
|
|
|
#### Changing accounts and Default accounts
|
|
|
##### add the account to the user
|
|
|
```shell
sudo sacctmgr add user name.n account=new-group
```
|
|
|
##### modify the Default account
|
|
|
```shell
sudo sacctmgr modify user where user=name.n set defaultaccount=new-group
```
|
|
|
##### (optional) Remove the old entry
|
|
|
```shell
sacctmgr remove user where user=name.n and account=old-group
```
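Strung together, the three steps look like the sketch below. The helper name is hypothetical and it only prints the commands (a dry run with placeholder names), so nothing is changed until you run them yourself:

```shell
# Dry-run helper: print the three sacctmgr steps for moving a user
# to a new default account, without executing anything.
plan_account_move() {
  local user=$1 old=$2 new=$3
  echo "sacctmgr add user $user account=$new"
  echo "sacctmgr modify user where user=$user set defaultaccount=$new"
  echo "sacctmgr remove user where user=$user and account=$old"
}

plan_account_move name.n old-group new-group
```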
|
|
|
##### Running a job as a different account
|
|
|
```shell
sinteractive -p stat -A stat-users
```
|
|
|
- specify both the partition (`-p`) and the account (`-A`) and it should work.
|
|
|
|
|
|
## scontrol
|
|
|
### extending wall time
|
|
|
```shell
sudo scontrol update jobid=****** TimeLimit=21-00:00:00 # Days-Hours:Minutes:Seconds
```
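TimeLimit uses Slurm's days-hours:minutes:seconds notation. A tiny helper (hypothetical, just for illustration) turns a plain hour count into that form:

```shell
# Convert a total number of hours into Slurm's D-HH:MM:SS format.
hours_to_timelimit() {
  printf '%d-%02d:00:00\n' $(( $1 / 24 )) $(( $1 % 24 ))
}

hours_to_timelimit 504   # 21 days
```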
|
|
|
|
|
|
## running jobs
|
|
|
### sinteractive
|
|
|
```shell
-p # specify the partition to run on
```
|
|
|
|
|
|
## checking running job status
|
|
|
#### squeue
|
|
|
```shell
-r # --array: show each array task on its own line
-u # --user: filter by user
-A # --account: filter by account
-w # --nodelist: filter by node or nodelist
-t # --states: filter by state code
```
|
|
|
##### [[Slurm Job State Codes]]
|
|
|
```shell
watch squeue -u $(whoami)
```
|
|
|
##### watch accepts -n to specify the interval; defaults to 2.0 seconds
|
|
|
```shell
watch -n 1 squeue -u $(whoami)
```
|
|
|
##### see currently running jobs
|
|
|
```shell
squeue -t R
```
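With `--noheader` and a format string, the running-job list is easy to summarize in a script. The sample lines below stand in for `squeue -t R --noheader -o '%P %u'` output (partition, then user):

```shell
# Count running jobs per partition from "partition user" rows
# (sample data in place of a live squeue query).
printf '%s\n' "stat alice" "stat bob" "batch carol" |
  awk '{n[$1]++} END {for (p in n) print p, n[p]}' | sort
```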
|
|
|
|
|
|
|
|
|
#### Job Efficiency
|
|
|
```shell
seff $jobID
```
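seff prints a human-readable report, so pulling one figure out in a script means matching the line you need. The sample text below imitates a typical report (the exact wording can vary between seff versions):

```shell
# Extract the CPU efficiency percentage from seff-style report text
# (sample data stands in for a real `seff $jobID` run).
printf '%s\n' "Job ID: 123456" \
              "CPU Efficiency: 87.50% of 1-00:00:00 core-walltime" \
              "Memory Efficiency: 40.00% of 16.00 GB" |
  awk -F': ' '/CPU Efficiency/ {print $2}' | cut -d' ' -f1
```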
|
|
|
|
|
|
#### node health
|
|
|
##### show partition states
|
|
|
```shell
sinfo -p batch
```
|
|
|
##### show the reason a node is in a given state
|
|
|
```shell
sinfo -n u141 --list-reasons
```
|
|
|
```shell
sinfo -n u141 -N -a -l
```
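For a scripted health check, ask sinfo for just the node and its state and flag anything unusual. The sample lines imitate `sinfo -N --noheader -o '%N %t'` output; the "healthy" state list here is illustrative:

```shell
# Print nodes that are not in a normal state ("node state" rows;
# sample data in place of a live sinfo query).
printf '%s\n' "u141 drain" "u142 idle" "u143 down" "u144 alloc" |
  awk '$2 != "idle" && $2 != "alloc" && $2 != "mix" {print $1, $2}'
```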
|
|
|
#### sinfo
|
|
|
```shell
-a # --all: display information about all partitions
-n # --nodes=<nodes>
-p # --partition=<partition>
-N # --Node: print information in a node-oriented format
-l # --long: print more detailed information
-s # --summarize
```
|
|
|
```shell
sinfo -n u021 -N -a -l
```
|
|
|
|
|
|
## global configuration of defaults
|
|
|
```shell
scontrol show config
```
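The output is hundreds of `Name = Value` lines, so filter for the parameter you care about. The sample below imitates that layout rather than running scontrol:

```shell
# Pull a single parameter value out of `scontrol show config`-style output
# (sample data stands in for the real command).
printf '%s\n' "MaxJobCount             = 10000" \
              "MaxArraySize            = 1001" |
  awk -F'= ' '/MaxArraySize/ {print $2}'
```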
|
|