The easiest way to check this is to open another shell on the same compute node (either by ssh’ing into it or, in an interactive qsub session, by suspending your application, e.g. R, with CTRL-z) and then run either ps or top.
$ ps auxww | grep <program-name>
$ top
In top, the column of interest is RES (resident memory). In this example, each YEPNEE.exe process is consuming ~600 MB of memory.
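For a scriptable version of the same check, the resident memory that top shows as RES can also be read from /proc. This is a minimal sketch, assuming a Linux node with /proc mounted; the function name rss_kb is hypothetical and not part of any cluster tooling.

```shell
# Print the resident set size (in kB) of one process.
# Assumption: Linux /proc is available; rss_kb is an illustrative name.
rss_kb() {
    # VmRSS in /proc/<pid>/status is the process's resident memory in kB.
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# Example: resident memory of the current shell.
rss_kb "$$"
```

To inspect your own job, pass a PID taken from the ps output in place of "$$".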
The memory available on each compute node varies; you can run “free -m” to check the total available physical memory. In the example below, the compute node has 36 GB of physical memory:
$ free -m
             total       used       free     shared    buffers     cached
Mem:         36134       4396      31738          0        293       2473
-/+ buffers/cache:       1629      34505
Swap:        16383         38      16345
Alternatively, you could also run the following:
$ cat /proc/meminfo | grep MemTotal
MemTotal:       37002052 kB
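If you want the total in MB inside a script, the MemTotal line can be converted directly. The sketch below assumes a Linux /proc/meminfo; mem_total_mb is a hypothetical helper name, and the optional argument (a saved copy of the file) is only there for convenience.

```shell
# Print total physical memory in whole MB.
# Assumption: mem_total_mb is an illustrative name; with no argument it
# reads /proc/meminfo, otherwise it reads the file you pass in.
mem_total_mb() {
    # MemTotal is reported in kB; divide by 1024 and drop the fraction.
    awk '/^MemTotal:/ {print int($2 / 1024)}' "${1:-/proc/meminfo}"
}

mem_total_mb
```

For the MemTotal value of 37002052 kB shown above, this works out to 36134 MB, the same figure that free -m reported.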
If you launch your program using “/usr/bin/time”, it will provide statistics about the resources used by the job. For example:
$ /usr/bin/time -v echo "test"
test
    Command being timed: "echo test"
    User time (seconds): 0.00
    System time (seconds): 0.00
    Percent of CPU this job got: 0%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.01
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 2400
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 196
    Voluntary context switches: 1
    Involuntary context switches: 1
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
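For memory sizing, the line that matters in this report is "Maximum resident set size". The following is a small sketch of pulling just that number out of a saved report; it assumes GNU time (which accepts -o to write the report to a file), and max_rss_kb, my_program, and time.log are hypothetical names.

```shell
# Extract the peak resident set size (kB) from `/usr/bin/time -v` output.
# Reads the file(s) given as arguments, or standard input if none.
# Assumption: max_rss_kb is an illustrative name, not a standard tool.
max_rss_kb() {
    awk -F': ' '/Maximum resident set size/ {print $2}' "$@"
}

# Typical use (my_program and time.log are placeholders):
#   /usr/bin/time -v -o time.log ./my_program
#   max_rss_kb time.log

# Self-contained demonstration on one line of sample output:
printf '\tMaximum resident set size (kbytes): 2400\n' | max_rss_kb
```

Writing the report to a file with -o keeps it separate from anything your program prints to standard error.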
Slurm keeps track of the memory footprint of a job. After the job completes, you can run sacct to get that info. Unfortunately, the default output from sacct is not very useful, and sacct -l is very verbose. We recommend setting this environment variable to customize the output:
export SACCT_FORMAT="JobID%-20,JobName,User,Partition,NodeList,Elapsed,State,ExitCode,MaxRSS,AllocTRES%32"
The value to look at is MaxRSS, the maximum resident set size recorded across the job's tasks.
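sacct typically prints MaxRSS with a unit suffix such as K, M, or G, which makes values awkward to compare against a memory request. The sketch below converts such a value to MB; maxrss_to_mb is a hypothetical helper, and treating a value with no suffix as bytes is an assumption you should verify on your system.

```shell
# Convert a sacct MaxRSS value (e.g. 614400K, 1.5M, 2G) to MB.
# Assumptions: maxrss_to_mb is an illustrative name; a value with no
# unit suffix is treated as bytes.
maxrss_to_mb() {
    awk -v v="$1" 'BEGIN {
        n = v + 0                              # leading numeric part
        u = toupper(substr(v, length(v), 1))   # trailing unit letter
        if (u == "K")      mb = n / 1024
        else if (u == "M") mb = n
        else if (u == "G") mb = n * 1024
        else               mb = n / 1048576    # assumed to be bytes
        printf "%.1f\n", mb
    }'
}

# Typical source of the value (<jobid> is a placeholder):
#   sacct -n -P -j <jobid> --format=JobID,MaxRSS

maxrss_to_mb 614400K   # prints 600.0
```

Comparing the result with the memory you requested for the job shows how much headroom the request had.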