Cheat Sheet for various Linux/HPC Tools

Originally written for a class on High Performance Computing at NYU in the fall of 2010.

Notation

What         Meaning
ALL CAPS     Replace with an actual value. For example, where it says "FILE_NAME", type a file name, not FILE_NAME literally.
[optional]   Things in brackets can be omitted.

gdb

Manual

Compiling for gdb

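Compiler flag -g: Add debug info. Optimization flags (-O and higher) may reduce the usefulness of debug info, because the compiler can reorder code and eliminate variables. For the most faithful line-by-line debugging, compile with -O0 (or -Og on GCC).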

Starting gdb

What                      Meaning
gdb PROGRAM               start PROGRAM under gdb
ulimit -c unlimited       enable core dumps (run in the shell before starting the program)
gdb PROGRAM core          start from core dump
gdb PROGRAM PROCESS_ID    attach to running process
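
When changing the shell limit is awkward (for example under a batch scheduler), a program can raise its own core-size limit at startup. The following is a minimal sketch assuming a POSIX system with getrlimit/setrlimit. The name enable_core_dumps is made up for illustration. It can only raise the soft limit as far as the hard limit allows.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper (not part of gdb): raise the soft core-file size
 * limit to the hard limit. This matches `ulimit -c unlimited` only when
 * the hard limit itself is unlimited. Returns 0 on success, -1 on error. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }
    lim.rlim_cur = lim.rlim_max;   /* the soft limit may go up to the hard limit */
    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}
```

Call enable_core_dumps() early in main. Whether and where a core file is actually written still depends on system settings, so treat this as a starting point.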

Using gdb

Command                                           Meaning
r                                                 run program
bt                                                backtrace
up/down/frame N                                   go up/down in call stack, or select frame N
n                                                 step over (next line, without entering calls)
s                                                 step into
fin                                               return from current subroutine
Ctrl-X Ctrl-A                                     switch to full-screen user interface
b [FILE_NAME:]LINE_NUMBER [thread N] [if COND]    set breakpoint at a line
b FUNCTION_NAME [thread N] [if COND]              set breakpoint at a function

gdb with OpenMP/threads

Command         Meaning
info threads    show list of threads
thread N        switch threads

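Advance four threads in lock-step by defining the command below (at the gdb prompt or in ~/.gdbinit) and then typing adv4. By default gdb lets the other threads run whenever one thread steps; set scheduler-locking on makes n move only the current thread.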


define adv4
  thread 1
  n
  thread 2
  n
  thread 3
  n
  thread 4
  n
end
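
To try adv4, it helps to have a small program that really runs on four threads. The function below is a hypothetical practice target, not part of the original notes. It assumes a compiler with OpenMP support (for example cc -g -fopenmp). Without that flag the pragma is ignored and the loop runs serially.

```c
/* Hypothetical practice target for the adv4 command above.
 * Built with OpenMP enabled, the loop is split across four threads,
 * so `info threads` in gdb would typically list four threads while
 * execution is inside the loop. */
long parallel_sum(long n)
{
    long sum = 0;

    #pragma omp parallel for num_threads(4) reduction(+:sum)
    for (long i = 0; i < n; i++) {
        sum += i;          /* set a breakpoint on this line, then try adv4 */
    }
    return sum;            /* sum of 0 .. n-1 */
}
```

Call parallel_sum from main with a large n, set a breakpoint inside the loop, and use info threads to check which thread numbers exist before running adv4.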

gdb for MPI

Insert this snippet into your program:

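{
  /* volatile keeps the compiler from optimizing the wait loop away */
  volatile int i = 0;
  char hostname[256];
  gethostname(hostname, sizeof(hostname));
  printf("PID %d on %s ready for attach\n", getpid(), hostname);
  fflush(stdout);
  /* wait here until a debugger attaches and changes i */
  while (0 == i) sleep(5);
}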

You might also need these headers:

#include <unistd.h>
#include <stdio.h>

Then run gdb PROGRAM PID on the host named in the output, where PID is the process ID the program printed. You will probably catch the program inside the kernel call for sleep. Type fin until you are back in the sleep loop, then type set var i = 7 to leave the loop. Then debug as usual. To stop only one misbehaving rank, put if (rank == N) in front of the snippet.
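
The snippet and the rank check can be packaged as one function. This is a sketch under assumptions: the name wait_for_attach and its two parameters are invented here, and the caller is expected to pass its own rank (for instance the value obtained from MPI_Comm_rank). The function itself does not call MPI, so it compiles without an MPI installation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose gethostname under strict -std modes */
#endif
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper wrapping the attach snippet above.
 * my_rank:     the caller's rank
 * target_rank: the single rank that should wait for a debugger
 * Returns 0 immediately on all other ranks, and 1 on the target rank
 * once a debugger has changed i. */
int wait_for_attach(int my_rank, int target_rank)
{
    volatile int i = 0;   /* volatile: gdb changes it from outside the program */
    char hostname[256];

    if (my_rank != target_rank)
        return 0;

    gethostname(hostname, sizeof(hostname));
    printf("PID %d on %s ready for attach\n", (int)getpid(), hostname);
    fflush(stdout);
    while (0 == i)
        sleep(5);         /* in gdb: set var i = 7 to fall through */
    return 1;
}
```

In an MPI program, a call such as wait_for_attach(rank, N) right after the rank is known would pause only rank N and let the others continue until they block on communication.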

Source, more info

Valgrind

Command                                                 Meaning
valgrind PROGRAM                                        run memcheck (the default tool): check for heap pointer bugs
valgrind --leak-check=full PROGRAM                      memcheck with leak tracking
valgrind --db-attach=yes PROGRAM                        memcheck, ask whether to drop into gdb at each error site (older Valgrind only; releases from 3.11 on removed this option in favor of the built-in gdbserver, see --vgdb-error)
valgrind --tool=cachegrind PROGRAM                      simulate cache behavior
kcachegrind cachegrind.out.PID                          view per-function profile from cachegrind output
valgrind --tool=callgrind --cache-sim=yes PROGRAM       gather cache info and call graph info
kcachegrind callgrind.out.PID                           view per-function profile from callgrind output
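
A deliberately buggy function is a convenient way to see what a memcheck report looks like. The sketch below is a hypothetical practice target. The name leaky_sum is made up, and the leak is intentional. It assumes compilation with -g and without optimization, so that the allocation is not optimized away.

```c
#include <stdlib.h>

/* Hypothetical practice target: sums 0 .. n-1 through a heap buffer
 * and deliberately never frees it, so that
 * `valgrind --leak-check=full` has a leak to report.
 * Returns -1 if the allocation fails. */
long leaky_sum(size_t n)
{
    long *buf = malloc(n * sizeof *buf);
    long sum = 0;

    if (buf == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        buf[i] = (long)i;
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;            /* BUG (intentional): buf is never freed */
}
```

Call leaky_sum from main, print the result, and run the program under valgrind --leak-check=full. The report should point at the malloc line as the origin of the lost block.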

Manual

GProf

What                                             Meaning
cc -pg -o my-program my-program.c                compile with instrumentation for gprof
./my-program                                     run the program; it writes gmon.out
gprof ./my-program gmon.out [FURTHER OPTIONS]    examine profiler output
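
gprof output is easiest to read when the program has an obvious hot spot. The two functions below are a hypothetical workload for my-program.c. The names slow_count and fast_sum are invented for illustration. They assume compilation without aggressive optimization, so that neither function is inlined out of the profile.

```c
/* Hypothetical workload for trying out gprof: two functions with very
 * different costs, so one of them should dominate the flat profile. */

/* O(n^2): counts pairs (i, j) with i < j whose sum is even. */
long slow_count(long n)
{
    long count = 0;

    for (long i = 0; i < n; i++)
        for (long j = i + 1; j < n; j++)
            if ((i + j) % 2 == 0)
                count++;
    return count;
}

/* O(n): sums 0 .. n-1. */
long fast_sum(long n)
{
    long sum = 0;

    for (long i = 0; i < n; i++)
        sum += i;
    return sum;
}
```

Call both from main with the same n (a value in the tens of thousands keeps the run short) and print the results so the calls are not discarded. The flat profile should then attribute most of the time to slow_count.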

Manual