GEM Overview
ISP can be used by anyone who can write simple MPI C/C++ programs, and it requires no special training. ISP allows you to formally verify your MPI C/C++ programs automatically, without any extra effort on your part (apart from compiling and making your examples), and flags errors such as deadlocks and resource leaks.
In addition, it helps you understand, and step through, all relevant process interleavings (schedules). Notice our use of the word relevant: even a short MPI program may have an overwhelming ("exponential") number of interleavings. For example, an MPI program with five processes, each containing five MPI calls, can have well in excess of 1000 interleavings. However, ISP generates a new interleaving only when it represents a truly new (as yet unexamined) behavior of your program.
ISP works by intercepting the MPI calls made by the target program and deciding when to send these MPI calls to the MPI library. This is accomplished by the two main components of ISP: the Profiler and the Scheduler. An overview of ISP's components and their interaction with the program and the MPI library is provided in Figure 1 below.
The interception of MPI calls is accomplished by compiling the ISP Profiler together with the target program's source code. The Profiler makes use of MPI's profiling mechanism (PMPI): it provides its own version of MPI_f (replacing the original) for each MPI function f. Within each such MPI_f, the Profiler communicates with the Scheduler over TCP sockets, sending information about the MPI call the process wants to make, and then waits for the Scheduler to decide whether to send the call to the MPI library or to postpone it until later. When the Scheduler gives permission to fire f, the corresponding PMPI_f is issued to the MPI runtime. Since every MPI library provides a PMPI_f for each MPI function f, this approach gives a portable and lightweight instrumentation mechanism for the MPI programs being verified [6].
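As a rough illustration of this mechanism, the sketch below shows what a PMPI-based wrapper for a single MPI function might look like. It is only a sketch of the interception idea, not ISP's actual Profiler code, which additionally reports each call to the Scheduler over a TCP socket and waits for permission before forwarding it.

```c
/* Minimal sketch of a PMPI-based wrapper (not ISP's actual Profiler code). */
#include <mpi.h>
#include <stdio.h>

int MPI_Barrier(MPI_Comm comm)
{
    /* Point at which the call is intercepted; the real Profiler would
     * report it to the Scheduler here and block until allowed to proceed. */
    fprintf(stderr, "intercepted MPI_Barrier\n");

    /* Forward the call to the MPI library through its PMPI_ entry point. */
    return PMPI_Barrier(comm);
}
```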
The ISP Scheduler helps carry out the POE algorithm, which is based on exploiting MPI's out-of-order completion semantics. POE stands for "POR with out-of-order and elusive interleavings". MPI programs suffer from "elusive" interleavings in the presence of MPI wildcard receives, R(*): issuing the MPI sends in different orders does not help, because the MPI runtime provides no guarantees based on when a send is actually issued. POE therefore employs dynamic source rewriting in place of the wildcard, so that every interleaving explored by the POE algorithm is deterministic. In addition, the presence of collectives such as MPI_Barrier can cause traditional runtime verification techniques such as Dynamic Partial Order Reduction to fail, since it is not always possible to issue instructions in different orders in the presence of collectives. POE addresses this issue by creating "match sets".
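For concreteness, here is a small self-contained program (a hypothetical example, not one of ISP's shipped tests) exhibiting exactly this kind of elusive interleaving: the wildcard receives in rank 0 may match the send from rank 1 or from rank 2 first, and nothing in the program text controls which.

```c
/* Run with at least three processes, e.g. mpiexec -n 3 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, buf, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1 || rank == 2) {
        /* Ranks 1 and 2 both send their rank number to rank 0. */
        buf = rank;
        MPI_Send(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {
        for (i = 0; i < 2; i++) {
            /* Wildcard receive R(*): either sender may be matched first.
             * POE rewrites MPI_ANY_SOURCE to each feasible source in turn
             * and explores the resulting deterministic interleavings. */
            MPI_Recv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 0 received a message from rank %d\n", buf);
        }
    }

    MPI_Finalize();
    return 0;
}
```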
The POE algorithm works by creating a graph structure whose nodes are the MPI operations of the processes and by adding Intra-Completes-Before (IntraCB) edges between these nodes. As the name suggests, IntraCB edges are only added between MPI operations within the same MPI process, based on the MPI semantics of those operations. MPI functions such as MPI_Barrier and MPI_Wait have the semantics that no later MPI operation can be issued until they complete; we call such operations fence operations. Note that instructions issued before a fence operation can linger even after the fence operation is complete.
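The following small program (a hypothetical illustration, not part of ISP) shows a fence in action: for rank 0, the MPI_Wait on the first request is a fence, so no later operation can be issued until it completes, while the second non-blocking send, issued before the fence, may still be in flight after the wait returns.

```c
/* Run with at least two processes, e.g. mpiexec -n 2 ./a.out */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, a = 1, b = 2;
    MPI_Request r1, r2;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Isend(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &r1);
        MPI_Isend(&b, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, &r2);

        /* Fence: rank 0 cannot issue any later MPI operation until this
         * wait completes, so IntraCB edges run from it to the operations
         * below.  The send tracked by r2 was issued before the fence and
         * may still be in flight ("lingering") after the wait returns. */
        MPI_Wait(&r1, MPI_STATUS_IGNORE);

        MPI_Barrier(MPI_COMM_WORLD);       /* issued only after the wait */
        MPI_Wait(&r2, MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        MPI_Recv(&a, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Recv(&b, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    } else {
        MPI_Barrier(MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```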
The IntraCB edges are added as follows: let i and j be MPI operations of a process P such that i < j (i.e., i comes before j in program order). There is an IntraCB edge from i to j whenever the MPI semantics impose an ordering between them; for example, if i is a fence operation, every later operation j of that process receives an IntraCB edge from i. If there is a path from i to j, then i is called an ancestor of j. The POE algorithm uses these IntraCB edges to form the match sets that it issues into the MPI runtime. The following is the (informal) algorithm:
An MPI operation is "matched" once it has been issued to the MPI runtime by the POE algorithm.

1. Run the processes, collecting the MPI operations they issue, until every process blocks at a fence operation.
2. Form match sets from the collected operations using the IntraCB edges; if an operation i is itself a fence such as MPI_Wait, then ⟨i⟩ forms a match set on its own. Issue a match set to the MPI runtime and continue.
3. When every process has reached MPI_Finalize, check if any wildcard match sets are yet to be explored. If yes, then restart the processes and go to step 1; otherwise, exit.
4. If no match set can be formed and some process has not yet reached MPI_Finalize, then there is a deadlock. Flag a deadlock and kill all the MPI processes.
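As a concrete illustration of the last step, consider the classic head-to-head deadlock below (a hypothetical example): both ranks block in MPI_Recv before either issues its MPI_Send, so no match set can ever be formed and neither process reaches MPI_Finalize, which is exactly the condition POE flags.

```c
/* Run with two processes, e.g. mpiexec -n 2 ./a.out */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, other, sendval, recvval;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other = 1 - rank;          /* rank 0 pairs with rank 1 and vice versa */
    sendval = rank;

    /* Both ranks block here waiting for a message the other never gets to
     * send: no match set can be formed, so POE flags a deadlock. */
    MPI_Recv(&recvval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Send(&sendval, 1, MPI_INT, other, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```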
What you will receive are the full sources as well as the following examples in the directory isp-tests:
GEM provides a unique user interface that visually displays the output ISP generated by highlighting lines in the source file. It shows both the current MPI call and any matching point-to-point or collective operation, and it allows the user to examine the MPI calls of a particular rank with an easy-to-use Rank Locking feature.
The source code analyzer also allows examination of flagged errors with one click: the Examine Error feature highlights the problem line of source code within the Eclipse editor. In addition, it offers features to browse calls by rank and by interleaving, as well as a listing of any resource leaks found in the source code. One click on a flagged leak in the Resource Leak Browser takes the user to the suspect line of code in the Eclipse editor.
Why the Trident? Because it is a universal symbol for "slaying the evil bug"!