Understanding GEM Console Output
Disclaimer: Casual users have no real need to understand in detail the output displayed in the GEM Console View or the contents of the log files that ISP generates. This plug-in exists to provide a visual interface that makes ISP easier to use and understand. The easiest way to understand the output is probably to use the GEM Analyzer View together with the Happens Before Viewer, which display the runtime results graphically. This page points casual users to those views with this disclaimer, and gives detailed explanations of the output format for anyone who is curious.
ISP - Insitu Partial Order
-----------------------------------------
Command: ./any_srccandeadlock9.exe
Number Procs: 3
Server: localhost:9687
Blocking Sends: Disabled
FIB: Enabled
-----------------------------------------
Started Process: 6687
INTERLEAVING :1
(1) is alive on laptop
(0) is alive on laptop
(2) is alive on laptop
Started Process: 6694
(1) Finished normally
(2) Finished normally
(0) Finished normally
INTERLEAVING :2
(0) is alive on laptop
(1) is alive on laptop
(2) is alive on laptop
application called MPI_Abort(MPI_COMM_WORLD, 1) process 1[cli_1]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) process 1
application called MPI_Abort(MPI_COMM_WORLD, 1) process 0[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) process 0
application called MPI_Abort(MPI_COMM_WORLD, 1) process 2[cli_2]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) process 2
rank 2 in job 6 laptop_35830 caused collective abort of all ranks
exit status of rank 2: return code 1
rank 1 in job 6 laptop_35830 caused collective abort of all ranks
exit status of rank 1: return code 1
rank 0 in job 6 laptop_35830 caused collective abort of all ranks
exit status of rank 0: return code 1
-----------------------------------------
Transition list for 0
0 1 0 0 Barrier any_srccandeadlock9.c:36 1{[0, 1][0, 2]} {}
1 4 1 0 Irecv any_srccandeadlock9.c:47 -1 0 1{[0, 2]} {} Matched with process :2 transition :1
2 7 2 0 Recv any_srccandeadlock9.c:49 2 0{} {}
Transition list for 1
0 2 0 1 Barrier any_srccandeadlock9.c:36 1{[1, 1][1, 2]} {}
1 5 1 1 Send any_srccandeadlock9.c:74 0 0{} {}
2 8 2 1 Barrier any_srccandeadlock9.c:77 2{} {}
Transition list for 2
0 3 0 2 Barrier any_srccandeadlock9.c:36 1{[2, 1][2, 2]} {}
1 6 1 2 Send any_srccandeadlock9.c:62 0 0{} {} Matched with process :0 transition :1
2 9 2 2 Recv any_srccandeadlock9.c:64 0 0{} {}
No matching MPI call found!
Detected a DEADLOCK!
Killing program any_srccandeadlock9.exe
-----------------------------------------
As with the ISP console output, users are not expected to be able to read log files directly. The best way to understand what a log file represents is to run the Java GUI, which displays the information graphically. The log file consists of a single number on the first line giving the number of processes used to create the file, followed by a list of every MPI call the program issued, with information about how each call interacts with other calls. If a deadlock is found, the log contains a line giving the interleaving number and the word “DEADLOCK”. The log file ends abruptly after this line, and the remaining MPI calls are not listed. Here is the log file generated by the example above.
1 0 0 1 1 Barrier 0_0:1:2: { 1 2 } { [ 1 1 ] [ 1 2 ] [ 2 1 ] [ 2 2 ] } Match: -1 -1 File: 23 any_srccandeadlock9.c 36
1 0 1 4 5 Irecv 1 0 0_0:1:2: { 2 5 } { } Match: 1 1 File: 23 any_srccandeadlock9.c 47
1 0 2 7 7 Recv 2 0 0_0:1:2: { 3 4 } { } Match: 2 1 File: 23 any_srccandeadlock9.c 49
1 0 3 10 8 Send 2 0 0_0:1:2: { } { [ 2 3 ] [ 2 4 ] } Match: 2 2 File: 23 any_srccandeadlock9.c 51
1 0 4 11 11 Recv 1 0 0_0:1:2: { 5 } { } Match: 2 3 File: 23 any_srccandeadlock9.c 54
1 0 5 14 12 Wait { 6 } { } Match: -1 -1 File: 23 any_srccandeadlock9.c 56
1 0 6 15 13 Barrier 0_0:1:2: { 7 } { [ 1 3 ] [ 2 5 ] } Match: -1 -1 File: 23 any_srccandeadlock9.c 77
1 0 7 16 16 Finalize { } { } Match: -1 -1 File: 23 any_srccandeadlock9.c 79
1 1 0 2 2 Barrier 0_0:1:2: { 1 2 } { [ 0 1 ] [ 0 2 ] [ 2 1 ] [ 2 2 ] } Match: -1 -1 File: 23 any_srccandeadlock9.c 36
1 1 1 5 4 Send 0 0 0_0:1:2: { } { [ 0 2 ] [ 0 5 ] } Match: 0 1 File: 23 any_srccandeadlock9.c 74
1 1 2 8 14 Barrier 0_0:1:2: { 3 } { [ 0 7 ] [ 2 5 ] } Match: -1 -1 File: 23 any_srccandeadlock9.c 77
1 1 3 17 17 Finalize { } { } Match: -1 -1 File: 23 any_srccandeadlock9.c 79
1 2 0 3 3 Barrier 0_0:1:2: { 1 2 } { [ 0 1 ] [ 0 2 ] [ 1 1 ] [ 1 2 ] } Match: -1 -1 File: 23 any_srccandeadlock9.c 36
1 2 1 6 6 Send 0 0 0_0:1:2: { } { [ 0 3 ] [ 0 4 ] } Match: 0 2 File: 23 any_srccandeadlock9.c 62
1 2 2 9 9 Recv 0 0 0_0:1:2: { 3 4 } { } Match: 0 3 File: 23 any_srccandeadlock9.c 64
1 2 3 12 10 Send 0 0 0_0:1:2: { } { [ 0 5 ] } Match: 0 4 File: 23 any_srccandeadlock9.c 66
1 2 4 13 15 Barrier 0_0:1:2: { 5 } { [ 0 7 ] [ 1 3 ] } Match: -1 -1 File: 23 any_srccandeadlock9.c 77
1 2 5 18 18 Finalize { } { } Match: -1 -1 File: 23 any_srccandeadlock9.c 79
2 0 0 1 19 Barrier 0_0:1:2: { 1 2 } { } Match: -1 -1 File: 23 any_srccandeadlock9.c 36
2 0 1 4 23 Irecv 1 0 0_0:1:2: { 2 } { } Match: 2 1 File: 23 any_srccandeadlock9.c 47
2 0 2 7 7 Recv 2 0 0_0:1:2: { } { } Match: -1 -1 File: 23 any_srccandeadlock9.c 49
2 1 0 2 20 Barrier 0_0:1:2: { 1 2 } { } Match: -1 -1 File: 23 any_srccandeadlock9.c 36
2 1 1 5 4 Send 0 0 0_0:1:2: { } { } Match: -1 -1 File: 23 any_src-can-deadlock9.c 74
2 1 2 8 14 Barrier 0_0:1:2: { } { } Match: -1 -1 File: 23 any_srccandeadlock9.c 77
2 2 0 3 21 Barrier 0_0:1:2: { 1 2 } { } Match: -1 -1 File: 23 any_srccandeadlock9.c 36
2 2 1 6 22 Send 0 0 0_0:1:2: { } { } Match: 0 1 File: 23 any_srccandeadlock9.c 62
2 2 2 9 9 Recv 0 0 0_0:1:2: { } { } Match: -1 -1 File: 23 any_srccandeadlock9.c 64
2 DEADLOCK
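
The deadlock marker described above can be found with a few lines of code. The following is only a minimal sketch in Python, not part of ISP or GEM. It assumes the log file holds one record per line and that the marker line consists of exactly the interleaving number followed by the word DEADLOCK. The function name `find_deadlock` and the sample data are hypothetical.

```python
def find_deadlock(log_lines):
    """Return the interleaving number of the first DEADLOCK line, or None.

    Assumption: the marker line has exactly two tokens, the interleaving
    number and the word DEADLOCK, as in "2 DEADLOCK".
    """
    for line in log_lines:
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit() and parts[1] == "DEADLOCK":
            return int(parts[0])
    return None


# Hypothetical sample: the last two lines of a log that ends in a deadlock.
sample = [
    "2 2 2 9 9 Recv 0 0 0_0:1:2: { } { } Match: -1 -1 File: 23 any_src-can-deadlock9.c 64",
    "2 DEADLOCK",
]
print(find_deadlock(sample))  # prints 2
```

Because the log ends right after the marker, a `None` result suggests that every interleaving in the file ran to completion.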
To understand the format, we will take a single line from a log file as an example and explain each part.
1 0 0 1 1 Barrier 0_0:1:2: { 1 2 } { [ 1 1 ] [ 1 2 ] [ 2 1 ] [ 2 2 ] } Match: -1 -1 File: 23 any_src-can-deadlock9.c 36
Title | Explanation |
---|---|
1 – Interleave Number (1-based) | This was issued by the first interleaving |
0 – Process Number (0-based) | This call was issued by process zero |
0 – Process Call Index (0-based) | This was the first call issued by this process |
1 – ISP Call Number (1-based) | This was the first call received by ISP |
1 – ISP Issue Number (1-based) | This was the first call performed by ISP |
Barrier – MPI Command | This call was a Barrier |
0 – Call Arguments | Varies from call to call; here it is the communicator (COM) |
0:1:2 – Affected Processes | This call affects processes 0, 1, and 2 (all three processes) |
{ 1 2 } – Intra-Process calls blocked | Blocks the calls from this process with a Process Call Index of 1 and 2 |
{ [ 1 1 ] [ 1 2 ] [ 2 1 ] [ 2 2 ] } – Inter-Process calls blocked | Blocks the indicated calls in other processes. Calls are listed as pairs (the two numbers between each [ ] form a pair): the first element is the process that issues the call, the second is its Process Call Index. So the first pair tells us that the call from process 1 with a Process Call Index of 1 is blocked. |
Match: -1 -1 – Matches | For calls such as Send and Recv, gives the process and Process Call Index of the matching call; it is -1 -1 for calls without matches |
File: 23 any_src-can-deadlock9.c | The log file was generated by any_src-can-deadlock9.c (the leading 23 matches the length of the file name in characters) |
36 | This call is found on line 36 |
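
The field breakdown in the table can be turned into a small parser. The sketch below is written in Python and is an illustration only. The pattern is inferred from the example line and the table, not taken from ISP's own code, and the names `RECORD`, `parse_record` and the dictionary keys are hypothetical. It assumes one record per line and treats a token such as `0_0:1:2:` as a final call argument followed by the affected processes.

```python
import re

# One log record, field by field, in the order given in the table.
RECORD = re.compile(
    r"^(\d+) (\d+) (\d+) (\d+) (\d+) (\w+) (.*?)\s*"
    r"\{([^{}]*)\} \{([^{}]*)\} "
    r"Match: (-?\d+) (-?\d+) File: (\d+) (\S+) (\d+)$"
)


def parse_record(line):
    """Parse one log record into a dict, or return None if it does not fit."""
    m = RECORD.match(line.strip())
    if m is None:
        return None
    args = m.group(7).split()
    affected = []
    if args and "_" in args[-1]:
        # e.g. "0_0:1:2:" -> argument "0" and affected processes 0, 1, 2
        comm, procs = args.pop().rstrip(":").split("_", 1)
        args.append(comm)
        affected = [int(p) for p in procs.split(":")]
    return {
        "interleaving": int(m.group(1)),
        "process": int(m.group(2)),
        "call_index": int(m.group(3)),
        "isp_call_number": int(m.group(4)),
        "isp_issue_number": int(m.group(5)),
        "call": m.group(6),
        "args": args,
        "affected": affected,
        "intra_blocked": [int(i) for i in m.group(8).split()],
        "inter_blocked": [
            (int(p), int(i))
            for p, i in re.findall(r"\[ (\d+) (\d+) \]", m.group(9))
        ],
        "match": (int(m.group(10)), int(m.group(11))),
        "file": m.group(13),
        "line": int(m.group(14)),
    }


example = (
    "1 0 0 1 1 Barrier 0_0:1:2: { 1 2 } { [ 1 1 ] [ 1 2 ] [ 2 1 ] [ 2 2 ] } "
    "Match: -1 -1 File: 23 any_src-can-deadlock9.c 36"
)
record = parse_record(example)
print(record["call"], record["affected"], record["inter_blocked"])
```

Lines that do not follow the record layout, such as a `2 DEADLOCK` marker, make `parse_record` return `None`, so a caller can handle them separately.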
School of Computing * 50 S. Central Campus Dr. Rm. 3190 * Salt Lake City, UT
84112 * isp-dev@cs.utah.edu