Removing unwanted “unaccounted-for …” waits from trace

methodr_logoAfter removing think time (or idle SQL*Net message from client) from a trace file (see a description), an unwanted line of “unaccounted for between dbcalls” dominated my MethodR profiler report. After an e-mail to MethodR support, Cary Millsap & Jeff Holt, found a way to neutralize this unwanted line. In this post I’ll show how. Thanks to Cary Millsap & Jeff Holt!

This is the profile showing the major time spent on “unaccounted-for between dbcalls”:

unaccounted

Totally 12.996 seconds are registered under the “unaccounted-for between dbcalls” subroutine. If we look at the “max” column we see that most of this comes from one certain point in the trace file.

Cary tipped me about using the mrskew and the “mrnl” report to find the actual line:

# mrskew --rc="mrnl" dummy_ora_1234567.trc

    LINE#         BEGIN-TIME           END-TIME      DURATION       UAFBC       UAFWC  TRACE-LINE
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
        1                  .                  .             .           .           .  /u01/app/oracle/admin/dummy/udump/dummy_ora_1234567.trc
        2                  .                  .             .           .           .  Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
      ...
     1688    60683300.503299    60683300.503301      0.000002           .   -0.000002  WAIT #4: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=162029 tim=60683300503301
     1689    60683300.502897    60683300.503319      0.000422    0.000016    0.000422  FETCH #4:c=0,e=422,p=0,cr=95,cu=0,mis=0,r=2,dep=0,og=1,tim=60683300503319
     1690    60683300.502897    60683300.503319             .           .           .  *** 2014-01-27T08:57:55.458487+0100
     1691    60683313.477678    60683313.879871      0.402193           .   -0.402193  WAIT #4: nam='SQL*Net message from client' ela= 402193 driver id=1413697536 #bytes=1 p3=0 obj#=162029 tim=60683313879871
     1692    60683313.477678    60683313.879871             .           .           .  =====================
     1693    60683313.477678    60683313.879871             .           .           .  PARSING IN CURSOR #24 len=52 dep=0 uid=58 oct=3 lid=58 tim=60683313879986 hv=3361021401 ad='bcabc08'
     1694    60683313.477678    60683313.879871             .           .           .  SELECT <something> FROM <a_table> WHERE name='something'
     1695    60683313.477678    60683313.879871             .           .           .  END OF STMT
     1696    60683313.879957    60683313.879984      0.000027   13.376638    0.000027  PARSE #24:c=0,e=27,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=60683313879984
     1697    60683313.880083    60683313.880085      0.000002           .   -0.000002  WAIT #24: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=162029 tim=60683313880085
     1698    60683313.880130    60683313.880650      0.000520           .   -0.000520  WAIT #24: nam='SQL*Net message from client' ela= 520 driver id=1413697536 #bytes=1 p3=0 obj#=162029 tim=60683313880650
     1699    60683313.880130    60683313.880650             .           .           .  BINDS #24:

The “unaccounted-for” time seems to come from line 1696, but actually we have to look before this. Cary & Jeff identified the gap to come from:

the line where “(tim – e) for a given call” is far greater than the tim on the prior call.

If we look at the lines above, we find line 1691 to be this line. Calculating the values:

(tim-e) for line 1691: 60683313879871-402193 = 60683313477678

Comparing the tim-value with the prior call= 60683313477678 - 60683300503319 = 12974359

Do you recognize the value? Yes. This is the 12.974 seconds we want to get rid off.

Cary & Jeff then suggested to insert a dummy line, as the new line number 1690 (since line 1690 is just a time stamp. If not the new line would be inserted as line number 1991). The “ela” of the dummy line is set to the value we want to get rid off (12974359), and the “tim” value is set to the prior tim (60683300503319 ) + the dummy ela (12974359) =  60683313477678.

The new trace file would now look like:

...
1688: WAIT #4: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=162029 tim=60683300503301
1689: FETCH #4:c=0,e=422,p=0,cr=95,cu=0,mis=0,r=2,dep=0,og=1,tim=60683300503319
1690: WAIT #0: nam='unaccounted-for between dbcalls' ela= 12974359 p1=0 p2=0 p3=0 obj#=-1 tim=60683313477678 # synthetic line 
1691: *** 2014-01-27T08:57:55.458487+0100 
1692: WAIT #4: nam='SQL*Net message from client' ela= 402193 driver id=1413697536 #bytes=1 p3=0 obj#=162029 tim=60683313879871 
1693: ===================== 
1694: PARSING IN CURSOR #24 len=52 dep=0 uid=58 oct=3 lid=58 tim=60683313879986 hv=3361021401 ad='bcabc08'
1695: SELECT <something> FROM <a_table> WHERE name='something'
1696: END OF STMT 
1697: PARSE #24:c=0,e=27,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=60683313879984 
1698: WAIT #24: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=162029 tim=60683313880085 
1699: WAIT #24: nam='SQL*Net message from client' ela= 520 driver id=1413697536 #bytes=1 p3=0 obj#=162029 tim=60683313880650 
1700: BINDS #24: ...

Then I saved the new trace file (dummy_ora_1234567_test.trc), and called the rmcallrm utility:

# mrcallrm --lines=1690 dummy_ora_1234567_test.trc > dummy_ora_1234567_new.trc

Using the MethodR Profiler the new profile now looks like this:

unaccounted_ok

Now the “unaccounted-for between dbcalls” is down to 0.021 seconds. This fits very well with the numbers from above (12.996-12.974359), with some small roundoff errors.

Another thanks to Cary Millsap & Jeff Holt!

Leave a Reply