Changes between Initial Version and Version 1 of KernelBpTips


Timestamp:
May 16, 2011, 9:46:43 AM (13 years ago)
Author:
Markus Thielen

= Kernel Debugging Tips =

The OS/2 Toolkit 3.0 contains a file named KDEBUG.INF that provides terse but very helpful information about kernel debugging.

The following is an excerpt about setting breakpoints:

== Setting useful breakpoints ==

There are many actions that can be invoked when you use a combination of breakpoint commands. You can easily set up the debugger to stop at a certain label, print information about the current system millisecond, current process, and executing file, and then continue. This would be very useful, for example, in analyzing the performance of a subsystem .DLL. Another use would be to set a breakpoint on a file-read label in the file system and dump information about which file is being read, where it is being read from, how much of the file is being read, and what is reading it.

In order to do this, you must know some internal information about the kernel and other system components, such as labels of various application programming interface (API) functions or worker routines.

One way to keep some release independence is to base breakpoints on function labels plus an offset. These change much less often than the linear addresses of the functions. An example is as follows:
{{{
 BP _LDRGetMte +63 "DA %EDX L EAX +1; G"
}}}

This command places a breakpoint at the symbol _LDRGetMte, offset by hex 63. Whenever that breakpoint is hit, the debugger dumps (in ASCII) the memory at the linear address stored in register EDX, for a length contained in register EAX plus one byte. The program then continues executing.

Of course, this offset within the function may change, but if it does, finding the new offset is easier than maintaining linear addresses for your breakpoints.

The effect of this breakpoint is to display the name of any executable file (such as one with a file type of .EXE, .DLL, or .SYS) that is started with !DosExecPgm, !DosStartSession, or !DosLoadModule, or imported from any other executable file. This is very useful when debugging problems in applications.

A series of actions that can help you profile your code to see how often you are taking page faults on pieces of code follows:

 1. Unassemble at the label _ldrgetpage for approximately 30 instructions.

 2. Look for a call to _ldrvalidatemtehandle.

 3. Set a breakpoint on the instruction immediately following that call (where execution would go upon return from the call) as follows:

{{{
 BP %address ".lm %eax; G"
}}}

This breakpoint displays the name of the code module (DLL or EXE) from which code is being read. All code that is ever executed is referenced at least once (the first time it is loaded). However, if you are seeing excessively long code load times and what looks like heavy thrashing, you may want to use this breakpoint. Also, if you have code residing on remote drives, you may want to use this breakpoint to see how often you are hitting the code on the remote drive, and possibly reconfigure it.

A breakpoint you can use to profile your code is:
{{{
 BP xxxxxx "DD SISData+4 L1; G"
}}}
where xxxxxx is the address or label you wish to profile.

This breakpoint displays a running millisecond counter in hexadecimal. To measure the time between two points in the code, set this breakpoint at both points; the counter is displayed each time one of them is hit, and the difference between the two values is the elapsed time. This helps narrow down the slow area of code. Be careful when analyzing the results, however: although the timer is reported in milliseconds, it is only updated every 16 milliseconds.

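The arithmetic for comparing two counter readings can be sketched in Python. This is only a minimal sketch, assuming the counter is a 32-bit value that wraps around and that the two readings were copied by hand from the debugger output; the function names and the sample readings are hypothetical.

```python
# Hypothetical helper: elapsed time between two millisecond-counter readings
# copied from the "DD SISData+4 L1" output (hexadecimal strings).

TICK_MS = 16  # update interval of the counter, as stated in the text


def elapsed_ms(first_hex: str, second_hex: str) -> int:
    """Return the difference in milliseconds, assuming a 32-bit counter."""
    first = int(first_hex, 16)
    second = int(second_hex, 16)
    # Modulo arithmetic keeps the result correct across one wraparound.
    return (second - first) % 2**32


def uncertainty_ms() -> int:
    """Each reading may be stale by up to one tick, so the difference
    between two readings is only accurate to about +/- one tick."""
    return TICK_MS


# Sample readings (made up for illustration)
delta = elapsed_ms("0003a980", "0003a9c0")
print(f"{delta} ms +/- {uncertainty_ms()} ms")
```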
When the Presentation Manager Interface sets error information, you can catch it in the debugger by setting the following breakpoint:
{{{
 BP _winseterrorinfo
}}}

When this breakpoint is hit, the first parameter after the return address on the stack is the error code being set.

If you happen to press Ctrl+C while the speaker is sounding a tone, that tone will continue until the speaker is shut off. The command:
{{{
 O 61 41
}}}
outputs a byte to the speaker port to turn the speaker off.

Using the debugger, you can also see which keystrokes are being input to the Presentation Manager single input queue. The breakpoint:
{{{
 BP putinsqp "dw si 17; G"
}}}

shows you the keyboard scan codes before PMWIN gets them. This is helpful when tracking where keystrokes are being routed.

You can also look at the processor's registers before they are affected by the OS/2 trap handler. By setting the breakpoint:
{{{
 BP trap$systemfatalfault
}}}

you can stop the system before the trap handler is invoked.

To look at the device driver header chain, you can use the following two commands:
{{{
 .D dev devhead
}}}

This command displays the device driver information starting at the first device driver in the chain.
{{{
 .D dev 'devnext'
}}}

This command queries the device driver chain, showing the installed device drivers.

If you ever need to break in just before the system reboots, use this:
{{{
 BP w_shutdown
}}}

To see which slots (threads) are being dispatched, you can look inside the scheduler with:
{{{
 BP _schgetnextrunner "dw tasknumber l1; G"
}}}

This gives you the slot number of each thread being dispatched. It is valuable when tracking performance issues, to see whether any one process is getting a disproportionately large share of the processor.

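Once the slot numbers have been captured from the debugger output (for example, from a terminal log), the share of dispatches per slot can be tallied. The following Python sketch assumes the slot numbers have already been extracted as hexadecimal strings; the sample data and the function name are hypothetical, and a dispatch count is only a rough proxy for processor share.

```python
from collections import Counter


def dispatch_share(slots: list[str]) -> dict[int, float]:
    """Map each slot number to its fraction of all observed dispatches."""
    counts = Counter(int(s, 16) for s in slots)
    total = sum(counts.values())
    return {slot: n / total for slot, n in counts.items()}


# Made-up sample: slot 0x12 is dispatched three times out of four.
sample = ["0012", "0012", "0003", "0012"]
for slot, share in sorted(dispatch_share(sample).items()):
    print(f"slot {slot:04x}: {share:.0%}")
```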
To interrupt a device driver just before it initializes, you can use:
{{{
 BP syiInitializeDeviceDriver
}}}
This will stop just before the driver initializes so you can trace through the initialization routine.

Another useful breakpoint is one that tells you which kernel calls you are executing. There is a routine called "sci" that is used in all OS/2 kernel calls. By setting the following breakpoint, you can have the debugger report which kernel calls you are making.
{{{
 BP sci "ln wo(ss:esp)"
}}}

This breakpoint says to break on sci (system call interface) and list the near symbol representing the name of the kernel call. The kernel call is determined by looking at the return address on the stack. This is more effective than just stopping and listing near symbols because this method displays just the name of the routine and makes tracing a series of kernel calls much easier.

If you need to look into the functions dealing with the graphics engine (especially if you are writing a presentation driver such as a printer or display driver), you can set the following breakpoint:
{{{
 BP dispatch32
}}}

This breakpoint is in PMGRE.DLL, and when it is encountered you should look at the return address. Next, unassemble at that address less approximately 20 bytes. You are looking for the first doubleword pushed onto the stack before the call. This should be the doubleword representing the command flags and the engine function number. The command flags are the high-order word and the function number is the low-order word of the doubleword. You could look at the stack frame to get this, but because this is the first dword pushed, it is the last dword in the frame due to the 32-bit calling conventions. Because the DDI stack frames are variable-sized, you may have to do some searching to find the end of the frame. You can do this too, but unassembling before the call works just as well and may be easier.

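Splitting the doubleword into its two halves is simple bit arithmetic. The Python sketch below shows it; the function name and the sample value are hypothetical and are not taken from a real PMGRE call.

```python
def split_engine_dword(value: int) -> tuple[int, int]:
    """Split the dword into (command flags, function number)."""
    command_flags = (value >> 16) & 0xFFFF   # high-order word
    function_number = value & 0xFFFF         # low-order word
    return command_flags, function_number


# Made-up value for illustration only
flags, func = split_engine_dword(0x00024005)
print(f"flags={flags:04x} function={func:04x}")
```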
For 16-bit presentation drivers, you would want to set the breakpoint as follows:
{{{
 BP dispatch16
}}}

Since the 16-bit calling conventions put the function number first, you can simply look at the stack frame on this one.

Another useful breakpoint is as follows:
{{{
 BP _LDRNewExe "G %(DW(SS:ESP)); DD _pgSwappablePages L 3; G"
}}}

This command places a breakpoint at the symbol _LDRNewExe. When that breakpoint is reached, the program continues executing until it returns to the caller. The debugger then displays the number of pages of swappable memory in the system, followed by the number of fixed (resident) pages, followed by the number of pages of discardable memory (read-only or execute-only pages). The program then continues executing.

The effect of this breakpoint is to display, in hex, the number of pages in use by the whole system after every executable program is started.

The term "G %(DW(SS:ESP))" means "Go (G) to the linear address (%) that is the doubleword (DW) at the top of the stack (SS:ESP) and stop". The operators "WO" and "BY" are similar to "DW". "WO" means "word," and "BY" means "byte." This type of setup can be useful when displaying parameters for a function call as well, for example:
{{{
 BP _LdrOpenNewExe "DA %(DW(SS:ESP+4)); G"
}}}

When this breakpoint is executed, the debugger dumps ASCII (DA) starting at the linear address (%) given in the doubleword (DW) at SS:ESP plus 4 bytes (the doubleword at the top of the stack is the address for _LdrOpenNewExe to return to). The program then continues executing.

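To make the address arithmetic concrete, the Python sketch below models a small piece of stack memory and evaluates the equivalents of DW(SS:ESP) and DW(SS:ESP+4). The memory contents are made up, and the helper names dw, wo, and by merely mirror the debugger operators; they are not part of any real tool. x86 is little-endian, so the least significant byte is stored first.

```python
import struct

# Made-up stack contents starting at SS:ESP:
# the return address first, then the first parameter.
stack = struct.pack("<II", 0xFFF8B6C3, 0x00030010)


def dw(mem: bytes, offset: int = 0) -> int:
    """Doubleword (4 bytes) at the given offset, like DW(SS:ESP+offset)."""
    return struct.unpack_from("<I", mem, offset)[0]


def wo(mem: bytes, offset: int = 0) -> int:
    """Word (2 bytes) at the given offset, like WO(...)."""
    return struct.unpack_from("<H", mem, offset)[0]


def by(mem: bytes, offset: int = 0) -> int:
    """Byte at the given offset, like BY(...)."""
    return mem[offset]


print(f"return address:  {dw(stack):08x}")     # target of G %(DW(SS:ESP))
print(f"first parameter: {dw(stack, 4):08x}")  # address used by DA %(DW(SS:ESP+4))
```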
The effect of this breakpoint is to display (in ASCII) the name of every executable module for which the loader actually opens a file. The difference between this breakpoint and the one given earlier is that the earlier one displays the names even if the module was attached to instead of newly opened and loaded. This breakpoint only gives the names of the newly opened modules. By setting both breakpoints and subtracting the second list of names from the first, you can determine the imported modules.

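The subtraction is a plain set difference. The Python sketch below assumes the names printed by the two breakpoints have been collected into two lists and that the same module is spelled the same way in both (apart from case and surrounding whitespace); the sample names and function names are hypothetical.

```python
def _normalize(name: str) -> str:
    """Compare names without regard to case or surrounding whitespace."""
    return name.strip().upper()


def imported_modules(referenced: list[str], opened: list[str]) -> set[str]:
    """Names seen by the _LDRGetMte breakpoint but not by _LdrOpenNewExe."""
    return {_normalize(n) for n in referenced} - {_normalize(n) for n in opened}


# Made-up captures for illustration
referenced = ["C:\\OS2\\PMSHELL.EXE", "DOSCALL1", "PMWIN"]
opened = ["C:\\OS2\\PMSHELL.EXE"]
print(sorted(imported_modules(referenced, opened)))
```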
A series of commands that can be useful in determining the amount of memory an application requires is as follows:
{{{
 .PU   (pick a thread that is in the process you wish to examine)
 .SS n (where 'n' is the thread or slot number you picked)
 DL    (list all valid LDT selectors (all 16-bit segments))
}}}

If the DL command listing stops at an address that is not present or invalid, issue the ".ID x" command (where 'x' is the address) and then the G (Go) command. When the system stops again, the LDT page will be present. Reissue the ".SS n" and DL commands, and you will get a more complete listing of the LDT selectors. An example of the output follows:
{{{
     # .PU
       ... (generates a lot of information like...)
Slot Pid  Ord  pPTDA    Name    pstkframe CS:EIP        SS:ESP        cbargs
0012 0006 0012 7d31ccb0 FOO.EXE 7d102f50  d02f:00002501 0b3f:00007eee 0008
...  ...  ...  ...      ...     ...       ...           ...           ...

     # .SS 12
     # DL
       ... (try it and stop at %7b176000...)
     # .ID %7b176000
       task|addr 0012|7b176000 queued, g to continue
     # G
       ... (stops at INT3 in tasking...)
       0170:fff629f4 cc             int     3
     # .SS 12
     # DL
       0007  Data    Bas=7b175000 Lim=0000ffff DPL=3 P  RO
       000f  Data    Bas=00010000 Lim=00001677 DPL=3 P  RW    A
       0017  Code    Bas=00020000 Lim=00003197 DPL=3 P  RE    A
       001f  Data    Bas=00030000 Lim=00001fff DPL=3 P  RW    A
       ...   ...     ...          ...          ...   .  ..    .
}}}

The first entry (0007 Data ...) points to the process's LDT itself, and should not be counted when adding up the amount of memory the application itself is using. However, it is part of the system overhead that is used for the process, and should be attributed as such.

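Adding up the segment sizes by hand is error-prone, so the bookkeeping can be sketched in Python. This is a minimal sketch that parses only the four DL lines shown in the sample output (the real listing is longer), assumes each limit is a byte count minus one, and treats selector 0007 as the LDT itself, as described above. The regular expression and function name are hypothetical.

```python
import re

# DL lines copied from the sample output above
DL_OUTPUT = """\
0007  Data    Bas=7b175000 Lim=0000ffff DPL=3 P  RO
000f  Data    Bas=00010000 Lim=00001677 DPL=3 P  RW    A
0017  Code    Bas=00020000 Lim=00003197 DPL=3 P  RE    A
001f  Data    Bas=00030000 Lim=00001fff DPL=3 P  RW    A
"""

LINE = re.compile(
    r"^\s*([0-9a-f]{4})\s+\w+\s+Bas=[0-9a-f]{8}\s+Lim=([0-9a-f]{8})", re.I
)


def segment_sizes(dl_text: str) -> dict[int, int]:
    """Map selector -> size in bytes (the limit is the last valid offset)."""
    sizes = {}
    for line in dl_text.splitlines():
        match = LINE.match(line)
        if match:
            sizes[int(match.group(1), 16)] = int(match.group(2), 16) + 1
    return sizes


sizes = segment_sizes(DL_OUTPUT)
ldt_overhead = sizes.pop(0x0007)          # the LDT itself: system overhead
application_total = sum(sizes.values())   # segments owned by the application
print(f"application: {application_total} bytes, LDT overhead: {ldt_overhead} bytes")
```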
The following breakpoints give you a great deal of information about the startup of an application:
{{{
BP g_tkExecPgm "? 'ExecPgm'; G"
BP g_w_loadmodule "? 'LoadModule'; G"
BP h_w_QAppType "? 'QryApplType'; DA DS:DX; G"
BP _ldrGetMte +63 "DA %EDX ; G"
BP _ldrOpenNewExe "DA %(DW(SS:ESP+4)); G"
BP _load_error "DD SS:ESP L 3; G"
BP h_w_getprocaddr "? 'GetProcAddr'; G"
BP h_w_getprocaddr +23 "? 'by ordinal'; ? DI; G"
BP h_w_getprocaddr +25 "? 'by name'; DA AX:DI; G"
}}}
These nine breakpoints show how a program loads and starts executing. They can be placed into a KDB.INI file on the system being tested to allow you to determine which binary files are loaded and executed. Note again that these offsets may change in subsequent releases of the OS/2 kernel and loader. Some of the possible types of files are the following:

.DLL Dynamic link library
.EXE Executable program
.FON Static font file
.PSF Dynamic font file
.QPR Queue printer driver
.PDR Printer driver
.SYS Physical device driver
.VDD Virtual device driver

The next two breakpoints print a label in the debugger output when the specified routine is reached. This allows you to interpret the successive breakpoints accurately.
 1. BP g_tkExecPgm     "? '!DosExecPgm'; G"
 2. BP g_w_loadmodule  "? '!DosLoadModule'; G"

The next breakpoint prints the name of the routine being executed and the name of the file being queried.
 3. BP h_w_QAppType    "? 'DosQueryAppType'; DA DS:DX; G"

A program issues !DosQueryAppType (QAT) when it has to determine the application type of an executable file. For example, when you type 'FREDDY' at a command prompt and press Enter, the system program CMD.EXE has to find out whether the file 'FREDDY' is a .EXE file, a .CMD file, a .COM file, or some other type of file.

Some information can be derived from the file's extension. The DOS command processor works this way. However, you may still need to determine whether a .EXE file is a DOS application, a 16-bit OS/2 Version 1.3 application, a 32-bit OS/2 application, or a Windows application. !DosQueryAppType (QAT) attempts to determine this in a way that is very similar to what happens when the program is actually loaded and executed.

Sometimes, !DosQueryAppType (QAT) is issued to verify that the file exists. For example, SysInit (the thread that loads all the device drivers) has to determine whether a device driver exists, and whether the file contains a valid device driver. SysInit issues !DosQueryAppType (QAT) for all of the BASEDEV= and DEVICE= statements in the CONFIG.SYS file.

Sometimes, a QAT call does not need to open the file. If the Shell queries the type of a file several times in quick succession, the first QAT call determines the type of the file, and then saves it for a few cycles. If another QAT call is made immediately for the same file, the previously determined type is returned. This value is cleared when any other file-system access is made, such as when a file is deleted.

To see whether a file was actually opened for a loader or tasking call, see breakpoint number 5.

Breakpoint number 4 prints the name of any file being loaded, read, attached to, or otherwise referred to (for example, imported) in the loader.

 4. BP _ldrGetMte +63  "DA %EDX ; G"

If breakpoint number 1, 2, or 3 immediately precedes breakpoint number 4, then breakpoint number 4 indicates the name of the primary module being executed by !DosExecPgm, or loaded by !DosLoadModule (DLM), or queried by !DosQueryAppType (QAT).

Breakpoint number 5 displays the name of any file being opened for execution by the loader. This breakpoint, along with breakpoint number 6, allows you to track down a large number of commonly observed problems.

 5. BP _ldrOpenNewExe   "DA %(DW(SS:ESP+4)) ; G"
 6. BP _load_error      "DD SS:ESP L 3; G"

Breakpoint number 6 is the error handler for this area. Many error codes are not passed back through a base-pointer chained stack frame across several levels of function calls. Instead, a special stack frame is created when the loader is entered, and the stack offset to this frame is preserved. When an error occurs, _load_error() is called with two parameters, an error code and an optional MTE handle. In breakpoint number 6, the debugger displays the return address of the location where the error was detected, the error code for what happened, and either a 0 or a module table entry (MTE) handle for where the error was found. File BSEERR.H contains the Control Program error codes. Common error codes are 2 (ERROR_FILE_NOT_FOUND) and 193 (ERROR_BAD_EXE_FORMAT). If an MTE handle is supplied, issue the '.lm[o] handle' command to obtain more information.

After _load_error() finishes processing, it clears the stack of everything above the special stack frame. It then returns directly to the original caller, without executing any intervening code. This is efficient, and it avoids a large amount of error-handling code in each function.

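The three doublewords printed by breakpoint number 6 can be decoded mechanically. The Python sketch below assumes the line has the shape shown in the sample output later on this page (a dumped address followed by three hexadecimal doublewords); the function name is hypothetical, and only the two error codes named above are mapped to symbolic names.

```python
# Error codes named in the text (from BSEERR.H); all others stay "unknown".
ERROR_NAMES = {2: "ERROR_FILE_NOT_FOUND", 193: "ERROR_BAD_EXE_FORMAT"}


def decode_load_error(line: str) -> dict:
    """Decode one 'DD SS:ESP L 3' output line into its three fields."""
    fields = line.split()
    # fields[0] is the dumped address (sel:offset); the next three are dwords.
    return_address, error_code, mte_handle = (int(f, 16) for f in fields[1:4])
    return {
        "return_address": return_address,
        "error_code": error_code,
        "error_name": ERROR_NAMES.get(error_code, "unknown"),
        "mte_handle": mte_handle or None,   # 0 means no handle was supplied
    }


# Line taken from the sample output later on this page
info = decode_load_error("0030:00006724  fff8b6c3 00000002 00000000")
print(info["error_name"], hex(info["return_address"]))
```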
The next three breakpoints are used to determine what is being retrieved from a .DLL after a successful call to !DosLoadModule (DLM). Breakpoint number 7 displays the label '!DosGetProcAddr', followed by information about the .DLL being queried, including its name.

Breakpoint number 8 displays the ordinal number of the code or data being sought in the .DLL. Breakpoint number 9 displays the name of the code or data being sought in the .DLL. Breakpoints number 8 and 9 are mutually exclusive.
{{{
 7. BP h_w_getprocaddr  "? '!DosGetProcAddr'; .LM EAX; G"
 8. BP h_w_getprocaddr +23 "? 'by ordinal'; ? DI; G"
 9. BP h_w_getprocaddr +25 "? 'by name'; DA AX:DI; G"
}}}

Breakpoints number 7, 8, and 9 do not show what is being imported from a .DLL. Showing that requires more information about the internal operation of the loader.

The following is an example that uses these breakpoints:
{{{
!LoadModule

C:\OS2\DLL\TIMESNRM.PSF
43 3a 5c 4f 53 32 5c 44-4c 4c 5c 54 49 4d 45 53 C:\OS2\DLL\TIMESNRM.PSF.
!QryApplType

C:\OS2\PMSHELL.EXE
43 3a 5c 4f 53 32 5c 50-4d 53 48 45 4c 4c 2e 45 C:\OS2\PMSHELL.EXE
!QryApplType

04a8:00000002 C:\OS2\PMSHELL.EXE
!QryApplType

04a8:00000002 C:\OS2\PMSHELL.EXE
!ExecPgm

C:\OS2\PMSHELL.EXE
!LoadModule

BVSCALLS
!GetProcAddr

by ordinal

0003H 3T 3Q 0000000000000011Y '.' TRUE
!LoadModule

PMSDMRI
!LoadModule

C:\OS2\DLL\PMATM.DLL
!GetProcAddr

by name

04a8:00000002 FONT_DRIVER_DISPATCH_TABLE
!LoadModule

DISPLAY
!LoadModule

C:\OS2\DLL\DISPLAY.DLL
!LoadModule

PMCTLS

!LoadModule

SPL1B
!LoadModule

SPL1D
53 50 4c 31 44                                  SPL1D
0030:00006724  fff8b6c3 00000002 00000000 (this is what _load_error() displays)
}}}
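The hex column of a dump line in this example can be checked against its ASCII column with a few lines of Python. This is only a sketch: the function name is hypothetical, and the "-" in the middle of the line is assumed to be a plain visual separator between the two groups of eight bytes.

```python
def decode_dump_line(hex_part: str) -> str:
    """Convert the hex column of a dump line to ASCII text."""
    data = bytes.fromhex(hex_part.replace("-", " "))
    # Show non-printable bytes as '.', as debugger dumps commonly do.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


# Hex column copied from the first dump line in the example above
print(decode_dump_line("43 3a 5c 4f 53 32 5c 44-4c 4c 5c 54 49 4d 45 53"))
```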