<title>Debugging transported core dumps</title>
<h1>Debugging transported core dumps</h1>
<p>
When a core dump is moved to a machine other than the one that produced it
(a "transported core dump"), debuggers (dbx, gdb, windbg or SA) do not
always open the dump successfully. This is due to a kernel or library (shared
object or DLL) mismatch between the core dump machine and the debugger machine.
</p>
<p>
On most platforms, core dumps do not contain text (a.k.a. code) pages.
These pages must be read from the executable and shared objects (or DLLs).
It is therefore important to have matching executable and shared object
files on the debugger machine.
</p>
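<p>
One way to check that the files match is to compare checksums recorded on
both machines. The following is a minimal sketch, not part of SA: the helper
names (sha256_of, build_manifest, mismatches) are hypothetical, and it assumes
you can run the same script on the core dump machine and on the debugger
machine and then compare the two results.
</p>

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Map each existing file path (as a string) to its digest."""
    return {str(p): sha256_of(p) for p in map(Path, paths) if p.is_file()}


def mismatches(core_manifest, debugger_manifest):
    """List paths that are missing or differ on the debugger machine."""
    return sorted(
        path
        for path, digest in core_manifest.items()
        if debugger_manifest.get(path) != digest
    )
```

<p>
Any path returned by mismatches names an executable or shared object that
would have to be copied over or otherwise brought to the same version.
</p>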
<h3>Solaris transported core dumps</h3>
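<p>
Debuggers on Solaris (and Linux) use two additional shared objects,
rtld_db.so and libthread_db.so. rtld_db.so is used to read information on
shared objects from the core dump, and libthread_db.so
is used to get information on threads from the core dump.
Hence, the debugger machine should have the right versions of rtld_db.so and
libthread_db.so to open the core dump successfully. More details on
these debugger libraries can be found in the
Solaris Linkers and Libraries Guide - 817-1984.
</p>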
<h3>Solaris SA against transported core dumps</h3>
<p>
With transported core dumps, you may get "rtld_db failures" or
"libthread_db failures", or SA may just throw some other error
(for example, that a hotspot symbol is missing) when opening the core dump.
The environment variable <b>LIBSAPROC_DEBUG</b> may be set to any value
to debug such scenarios. With this variable set, SA prints many
messages to standard error, which can be useful for further debugging.
The libproc.so library used by SA also
prints debug messages when the environment variable <b>LIBPROC_DEBUG</b> is set.
However, setting LIBSAPROC_DEBUG results in LIBPROC_DEBUG being set as well.
</p>
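<p>
As an illustration, the debug output can be captured to a file for later
inspection. This is a minimal sketch: the helper names are hypothetical, the
value "1" is an arbitrary choice (any value is said to work), and the command
to run is whatever you normally use to start SA on your system.
</p>

```python
import os
import subprocess


def sa_debug_env(base_env=None):
    """Return a copy of the environment with SA debug output enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Any value enables debugging according to the text; "1" is arbitrary.
    env["LIBSAPROC_DEBUG"] = "1"
    return env


def run_with_debug(command, log_path, base_env=None):
    """Run command with debug output enabled, saving stderr to log_path."""
    with open(log_path, "wb") as log:
        completed = subprocess.run(
            command, env=sa_debug_env(base_env), stderr=log
        )
    return completed.returncode
```

<p>
The command argument is a list such as the one you would pass to
subprocess.run; the messages written to standard error end up in the log file.
</p>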
<p>
The best way to debug a transported core dump is to match the
debugger machine to the core dump machine,
i.e., to have the same kernel
and libthread patch level on both machines. mdb (the Solaris modular
debugger) may be used to find the kernel patch level of the core dump
machine, and the debugger machine may then be brought to the same level.
</p>
<p>
If the matching machine is "far off" in your network, then
</p>
<ul>
<li>consider using rlogin and <a href="clhsdb.html">CLHSDB - SA command line HSDB interface</a>, or</li>
<li>use SA remote debugging and debug the core from the core machine remotely.</li>
</ul>
<p>
However, it may not be feasible to find a matching machine to debug on.
If so, you can copy all application shared objects (and
libthread_db.so, if needed) from the core dump
machine into a directory on your debugger machine, say, /export/applibs.
Now, set the <b>SA_ALTROOT</b> environment variable to point to this directory.
</p>
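<p>
The copy step can be sketched as follows. This is only an illustration under
stated assumptions: mirror_into_altroot is a hypothetical helper, the core
dump machine's files are assumed to be reachable under some source directory
(for example a network mount or an unpacked archive), and the sketch assumes
that keeping each library's original full path below the /export/applibs
directory is an acceptable layout for SA_ALTROOT.
</p>

```python
import shutil
from pathlib import Path


def mirror_into_altroot(library_paths, source_root, altroot):
    """Copy each library from source_root into altroot, keeping its full path.

    library_paths are absolute paths as seen on the core dump machine;
    source_root is where that machine's files are reachable from here.
    """
    copied = []
    for lib in library_paths:
        # "/usr/lib/libfoo.so" becomes "usr/lib/libfoo.so".
        relative = Path(lib).relative_to("/")
        source = Path(source_root) / relative
        target = Path(altroot) / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        copied.append(str(target))
    return copied
```

<p>
After the copy, SA_ALTROOT would be set to the directory passed as altroot
before starting SA.
</p>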
<p>
Support for transported core dumps is <b>not</b> built into the standard version of
libproc.so. You need to
set the <b>LD_LIBRARY_PATH</b> environment variable to point to the path of a specially built version of
libproc.so. Note that this version of
libproc.so has a special symbol to support transported core dump debugging.
In the future, this feature may be built into the standard
libproc.so; if that happens, this step (setting
LD_LIBRARY_PATH) can be skipped.
</p>
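<p>
When LD_LIBRARY_PATH already has a value, the directory holding the special
libproc.so has to come first so that it is found before the standard one. A
minimal sketch of that step, with a hypothetical helper name and a directory
that you supply yourself:
</p>

```python
import os


def prepend_library_path(directory, base_env=None):
    """Return a copy of the environment with directory first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Entries are colon-separated; earlier entries are searched first.
    env["LD_LIBRARY_PATH"] = directory + (":" + existing if existing else "")
    return env
```

<p>
The returned dictionary can be passed as the environment of the process that
starts SA; the current process environment is left unchanged.
</p>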
<p>
If you are okay with missing thread-related information, you can set the
<b>SA_IGNORE_THREADDB</b> environment variable to any value. With this
set, SA ignores libthread_db failures, but you won't be able to get any
thread-related information. You would still be able to use SA and get
other information.
</p>
<h3>Linux SA against transported core dumps</h3>
<p>
On Linux, SA parses core and shared library ELF files. SA <b>does not</b> use
libthread_db.so or rtld_db.so for core dump debugging. However, you
may still face problems with transported core dumps, because matching shared
objects may not be in the path(s) specified in the core dump file. To
work around this, you can define the environment variable <b>SA_ALTROOT</b>
to be the directory where the shared libraries are kept. The semantics of
this environment variable are the same as on Solaris (see above).
</p>