libdwarf
Your thoughts on the document?
A) Are the section and subsection titles on Main Page meaningful to you?
B) Are the titles on the Modules page meaningful to you?
Anything else you find misleading or confusing? Send suggestions to ( libdwarf-list (at) prevanders with final characters .org ) Sorry about the simple obfuscation to keep bots away. It's actually a simple email address, not a list.
Thanks in advance for any suggestions.
This document describes an interface to libdwarf, a library of functions to provide access to DWARF debugging information records, DWARF line number information, DWARF address range and global names information, weak names information, DWARF frame description information, DWARF static function names, DWARF static variables, and DWARF type information. In addition the library provides access to several object sections (created by compiler writers and for debuggers) related to debugging but not mentioned in any DWARF standard.
The DWARF Standard has long mentioned the "Unix International Programming Languages Special Interest Group" (PLSIG), under whose auspices the DWARF committee was formed around 1991. "Unix International" was disbanded in the 1990s and no longer exists.
The DWARF committee published DWARF2 July 27, 1993, DWARF3 in 2005, DWARF4 in 2010, and DWARF5 in 2017.
In the mid 1990s this document and the library it describes (which the committee never endorsed, having decided not to endorse or approve any particular library interface) were made available on the internet by Silicon Graphics, Inc.
In 2005 the DWARF committee began an affiliation with FreeStandards.org. In 2007 FreeStandards.org merged with The Linux Foundation. The DWARF committee dropped its affiliation with FreeStandards.org in 2007 and established the dwarfstd.org website.
Libdwarf can safely open multiple Dwarf_Debug pointers simultaneously, but all such Dwarf_Debug pointers must be opened within the same thread, and all libdwarf calls must be made from within that same thread.
Essentially every libdwarf call could involve dealing with an error (possibly data corruption in the object file). Here we explain the two main approaches the library provides (though we think only one of them is truly appropriate except in toy programs). In all cases where the library returns an error code (almost every library function does) the caller should check whether the returned integer is DW_DLV_OK, DW_DLV_ERROR, or DW_DLV_NO_ENTRY and then act accordingly.
A) The recommended approach is to define a Dwarf_Error and initialize it to 0.
Then, in every call where there is a Dwarf_Error argument pass its address. For example:
The possible return values to res are, in general, DW_DLV_OK, DW_DLV_ERROR, and DW_DLV_NO_ENTRY.
If DW_DLV_ERROR is returned then error is set (by the library) to a pointer to important details about the error and the library will not pass back any data through other pointer arguments. If DW_DLV_NO_ENTRY is returned the error argument is ignored by the library and the library will not pass back any data through pointer arguments. If DW_DLV_OK is returned argument pointers that are defined as ways to return data to your code are used and values are set in your data by the library.
Some functions cannot possibly return some of these three values, as defined later for each function.
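The three outcomes described above can be summarized in a small sketch. The helper names result_has_data() and result_has_error_details() are hypothetical and not part of libdwarf, and the three macros are stand-ins (with the same numeric values as in libdwarf.h) so the fragment compiles without the library.

```c
/* Stand-in definitions so this sketch compiles on its own.
   Real code gets these from libdwarf.h. */
#ifndef DW_DLV_OK
#define DW_DLV_NO_ENTRY -1
#define DW_DLV_OK        0
#define DW_DLV_ERROR     1
#endif

/* Hypothetical helper: nonzero only when the library has
   stored values through the data-returning pointer arguments. */
static int
result_has_data(int res)
{
    return res == DW_DLV_OK;
}

/* Hypothetical helper: nonzero only when the Dwarf_Error
   argument was set and should be examined and deallocated. */
static int
result_has_error_details(int res)
{
    return res == DW_DLV_ERROR;
}
```

For DW_DLV_NO_ENTRY both helpers return zero, matching the rule that neither data nor error details are passed back in that case.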
B) An alternative (not recommended) approach is to pass NULL to the error argument.
If your initialization provided an 'errhand' function pointer argument (see below) the library will call errhand if an error is encountered. (Your errhand function could exit if you so choose.)
The library will then return DW_DLV_ERROR, though you will have no way to identify what the error was. It could be a malloc failure, data corruption, an invalid argument to the call, or something else.
That is the whole picture. The library never calls exit() under any circumstances.
Each initialization call (for example)
has two arguments that appear nowhere else in the library.
For the recommended A) approach:
Just pass NULL to both those arguments. If the initialization call returns DW_DLV_ERROR you should then call dwarf_dealloc_error(dbg, error) to free the Dwarf_Error data, because dwarf_finish() does not clean up a dwarf-init error. This works even though dbg will be NULL.
For the not recommended B) approach:
Because dw_errarg is a general pointer one could create a struct with data of interest and use a pointer to the struct as the dw_errarg. Or one could use an integer or NULL; it just depends on what you want to do in the Dwarf_Handler function you write.
If you wish to provide a dw_errhand, define a function (this first example is not a good choice as it terminates the application!).
and pass bad_dw_errhandler (as a function pointer, no parentheses).
The Dwarf_Ptr argument your error handler function receives is the value you passed in as dw_errarg, and can be anything. It allows you to associate the callback with a particular dwarf_init* call if you wish to make such an association.
By doing an exit() you guarantee that your application abruptly stops. This is only acceptable in toy or practice programs.
A better dw_errhand function is
because it returns rather than exiting. It is not ideal. The DW_DLV_ERROR code is returned from libdwarf and your code can do what it likes with the error situation. The library will continue from the error and will return an error code on returning to your libdwarf call, but the calling function will not know what the error was.
If you do not wish to provide a dw_errhand, just pass both arguments as NULL.
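As an illustration of using dw_errarg to carry caller data, here is a minimal sketch of a handler that records the event and returns. The struct error_tally and the function counting_errhandler() are hypothetical names for this example. The two typedefs are stand-ins so the fragment compiles without libdwarf.h; remove them when including the real header.

```c
/* Stand-in typedefs so the sketch compiles without libdwarf.h. */
typedef struct Dwarf_Error_s *Dwarf_Error;
typedef void *Dwarf_Ptr;

/* Hypothetical caller-owned record, passed as dw_errarg. */
struct error_tally {
    const char *object_name; /* which init call this belongs to */
    unsigned    error_count; /* handler invocations seen so far */
};

/* Has the (Dwarf_Error, Dwarf_Ptr) shape of a Dwarf_Handler.
   It notes the error and returns; it never calls exit(). */
static void
counting_errhandler(Dwarf_Error error, Dwarf_Ptr errarg)
{
    struct error_tally *tally = (struct error_tally *)errarg;

    (void)error; /* real code could inspect it, e.g. dwarf_errmsg() */
    if (tally) {
        tally->error_count++;
    }
}
```

One would pass counting_errhandler as dw_errhand and the address of an error_tally as dw_errarg. Even so, the calling code only sees DW_DLV_ERROR, which is why approach A) remains the recommended one.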
So let us examine a simple case where anything could happen. We are taking the recommended A) method of using a non-null Dwarf_Error*:
When res == DW_DLV_OK newdie is a valid pointer and when appropriate we should do dwarf_dealloc_die(newdie). For other libdwarf calls the meaning depends on the function called, so read the description of the function you called for more information.
When res == DW_DLV_NO_ENTRY then newdie is not set and there is no error. It means die was the last of a sibling list. For other libdwarf calls the meaning depends on the function called, so read the description of the function you called for more information.
When res == DW_DLV_ERROR something bad happened. The only way to know what happened is to examine the *error, as in dwarf_errno(*error) or dwarf_errmsg(*error) or both, and report that somehow.
The above three values are the only returns possible from the great majority of libdwarf functions, and for these functions the return type is always int.
If it is a decently large or long-running program then you want to free any local memory you allocated and return res. If it is a small or experimental program print something and exit (possibly leaking memory).
If you want to discard the error report from the dwarf_siblingof_c() call then possibly do
Except in a special case involving function dwarf_set_de_alloc_flag() (which you will not usually call), any dwarf_dealloc() that is needed will happen automatically when you call dwarf_finish().
Very long running library access programs that make the appropriate dwarf_dealloc calls should consider calling dwarf_set_de_alloc_flag(0). Doing so could give a performance enhancement of perhaps five percent in libdwarf CPU time and a reduction in memory use.
Be sure to test using valgrind or -fsanitize to ensure your code really does the extra dwarf_dealloc calls needed since when using dwarf_set_de_alloc_flag(0) dwarf_finish() does only limited cleanup.
The library is designed to run a single pass through the set of Compilation Units (CUs), via a sequence of calls to dwarf_next_cu_header_e(). (dwarf_next_cu_header_d() is supported but its use requires that it be immediately followed by a call to dwarf_siblingof_b(). See dwarf_next_cu_header_d().)
Within a CU opened with dwarf_next_cu_header_e() do something (if desired) on the CU_DIE returned, and call dwarf_child() on the CU_DIE to begin recursing through all DIEs. If you save the CU_DIE you can repeat passes beginning with dwarf_child() on the CU_DIE, though it is almost certainly faster to remember, in your data structures, what you need from the first pass.
The general plan:
For an example (best approach)
Line Table Registers
Please refer to the DWARF5 Standard for details. The line table registers are named in Section 6.2.2 State Machine Registers and are not much changed from DWARF2.
Certain functions on Dwarf_Line data return values for these 'registers' as these are the data available for debuggers and other tools to relate a code address to a source file name and possibly also to a line number and column-number within the source file.
DWARF defines (in each version of DWARF) sections which have a somewhat special character. These are referenced from compilation units and other places and the Standard does not forbid blocks of random bytes at the start or end or between the areas referenced from elsewhere.
Sometimes compilers (or linkers) leave trash behind as a result of optimizations. If a lot of space is wasted that way it is a quality-of-implementation issue. But usually the wasted space, if any, is small.
Compiler writers or others may be interested in looking at these sections independently so libdwarf provides functions that allow reading the sections without reference to what references them.
Abbreviations can be read independently
Strings can be read independently
String Offsets can be read independently
The addr table can be read independently
Those functions allow starting at byte 0 of the section and provide a length so you can calculate the next section offset to call or refer to.
Usually that works fine. If there is some random data somewhere outside of referenced areas or the data format is a gcc extension of an early DWARF version the reader function may fail, returning DW_DLV_ERROR. Such an error is neither a compiler bug nor a libdwarf bug.
In dealing with .debug_frame or .eh_frame there are five values that must be set unless one has relatively few registers in the target ABI (anything under 188 registers, see dwarf.h DW_FRAME_LAST_REG_NUM for this default).
The requirements stem from the design of the section. See the DWARF5 Standard for details. The .debug_frame section is basically the same from DWARF2 on. The .eh_frame section is similar to .debug_frame but is intended to support exception handling and has fields and data not present in .debug_frame.
Keep in mind that register values correspond to columns in the theoretical fully complete frame table, which has a row per pc and a column per register.
There is no time or space penalty in setting Undefined_Value, Same_Value, and CFA_Column much larger than the Table_Size.
Here are the five values.
Table_Size: This sets the number of columns in the theoretical table. It starts at DW_FRAME_LAST_REG_NUM which defaults to 188. This is the only value you might need to change, given that the defaults of the others are reasonably large.
Undefined_Value: A register number that means the register value is undefined, for example due to a call clobbering the register. DW_FRAME_UNDEFINED_VAL defaults to 12288. There is no such column in the table.
Same_Value: A register number that means the register value is the same as the value at the call. Nothing can have clobbered it. DW_FRAME_SAME_VAL defaults to 12289. There is no such column in the table.
Initial_Value: The value must be either DW_FRAME_UNDEFINED_VAL or DW_FRAME_SAME_VAL to represent how most registers are to be thought of at a function call. This is a property of the ABI and instruction set. Specific frame instructions in the CIE or FDE will override this for registers not matching this value.
CFA_Column: A number for the CFA. Defined so we can use a register number to refer to it. DW_FRAME_CFA_COL defaults to 12290. There is no such column in the table. See libdwarf.h struct Dwarf_Regtable3_s member rt3_cfa_rule or function dwarf_get_fde_info_for_cfa_reg3_b() or function dwarf_get_fde_info_for_cfa_reg3_c().
A set of functions allows these to be changed at runtime. The set should be called (if needed) immediately after initializing a Dwarf_Debug and before any other calls on that Dwarf_Debug. If just one value (for example, Table_Size) needs altering, then just call that single function.
For the library accessing frame data to work properly there are certain invariants that must be true once the set of functions has been called.
REQUIRED:
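As a rough aid, the constraints that can be read from the descriptions of the five values can be checked with a sketch like the one below. This is not the library's own check. The struct frame_config and the function frame_config_is_consistent() are hypothetical names. Requiring the three special values to be mutually distinct and strictly greater than Table_Size is a conservative assumption drawn from the statements that they are register numbers with no column in the table.

```c
/* Hypothetical container for the five values named in the text. */
struct frame_config {
    unsigned table_size;
    unsigned undefined_value;
    unsigned same_value;
    unsigned initial_value;
    unsigned cfa_column;
};

/* Returns 1 if the values satisfy the constraints assumed in
   this sketch, else 0. */
static int
frame_config_is_consistent(const struct frame_config *c)
{
    /* The three special "registers" must be distinguishable. */
    if (c->undefined_value == c->same_value) return 0;
    if (c->cfa_column == c->undefined_value) return 0;
    if (c->cfa_column == c->same_value) return 0;
    /* Initial_Value must be one of the two special values. */
    if (c->initial_value != c->undefined_value &&
        c->initial_value != c->same_value) return 0;
    /* None of the special values may be a real table column. */
    if (c->undefined_value <= c->table_size) return 0;
    if (c->same_value <= c->table_size) return 0;
    if (c->cfa_column <= c->table_size) return 0;
    return 1;
}
```

With the defaults quoted above (188, 12288, 12289, 12290) and Initial_Value set to either special value, the check passes. Raising Table_Size past 12288 without also raising the other values makes it fail.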
Each section consists of a header for a specific compilation unit (CU) followed by a set of tuples, each tuple consisting of an offset of a compilation unit followed by a null-terminated namestring. The tuple set is ended by a 0,0 pair. This is then followed by the data for the next CU, and so on.
The function set provided for each such section allows one to print all the section data as it literally appears in the section (with headers and tuples) or to treat it as a single array with CU data columns.
Each has a set of 6 functions.
These sections are accessed calling dwarf_globals_by_type() using type of DW_GL_GLOBALS or DW_GL_PUBTYPES. Or call dwarf_get_pubtypes().
The following four were defined in SGI/IRIX compilers in the 1990s but were never part of the DWARF standard. These sections are accessed calling dwarf_globals_by_type() using type of DW_GL_FUNCS, DW_GL_TYPES, DW_GL_VARS, or DW_GL_WEAKS.
It is not likely you will encounter these four sections.
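To make the tuple layout described earlier in this section concrete, here is a minimal sketch that walks one tuple set held in memory. It is not how libdwarf reads these sections. It assumes 4-byte offsets in host byte order, skips the per-CU header entirely, and treats a zero offset as the end of the set. The names tuple_visitor and walk_name_tuples() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: receives one (offset, name) tuple. */
typedef void (*tuple_visitor)(uint32_t offset, const char *name,
    void *user);

/* Walk the tuples in buf[0..len). Returns the number of tuples
   visited, or -1 if the data runs past the end of the buffer. */
static long
walk_name_tuples(const unsigned char *buf, size_t len,
    tuple_visitor visit, void *user)
{
    size_t pos = 0;
    long count = 0;

    for (;;) {
        uint32_t off = 0;
        const unsigned char *nul = 0;

        if (len - pos < sizeof(off)) return -1;
        memcpy(&off, buf + pos, sizeof(off));
        pos += sizeof(off);
        if (off == 0) return count;      /* end of this set */
        nul = memchr(buf + pos, 0, len - pos);
        if (!nul) return -1;             /* unterminated name */
        if (visit) visit(off, (const char *)(buf + pos), user);
        pos = (size_t)(nul - buf) + 1;   /* step past the NUL */
        count++;
    }
}
```

Every read is bounds-checked because, as with any object data, the section contents may be corrupt.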
This most commonly happens with just-in-time compilation, when someone working on the code wants to debug this on-the-fly code in a situation where nothing can be written to disk, but DWARF can be constructed in memory.
For a simple example of this
But the libdwarf feature can be used in a wide variety of ways.
For example, the DWARF data could be kept in simple files of bytes on the internet. Or on the local net. Or if files can be written locally each section could be kept in a simple stream of bytes in the local file system.
Another example is a non-standard file system, or file format, with the intent of obfuscating the file or the DWARF.
For this to work the code generator must generate standard DWARF.
Overall the idea is a simple one: You write a small handful of functions and supply function pointers and code implementing the functions. These are part of your application or library, not part of libdwarf.
You set up a little bit of data with that code (all described below) and then you have essentially written the dwarf_init_path equivalent and you can access compilation units, line tables etc and the standard libdwarf function calls work.
Data you need to create involves these types. What follows describes how to fill them in and how to make them work for you.
Dwarf_Obj_Access_Section_a: Your implementation of an om_get_section_info must fill in a few fields for libdwarf. The fields here are standard Elf, but for most you can just use the value zero. We assume here you will not be doing relocations at runtime.
as_name: Here you set a section name via the pointer. The section names must be names as defined in the DWARF standard, so if such do not appear in your data you have to create the strings yourself.
as_type: Fill in zero.
as_flags: Fill in zero.
as_addr: Fill in the address, in local memory, where the bytes of the section are.
as_offset: Fill in zero.
as_size: Fill in the size, in bytes, of the section you are telling libdwarf about.
as_link: Fill in zero.
as_info: Fill in zero.
as_addralign: Fill in zero.
as_entrysize: Fill in one (1).
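The field-by-field instructions above can be sketched as a single fill function. The struct demo_section_info is a stand-in that only mirrors the field names listed above, and its field types are assumptions; real code would fill the struct declared in libdwarf.h. The function name describe_memory_section() is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the field names listed in the text. */
struct demo_section_info {
    const char        *as_name;
    unsigned long long as_type;
    unsigned long long as_flags;
    unsigned long long as_addr;
    unsigned long long as_offset;
    unsigned long long as_size;
    unsigned long long as_link;
    unsigned long long as_info;
    unsigned long long as_addralign;
    unsigned long long as_entrysize;
};

/* Describe one in-memory section: a name from the DWARF
   standard, where the bytes live, and how many there are. */
static void
describe_memory_section(struct demo_section_info *out,
    const char *dwarf_section_name,
    const void *bytes, size_t byte_count)
{
    out->as_name = dwarf_section_name;
    out->as_type = 0;
    out->as_flags = 0;
    out->as_addr = (unsigned long long)(uintptr_t)bytes;
    out->as_offset = 0;
    out->as_size = byte_count;
    out->as_link = 0;
    out->as_info = 0;
    out->as_addralign = 0;
    out->as_entrysize = 1;
}
```

Only the name, address, and size vary from section to section; everything else is a constant, which is why a helper of this kind keeps an om_get_section_info implementation short.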
Dwarf_Obj_Access_Methods_a_s: The functions we need to access object data from libdwarf are declared here.
In these function pointer declarations 'void *obj' is intended to be a pointer (the object field in Dwarf_Obj_Access_Interface_s) that hides the library-specific and object-specific data that makes it possible to handle multiple object formats and multiple libraries. It is not required that one handles multiple such in a single libdwarf archive/shared-library (but not ruled out either). See dwarf_elf_object_access_internals_t and dwarf_elf_access.c for an example.
Usually the struct Dwarf_Obj_Access_Methods_a_s is statically defined and the function pointers are set at compile time.
The om_get_filesize member is new September 4, 2021. Its position is NOT at the end of the list. The member names all now have om_ prefix.
A typical executable or shared object is unlikely to have any section groups, and in that case what follows is irrelevant and unimportant.
COMDAT groups are defined by the Elf ABI and enable compilers and linkers to work together to eliminate blocks of duplicate DWARF and duplicate CODE.
Split Dwarf (sometimes referred to as Debug Fission) allows compilers and linkers to separate large amounts of DWARF from the executable, shrinking disk space needed in the executable while allowing full debugging (also applies to shared objects).
See the DWARF5 Standard, Section E.1 Using Compilation Units page 364.
To name COMDAT groups (defined later here) we add the following defines to libdwarf.h (the DWARF standard does not specify how to do any of this).
The DW_GROUPNUMBER_ values are used in libdwarf functions dwarf_init_path(), dwarf_init_path_dl() and dwarf_init_b(). In all those cases, unless you know there is some complexity in your object file, pass in DW_GROUPNUMBER_ANY.
To see section groups usage, see the example source:
The function interface declarations:
If an object file has multiple groups libdwarf will not reveal contents of more than the single requested group with a given dwarf_init_path() call. One must pass in another groupnumber to another dwarf_init_path(), meaning initialize a new Dwarf_Debug, to get libdwarf to access that group.
When opening a Dwarf_Debug the following applies:
If DW_GROUPNUMBER_ANY is passed in libdwarf will choose either of DW_GROUPNUMBER_BASE (1) or DW_GROUPNUMBER_DWO (2) depending on the object content. If both groups one and two are in the object libdwarf will choose DW_GROUPNUMBER_BASE.
If DW_GROUPNUMBER_BASE is passed in libdwarf will choose it if non-split DWARF is in the object, else the init call will return DW_DLV_NO_ENTRY.
If DW_GROUPNUMBER_DWO is passed in libdwarf will choose it if .dwo sections are in the object, else the init call will return DW_DLV_NO_ENTRY.
If a groupnumber greater than two is passed in libdwarf accepts it, whether any sections corresponding to that groupnumber exist or not. If the groupnumber is not an actual group the init call will return DW_DLV_NO_ENTRY.
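The selection rules above can be modeled in a few lines. This is a model of the documented behavior, not libdwarf's implementation. The function model_group_choice() and its parameters are hypothetical, the macros are stand-ins with the values used by libdwarf.h, and the result for DW_GROUPNUMBER_ANY on an object with neither group one nor group two is an assumption because the text does not cover that case.

```c
/* Stand-in definitions so this sketch compiles on its own. */
#ifndef DW_DLV_OK
#define DW_DLV_NO_ENTRY -1
#define DW_DLV_OK        0
#endif
#ifndef DW_GROUPNUMBER_ANY
#define DW_GROUPNUMBER_ANY  0
#define DW_GROUPNUMBER_BASE 1
#define DW_GROUPNUMBER_DWO  2
#endif

/* has_base: non-split DWARF is present.
   has_dwo: .dwo sections are present.
   comdat_group_exists: the requested number (> 2) is a group. */
static int
model_group_choice(unsigned requested, int has_base, int has_dwo,
    int comdat_group_exists, unsigned *chosen)
{
    switch (requested) {
    case DW_GROUPNUMBER_ANY:
        if (has_base) { *chosen = DW_GROUPNUMBER_BASE; return DW_DLV_OK; }
        if (has_dwo)  { *chosen = DW_GROUPNUMBER_DWO;  return DW_DLV_OK; }
        return DW_DLV_NO_ENTRY; /* assumption: text is silent here */
    case DW_GROUPNUMBER_BASE:
        if (!has_base) return DW_DLV_NO_ENTRY;
        break;
    case DW_GROUPNUMBER_DWO:
        if (!has_dwo) return DW_DLV_NO_ENTRY;
        break;
    default:
        if (!comdat_group_exists) return DW_DLV_NO_ENTRY;
        break;
    }
    *chosen = requested;
    return DW_DLV_OK;
}
```

Note that DW_GROUPNUMBER_ANY prefers the base group when both are present, so reading the .dwo group of such an object requires asking for DW_GROUPNUMBER_DWO explicitly.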
For information on groups, "dwarfdump -i" on an object file will show all section group information unless the object file is a simple standard object with no .dwo sections and no COMDAT groups (in which case the output will be silent on groups). Look for Section Groups data in the dwarfdump output. The groups information appears very early in the dwarfdump output.
Sections that are part of an Elf COMDAT GROUP are assigned a group number > 2. There can be many such COMDAT groups in an object file (but none in an executable or shared object). Each such COMDAT group will have a small set of sections in it and each section in such a group will be assigned the same group number by libdwarf.
Sections that are in a .dwp or .dwo object file are assigned to DW_GROUPNUMBER_DWO.
Sections not part of a .dwp package file, a .dwo section, or a COMDAT group are assigned DW_GROUPNUMBER_BASE.
At least one compiler relies on relocations to identify COMDAT groups, but the compiler authors do not publicly document how this works so we ignore such (these COMDAT groups will result in libdwarf returning DW_DLV_ERROR).
Popular compilers and tools are using such sections. There is no detailed documentation that we can find (so far) on how the COMDAT section groups are used, so libdwarf is based on observations of what compilers generate.
There are, at present, three distinct approaches in use to put DWARF information into separate objects to significantly shrink the size of the executable. All of them involve identifying a separate file.
Split Dwarf is one method. It defines the attribute DW_AT_dwo_name (if present) as having a file-system appropriate name of the split object with most of the DWARF.
The second is macOS dSYM. It is a convention of placing the DWARF-containing object (separate from the object containing code) in a specific subdirectory tree.
The third involves GNU debuglink and GNU debug_id. These are two distinct ways (outside of DWARF) to provide names of alternative DWARF-containing objects elsewhere in a file system.
If one initializes a Dwarf_Debug object with dwarf_init_path() or dwarf_init_path_dl() appropriately libdwarf will automatically open the alternate dSYM or debuglink/debug_id object on the object with most of the DWARF.
libdwarf provides means to automatically read the alternate object (in place of the one named in the init call) or to suppress that and read the named object file.
Case 1:
If dw_true_path_out_buffer or dw_true_path_bufferlen is passed in as zero then the library will not look for an alternative object.
Case 2:
If dw_true_path_out_buffer passes a pointer to space you provide and dw_true_path_bufferlen passes in the length, in bytes, of the buffer, libdwarf will look for alternate DWARF-containing objects. We advise that the caller zero all the bytes in dw_true_path_out_buffer before calling.
If the alternate object name (with its null-terminator) is too long to fit in the buffer the call will return DW_DLV_ERROR with dw_error providing error code DW_DLE_PATH_SIZE_TOO_SMALL.
If the alternate object name fits in the buffer libdwarf will open and use that alternate file in the returned Dwarf_Debug.
It is up to callers to notice that dw_true_path_out_buffer now contains a string and callers will probably wish to do something with the string.
If the initial byte of dw_true_path_out_buffer is non-null when the call returns then an alternative object was found and opened.
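The buffer handling for Case 2 can be sketched as two small helpers, one used before the init call and one after it returns. The helper names and the buffer size are hypothetical; only the zero-before, check-first-byte-after convention comes from the text above.

```c
#include <string.h>

/* Hypothetical size; choose one large enough for your paths. */
#define TRUE_PATH_BUFLEN 2000

/* Before the init call: zero every byte, as advised above. */
static void
prepare_true_path_buffer(char *buf, size_t buflen)
{
    memset(buf, 0, buflen);
}

/* After the init call returns: nonzero means an alternate
   object was found and opened, and buf holds its path. */
static int
alternate_object_opened(const char *buf)
{
    return buf[0] != '\0';
}
```

A caller would declare a char array of TRUE_PATH_BUFLEN bytes, prepare it, pass it and its length as dw_true_path_out_buffer and dw_true_path_bufferlen, and then test it with alternate_object_opened() before reporting which file was actually read.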
The second function, dwarf_init_path_dl(), is the same as dwarf_init_path() except the _dl version has three additional arguments, as follows:
Pass in NULL or dw_dl_path_array, an array of pointers to strings with alternate GNU debuglink paths you want searched. For most people, passing in NULL suffices.
Pass in dw_dl_path_array_size, the number of elements in dw_dl_path_array.
Pass in dw_dl_path_source as NULL or a pointer to char. If non-null libdwarf will set it to one of three values:
If you wish to do the basic libdwarf tests and are linking against a shared library libdwarf you must do an install for the tests to succeed (in some environments it is not strictly necessary).
For example, if building with configure, do
You can install anywhere, there is no need to install in a system directory! Creating a temporary directory and installing there suffices. If installed in appropriate system directories that works too.
When compiling to link against a shared library libdwarf you must not define LIBDWARF_STATIC.
For examples of this for all three build systems read the project shell script
To pass LIBDWARF_STATIC to the preprocessor with Visual Studio:
GNU Debuglink-specific issue:
If GNU debuglink is present and considered by dwarf_init_path() or dwarf_init_path_dl() the library may be required to compute a 32bit crc (Cyclic Redundancy Check) on the file found via GNU debuglink.
For people doing repeated builds of objects using such the crc check is a waste of time as they know the crc comparison will pass.
For such situations a special interface function lets the dwarf_init_path() or dwarf_init_path_dl() caller suppress the crc check without having any effect on anything else in libdwarf.
It might be used as follows (the same pattern applies to dwarf_init_path_dl() ) for any program that might do multiple dwarf_init_path() or dwarf_init_path_dl() calls in a single program execution.
This pattern ensures the crc check is suppressed for this single dwarf_init_path() or dwarf_init_path_dl() call while leaving the setting unchanged for further dwarf_init_path() or dwarf_init_path_dl() calls in the running program.
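The save, call, restore sequence described above can be written as a small generic wrapper. In this sketch the wrapper init_with_crc_suppressed() and the two demo_ functions are hypothetical. The demo_ functions stand in for the library so the fragment compiles alone. It assumes the suppression function has the shape int f(int) and returns the previous setting, which is our understanding of dwarf_suppress_debuglink_crc(); check libdwarf.h before relying on it.

```c
/* Shape assumed for the crc-suppression setter: takes the new
   setting, returns the previous one. */
typedef int (*crc_flag_setter)(int suppress);
/* Hypothetical callback that performs the actual init call. */
typedef int (*init_callback)(void *context);

/* Suppress the crc check for exactly one init call, then put
   the setting back to whatever it was before. */
static int
init_with_crc_suppressed(crc_flag_setter set_flag,
    init_callback do_init, void *context)
{
    int previous = set_flag(1);
    int res = do_init(context);

    set_flag(previous);
    return res; /* check res in the usual way */
}

/* Stand-ins for the library, for demonstration only. */
static int demo_crc_flag = 0;

static int
demo_set_crc_flag(int suppress)
{
    int old = demo_crc_flag;

    demo_crc_flag = suppress;
    return old;
}

static int
demo_init(void *context)
{
    /* Record the setting that was in force during the call. */
    *(int *)context = demo_crc_flag;
    return 0;
}
```

Restoring the saved value, rather than unconditionally clearing the flag, is what keeps later init calls in the same program unaffected.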
We list these with newest first.
Changes 0.10.1 to 0.11.0
Added function dwarf_get_ranges_baseaddress() to the api to allow dwarfdump and other library callers to easily derive the (cooked) address from the raw data in the DWARF2, DWARF3, DWARF4 .debug_ranges section. An example of use is in doc/checkexamples.c (see examplev).
Changes 0.9.2 to 0.10.1
Released 01 July 2024 (Release 0.10.0 was missing a CMakeLists.txt file and is withdrawn).
Added API function dwarf_get_locdesc_entry_e() to allow dwarfdump to report some data from .debug_loclists more completely – it reports a byte length of each loclist item. This is of little interest to anyone, surely. dwarf_get_locdesc_entry_d() is still what you should be using.
dwarf_debug_addr_table() now supports reading the DWARF4 GNU extension .debug_addr table.
A heuristic sanity check for PE object files was too conservative in limiting VirtualSize to 200MB. A library user has an exe with .debug_info size of over 200MB. Increased the limit to be 2000MB and changed the names of the errors for the three heuristic checks to include HEURISTIC so it is easier to know the kind of error/failure it is.
When doing a shared-library build with cmake we were not emitting the correct .so version names nor setting SONAME with the correct version name. This long-standing mistake is now fixed.
Changes 0.9.1 to 0.9.2
Version 0.9.2 released 2 April 2024
Vulnerabilities DW202402-001, DW202402-002, DW202402-003, and DW202403-001 could crash libdwarf given a carefully corrupted (fuzzed) DWARF object file. Now the library returns an error for these corruptions. DW_CFA_high_user (in dwarf.h) was a misspelling. Added the correct spelling DW_CFA_hi_user and a comment on the incorrect spelling.
Changes 0.9.0 to 0.9.1
Version 0.9.1 released 27 January 2024
The abbreviation code type returned by dwarf_die_abbrev_code() changed from int to Dwarf_Unsigned as abbrev codes are not constrained by the DWARF Standard.
The section count returned by dwarf_get_section_count() is now of type Dwarf_Unsigned. The previous type of int never made sense in libdwarf. Callers will, in practice, see the same value as before.
All type-warnings issued by MSVC have been fixed.
Problems reading Macho (Apple) relocatable object files have been fixed.
Each of the build systems available now has an option which eliminates libdwarf references to the object section decompression libraries. See the respective READMEs.
Changes 0.8.0 to 0.9.0
Version 0.9.0 released 8 December 2023
Adding functions (rarely needed) for callers with special requirements. Added dwarf_get_section_info_by_name_a() and dwarf_get_section_info_by_index_a() which add dw_section_flags pointer argument to return the object section file flags (whose meaning depends entirely on the object file format), and dw_section_offset pointer argument to return the object-relevant offset of the section (here too the meaning depends on the object format). Also added dwarf_machine_architecture() which returns a few top level data items about the object libdwarf has opened, including the 'machine' and 'flags' from object headers (all supported object types).
This adds new library functions dwarf_next_cu_header_e() and dwarf_siblingof_c(). Used exactly as documented dwarf_next_cu_header_d() and dwarf_siblingof_b() work fine and continue to be supported for the foreseeable future. However they would be easy to misuse, as the requirement that dwarf_siblingof_b() be called immediately after a successful call to dwarf_next_cu_header_d() was never stated and that dependency was impossible to enforce. The dependency was an API mistake made in 1992.
So dwarf_next_cu_header_e() now returns the compilation-unit DIE as well as header data, and dwarf_siblingof_c() is not needed except to traverse sibling DIEs (the compilation-unit DIE by definition has no siblings).
Changes were required to support Mach-O (Apple) universal binaries, which were not readable by earlier versions of the library.
We have new library functions dwarf_init_path_a(), dwarf_init_path_dl_a(), and dwarf_get_universalbinary_count().
The first two allow a caller to specify which (numbering from zero) object file to report on by adding a new argument dw_universalnumber. Passing zero as the dw_universalnumber argument is always safe.
The third lets callers retrieve the number being used.
These new calls do not replace anything so existing code will work fine.
Applying the previously existing calls dwarf_init_path() and dwarf_init_path_dl() to a Mach-O universal binary works, but the library will return data on the first (index zero) object as a default since there is no dw_universalnumber argument possible.
For improved performance in reading Fde data when iterating through all usable pc values we add dwarf_get_fde_info_for_all_regs3_b(), which returns the next pc value with actual frame data. We retain dwarf_get_fde_info_for_all_regs3() so existing code need not change.
Changes 0.7.0 to 0.8.0
v0.8.0 released 2023-09-20
New functions dwarf_get_fde_info_for_reg3_c(), dwarf_get_fde_info_for_cfa_reg3_c() are defined. The advantage of the new versions is they correctly type the dw_offset argument return value as Dwarf_Signed instead of the earlier and incorrect type Dwarf_Unsigned.
The original functions dwarf_get_fde_info_for_reg3_b() and dwarf_get_fde_info_for_cfa_reg3_b() continue to exist and work for compatibility with the previous release.
For all open() calls for which the O_CLOEXEC flag exists we now add that flag to the open() call.
Vulnerabilities involving reading corrupt object files (created by fuzzing) have been fixed: DW202308-001 (ossfuzz 59576), DW202307-001 (ossfuzz 60506), DW202306-011 (ossfuzz 59950), DW202306-009 (ossfuzz 59755), DW202306-006 (ossfuzz 59727), DW202306-005 (ossfuzz 59717), DW202306-004 (ossfuzz 59695), DW202306-002 (ossfuzz 59519), DW202306-001 (ossfuzz 59597), DW202305-010 (ossfuzz 59478), DW202305-009 (ossfuzz 56451), DW202305-008 (ossfuzz 56451), DW202305-007 (ossfuzz 56474), DW202305-006 (ossfuzz 56472), DW202305-005 (ossfuzz 56462), DW202305-004 (ossfuzz 56446).
Changes 0.6.0 to 0.7.0
v0.7.0 released 2023-05-20
Elf section counts can exceed 16 bits (on linux see man 5 elf) so some function prototype members of struct Dwarf_Obj_Access_Methods_a_s changed. Specifically, om_get_section_info(), om_load_section(), and om_relocate_a_section() now pass section indexes as Dwarf_Unsigned instead of Dwarf_Half. Without this change executables/objects with more than 64K sections cannot be read by libdwarf. This is unlikely to affect your code since for most users libdwarf takes care of this and dwarfdump is aware of this change.
Two functions have been removed from libdwarf.h and the library: dwarf_dnames_abbrev_by_code() and dwarf_dnames_abbrev_form_by_index().
dwarf_dnames_abbrev_by_code() is slow and pointless. Use either dwarf_dnames_name() or dwarf_dnames_abbrevtable() instead, depending on what you want to accomplish.
dwarf_dnames_abbrev_form_by_index() is not needed, was difficult to call due to argument list requirements, and never worked.
Changes 0.5.0 to 0.6.0
v0.6.0 released 2023-02-20
The dealloc required by dwarf_offset_list() was wrong. The call could crash libdwarf on systems with 32bit pointers. The new and proper dealloc (for all pointer sizes) is dwarf_dealloc(dbg,offsetlistptr,DW_DLA_UARRAY);
A memory leak from dwarf_load_loclists() and dwarf_load_rnglists() is fixed and the libdwarf-regressiontests error that hid the leak has also been fixed.
A compatibility change affects callers of dwarf_dietype_offset(), which on success returns the offset of the target of the DW_AT_type attribute (if such exists in the Dwarf_Die). Added a pointer argument so the function can (when appropriate) return FALSE through that argument when the offset refers to the DWARF4 .debug_types section, or TRUE when the offset refers to .debug_info. If anyone was using this function it would fail badly (while pretending success) with a DWARF4 DW_FORM_ref_sig8 on a DW_AT_type attribute from the Dwarf_Die argument. One will likely encounter DWARF4 content so a single correct function seemed necessary. New regression tests will ensure this will continue to work.
A compatibility change affects callers of dwarf_get_pubtypes(). If an application reads .debug_pubtypes there is a compatibility break. Such applications must change their Dwarf_Type declarations to use Dwarf_Global, must be recompiled with the latest libdwarf, and can only use the latest libdwarf. We are correcting a 1993 library design mistake that created extra work for library users and inflated the libdwarf API and documentation for no good reason.
The changes are: the data type Dwarf_Type disappears, as do dwarf_pubtypename(), dwarf_pubtype_die_offset(), dwarf_pubtype_cu_offset(), dwarf_pubtype_name_offsets(), and dwarf_pubtypes_dealloc(). Instead the type is Dwarf_Global, the type and functions used for dwarf_get_globals(). The existing read/dealloc functions for Dwarf_Global apply to pubtypes data too.
No one should be referring to the 1990s SGI/IRIX sections .debug_weaknames, .debug_funcnames, .debug_varnames, or .debug_typenames as they are not emitted by any compiler except from SGI/IRIX/MIPS in that period. There is (revised) support in libdwarf to read these sections, but we will not mention details here.
Any use of DW_FORM_strx3 or DW_FORM_addrx3 in DWARF would, in 0.5.0 and earlier, result in libdwarf reporting erroneous data. A copy-paste error in libdwarf/dwarf_util.c was noticed and fixed 24 January 2023 for 0.6.0. Bug DW202301-001.
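For readers unfamiliar with these forms: DW_FORM_strx3 and DW_FORM_addrx3 each carry a 3-byte unsigned index, a width with no native C integer type, so it has to be assembled byte by byte. The following is a minimal sketch of such a decode. The function demo_read_u24() and its big_endian flag are hypothetical names made up for this illustration. It is not the libdwarf code that contained the bug, nor libdwarf's actual reader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a libdwarf function): decode the 3-byte
   unsigned operand of a form such as DW_FORM_strx3 or DW_FORM_addrx3.
   big_endian is nonzero when the object file is big-endian. */
uint32_t demo_read_u24(const unsigned char *p, int big_endian)
{
    assert(p != NULL);
    if (big_endian) {
        /* Most significant byte comes first. */
        return ((uint32_t)p[0] << 16) |
               ((uint32_t)p[1] << 8)  |
                (uint32_t)p[2];
    }
    /* Least significant byte comes first. */
    return  (uint32_t)p[0]        |
           ((uint32_t)p[1] << 8)  |
           ((uint32_t)p[2] << 16);
}
```

Given the bytes 0x01 0x02 0x03, the sketch yields 0x030201 for a little-endian object and 0x010203 for a big-endian one. Reading the wrong number of bytes at this step would produce a wrong index, and from there the kind of erroneous data described above.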
Changes 0.4.2 to 0.5.0
v0.5.0 released 2022-11-22. The handling of the .debug_abbrev data in libdwarf is now more cpu-efficient (measurably faster) so access to DIEs and attribute lists is faster. The changes are library-internal so are not visible in the API.
Corrects CU and TU indexes in the .debug_names (fast access) section to be zero-based. The code for that section was previously unusable as it did not follow the DWARF5 documentation.
dwarf_get_globals() now returns a list of Dwarf_Global names and DIE offsets whether such are defined in the .debug_names or .debug_pubnames section or both. Previously it only read .debug_pubnames.
A new function, dwarf_global_tag_number(), returns the DW_TAG of any Dwarf_Global that was derived from the .debug_names section.
Three new functions enable printing of the .debug_addr table: dwarf_debug_addr_table(), dwarf_debug_addr_by_index(), and dwarf_dealloc_debug_addr_table(). Actual use of the table(s) in .debug_addr is handled for you when an attribute invoking such is encountered (see DW_FORM_addrx, DW_FORM_addrx1 etc).
Added doc/libdwarf.dox to the distribution (left out by accident earlier).
Changes 0.4.1 to 0.4.2
0.4.2 released 2022-09-13. No API changes. No API additions. Corrected a bug in dwarf_tsearchhash.c where a delete request was accidentally assumed in all hash tree searches. It was invisible to libdwarf uses. Vulnerabilities DW202207-001 and DW202208-001 were fixed so error conditions when reading fuzzed object files can no longer crash libdwarf (the crash was possible but not certain before the fixes). In this release we believe neither libdwarf nor dwarfdump leak memory even when there are malloc failures. Any GNU debuglink or build-id section contents were not being properly freed (if malloced, meaning a compressed section) until 9 September 2022.
It is now possible to run the build sanity tests in all three build mechanisms (configure,cmake,meson) on linux, MacOS, FreeBSD, and mingw msys2 (windows). libdwarf README.md (or README) and README.cmake document how to do builds for each supported platform and build mechanism.
Changes 0.4.0 to 0.4.1
Reading a carefully corrupted DIE with form DW_FORM_ref_sig8 could result in reading memory outside any section, possibly leading to a segmentation violation or other crash. Fixed.
Reading a carefully corrupted .debug_pubnames/.debug_pubtypes could lead to reading memory outside the section being read, possibly leading to a segmentation violation or other crash. Fixed.
libdwarf accepts DW_AT_entry_pc in a compilation unit DIE as a base address for location lists (though it will prefer DW_AT_low_pc if present, per DWARF3). A particular compiler emits DW_AT_entry_pc in a DWARF2 object, requiring this change.
libdwarf adds dwarf_suppress_debuglink_crc() so that library callers can suppress crc calculations. This is useful to save the time of the crc when building and testing the same thing(s) over and over; it just loses a little checking. Additionally, libdwarf now properly handles objects with only GNU debug-id or only GNU debuglink.
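To give a feel for the time being saved, the sketch below computes a conventional CRC-32 (reflected polynomial 0xEDB88320), which is the style of checksum the GDB documentation describes for .gnu_debuglink. This is an assumption made for illustration: demo_crc32() is a hypothetical name, and libdwarf's own implementation may differ (for example, it may be table-driven). What matters is that every byte of the debug file must pass through the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration (not libdwarf code): bit-at-a-time
   CRC-32 over a buffer. Pass 0 as crc for the first chunk, and the
   previous return value for each following chunk. */
uint32_t demo_crc32(uint32_t crc, const unsigned char *buf, size_t len)
{
    size_t i = 0;
    int bit = 0;

    assert(buf != NULL || len == 0);
    crc = ~crc;
    for (i = 0; i < len; ++i) {
        crc ^= buf[i];
        for (bit = 0; bit < 8; ++bit) {
            /* If the low bit is set, shift and apply the polynomial;
               otherwise just shift. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

For the nine ASCII bytes "123456789" this returns 0xCBF43926, the standard CRC-32 check value. The work grows linearly with file size, which is why skipping the check helps when the same large objects are rebuilt and retested repeatedly.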
dwarfdump adds --show-args, an option to print its arguments and version. Without that new option the version and arguments are not shown. The output of -v (--version) is a little more complete.
dwarfdump adds --suppress-debuglink-crc, an option to avoid crc calculations when rebuilding and rerunning tests depending on GNU .note.gnu.buildid or .gnu_debuglink sections. The help text and the dwarfdump.1 man page are more specific in documenting --suppress-debuglink-crc and --no-follow-debuglink.
Changes 0.3.4 to 0.4.0
Removed the unused Dwarf_Error argument from dwarf_return_empty_pubnames() as the function can only return DW_DLV_OK. dwarf_xu_header_free() renamed to dwarf_dealloc_xu_header(). dwarf_gdbindex_free() renamed to dwarf_dealloc_gdbindex(). dwarf_loc_head_c_dealloc() renamed to dwarf_dealloc_loc_head_c().
dwarf_get_location_op_value_d() renamed to dwarf_get_location_op_value_c(), and 3 pointless arguments removed. The dwarf_get_location_op_value_d version and the three arguments were added for DWARF5 in libdwarf-20210528 but the change was a mistake. Now reverted to the previous version.
The .debug_names section interfaces have changed. Added dwarf_dnames_offsets() to provide details of facts useful in problems reading the section. dwarf_dnames_name() now does work and the interface was changed to make it easier to use.
Changes 0.3.3 to 0.3.4
Replaced the groff -mm based libdwarf.pdf with a libdwarf.pdf generated by doxygen and latex.
Added support for the meson build system.
Updated an include in libdwarfp source files. Improved doxygen documentation of libdwarf. Now 'make check -j8' and the like work correctly. Fixed a bug where reading a PE (Windows) object could fail for certain section virtual size values. Added initializers to two uninitialized local variables in dwarfdump source so a compiler warning cannot kill an --enable-wall build.
Added src/bin/dwarfexample/showsectiongroups.c so it is easy to see what groups are present in an object without all the other dwarfdump output.
Changes 20210528 to 0.3.3 (28 January 2022)
There were major revisions in going from date versioning to Semantic Versioning. Many functions were deleted and various functions changed their list of arguments. Many, many filenames changed. Include lists were simplified. Far too much changed to list here.