README.dem revision 7c478bd95313f5f23a4c958a745db2134aa03244
#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License, Version 1.0 only
# (the "License"). You may not use this file except in compliance
# with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#
/*ident "@(#)cls4:tools/demangler/README.dem 1.3" */
#ident "%Z%%M% %I% %E% SMI"
###############################################################################
#
# C++ source for the C++ Language System, Release 3.0. This product
# is a new release of the original cfront developed in the computer
# science research center of AT&T Bell Laboratories.
#
# Copyright (c) 1991 AT&T and UNIX System Laboratories, Inc.
# Copyright (c) 1984, 1989, 1990 AT&T. All Rights Reserved.
#
###############################################################################
Demangler Release Notes for 3.0.1
January 1992
INTRODUCTION
This is the first release of the new C++ external name demangler. It
was rewritten for a couple of reasons, notably the desire to separate
the unmangling and printing functions and the need to fully support
template names and nested class names.
In what follows it is assumed that the reader is familiar with page
122 and following of the Annotated C++ Reference Manual, where the
name mangling scheme is given.
INSTALLATION AND CUSTOMIZATION
The demangler has two files, dem.h and dem.c. dem.h is #included in
user programs. dem.c can be compiled either for a standalone program:
$ cc -DDEM_MAIN dem.c -o dem
or to produce an object to be linked with an application:
$ cc -c dem.c
...
$ cc appl.o dem.o
The standard main() driver program in dem.c will read from standard
input or from a set of command-line files, and unmangle names in
place, while passing all whitespace and invalid names through
unchanged.
There are a couple of customization options. CLIP_UNDER should be set
(#defined) at the top of dem.c if external names on your system have a
leading underscore added to them. This is common, for example on the
Sun SPARCstation. An #if has been added for the most common machine
types.
dem.c uses its own space management system both for speed and to avoid
having to call malloc(). SP_ALIGN is the alignment of blocks that are
allocated; 4 bytes is the default. If your system requires alignment
on other than a power-of-2 boundary, you will have to adjust the
function gs() to use the mod operator rather than the mask technique
that is used by default.
There is a constant MAXDBUF defined in dem.h. This is the maximum
size of buffer required for an unmangled name's data structure. The
4096 size is the worst-case setting for a maximum input mangled name
length of 256. If you routinely have names longer than this, you
should adjust MAXDBUF upwards accordingly.
Finally, there is a compile-time value EXPLAIN that can be set. If
turned on with -DEXPLAIN, the demanger will append a brief description
of the type of each input name to the demangled output.
USE OF THE STANDALONE DEMANGLER
The standalone program reads from standard input or from one or more
files and processes each line in turn. All whitespace and
non-alphanumeric characters are passed through unchanged. All other
characters are separated into names and given to the demangler. If
the demangler fails, the name is printed unchanged. Otherwise, the
unmangled name is printed.
The standalone demangler has an exit status of 0 unless one or more
input files could not be opened for reading; in that case the exit
status is the number of unopenable files.
USE OF THE DEMANGLER LIBRARY
The basic idea is to call the dem() function:
int ret;
char inbuf[1024];
DEM d;
char sbuf[MAXDBUF];
ret = dem(inbuf, &d, sbuf);
inbuf is the input name, d the data structure that dem() fills up, and
sbuf the internal buffer that the demangler writes into to allocate
its data structure.
dem() returns -1 on error, otherwise 0.
dem.h has comments describing each field in the data structures. The
data structures are somewhat complicated by the need to handle nested
types and function arguments which themselves are function pointers
with their own arguments.
To format this data structure, there are several functions:
dem_print - format a complete name
dem_printcl - format a class name
dem_printarg - format a function argument
dem_printarglist - format a function argument list
dem_printfunc - format a function name
dem_explain - return a string explaining a type
demangle - demangle in (char*) to out (char*)
AN EXAMPLE APPLICATION
This particular application reads from standard input and displays the
class name for each mangled name read, or "(none)" on errors and C
functions/data.
#include <stdio.h>
#include "dem.h"
main()
{
char sbuf[MAXDBUF];
DEM d;
int ret;
char buf[1024];
char buf2[1024];
while (gets(buf) != NULL) {
ret = dem(buf, &d, sbuf);
if (ret || d.cl == NULL) {
printf("%s --> (none)\n", buf);
}
else {
dem_printcl(d.cl, buf2);
printf("%s --> %s\n", buf, buf2);
}
}
}
TYPENAMES
The demangler handles mangled class typenames, whether they are
simple, nested, or template classes. For example:
A__pt__2_i --> A<int>
__Q2_1A1B --> A::B
LOCAL VARIABLES
The demangler also handles local variables of the form:
__nnnxxx
For example:
__2x --> x
USE OF THE OLD C++FILT PROGRAM
Many of the previously supported options to c++filt are no longer
supported. If you need to use them, you can build the old c++filt
using the source in the subdirectory "osrc". Note that we are
planning to eliminate them in a future release, so please send mail to
your C++ support person explaining what option you needed and why.
KNOWN BUGS AND AMBIGUITIES
1. "signed" and "volatile" encodings are not handled.
2. The encoding for nested classes as mentioned on page 123 of the
ARM is handled slightly differently in cfront; there is a "_" after
the digit after the "Q".
3. A nested class starting with "Q" sometimes has the length encoded
before it; the demangler handles either case.
4. The "Tnn" and "Nnnn" notations mentioned on page 124 are not fully
supported. It is assumed that the number of the designated argument
is less than or equal to 9. So if you have 11 or more arguments, and
you want to repeat argument 10 or greater, the demangler will reject
the encoded name.
5. All literal arguments to templates are assumed to be const. For
example, the non-const literal value "37" is encoded as "Ci".
6. Some compilers will add a gratuitous "_" before external names.
7. The grammar allows class names up to 999 characters. This is
considered important for handling templates.
THE GRAMMAR FOR EXTERNAL NAMES
start --> name
######################### COMPLETE NAMES #########################
name --> sti | std | ptbl | func | data | vtbl | cname3 | local
sti --> "__sti" "__" id
std --> "__std" "__" id
ptbl --> "__ptbl_vec" "__" id
func --> "__op" arg funcpost | id funcpost
funcpost --> "__" funcpost2 | "__" cname funcpost2
funcpost2 --> csv "F" arglist
csv --> "" | "C" | "S" | "V"
data --> id | id "__" cname
vtbl --> "__vtbl" "__" cname
local --> "__" num regid
######################### CLASS NAMES #########################
cname --> cname2 | nest
nest --> "Q" digit "_" cnamelist
cnamelist --> cname2 | cnamelist cname2
cname2 --> cnlen cnid
cname3 --> cnid | "__" nest
cnlen --> digit | digit digit | digit digit digit
cnid --> id | id "__pt__" cnlen "_" arglist
######################### ARGUMENT LISTS #########################
arglist --> arg | arglist arg
arg --> modlist arg2 | "X" modlist arg2 lit
modlist --> mod | modlist mod
mod --> "" | "U" | "C" | "V" | "S" | "P" | "R" | arr | mptr
arr --> "A" num "_"
mptr --> "M" cname
arg2 --> fund | cname | funcp | repeat1 | repeat2
fund --> "v" | "c" | "s" | "i" | "l" | "f" | "d" | "r" | "e"
funcp --> "F" arglist "_" arg
repeat1 --> "T" digit | "T" digit digit
repeat2 --> "N" digit digit | "N" digit digit digit
lit --> litnum | zero | litmptr | cnlen id | sptr
litnum --> "L" digit lnum | "L" digit digit "_" lnum
litmptr --> "LM" num "_" litnum "_" cnlen id
lnum --> num | "n" num
sptr --> cnlen id "__" cname
zero --> 0
######################### LOW LEVEL STUFF #########################
digit --> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
id --> special | regid
special --> "__ct" | "__pp" # etc.
regid --> letter | letter restid
restid --> letter | digit | restid letter | restid digit
letter --> "A"-"Z" | "a" - "z" | "_"
num --> digit | num digit