ctfmerge.c revision c79a74a8321729c8f50472db67e907324bace4e5
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright 2008 Sun Microsystems, Inc. All rights reserved.
* Use is subject to license terms.
*/
/*
* Given several files containing CTF data, merge and uniquify that data into
* a single CTF section in an output file.
*
* Merges can proceed independently. As such, we perform the merges in parallel
* using a worker thread model. A given glob of CTF data (either all of the CTF
* data from a single input file, or the result of one or more merges) can only
* be involved in a single merge at any given time, so the process decreases in
* parallelism, especially towards the end, as more and more files are
* consolidated, finally resulting in a single merge of two large CTF graphs.
* Unfortunately, the last merge is also the slowest, as the two graphs being
* merged are each the product of merges of half of the input files.
*
* The algorithm consists of two phases, described in detail below. The first
* phase entails the merging of CTF data in groups of eight. The second phase
* takes the results of Phase I, and merges them two at a time. This disparity
* is due to an observation that the merge time increases at least quadratically
* with the size of the CTF data being merged. As such, merges of CTF graphs
* newly read from input files are much faster than merges of CTF graphs that
* are themselves the results of prior merges.
*
* A further complication is the need to ensure the repeatability of CTF merges.
* That is, a merge should produce the same output every time, given the same
* input. In both phases, this consistency requirement is met by imposing an
* ordering on the merge process, thus ensuring that a given set of input files
* are merged in the same order every time.
*
* Phase I
*
* The main thread reads the input files one by one, transforming the CTF
* data they contain into tdata structures. When a given file has been read
* and parsed, it is placed on the work queue for retrieval by worker threads.
*
* Central to Phase I is the Work In Progress (wip) array, which is used to
* merge batches of files in a predictable order. Files are read by the main
* thread, and are merged into wip array elements in round-robin order. When
* the number of files merged into a given array slot equals the batch size,
* the merged CTF graph in that array is added to the done slot in order by
* array slot.
*
* For example, consider a case where we have five input files, a batch size
* of two, a wip array size of two, and two worker threads (T1 and T2).
*
* 1. The wip array elements are assigned initial batch numbers 0 and 1.
* 2. T1 reads an input file from the input queue (wq_queue). This is the
* first input file, so it is placed into wip[0]. The second file is
* similarly read and placed into wip[1]. The wip array slots now contain
* one file each (wip_nmerged == 1).
* 3. T1 reads the third input file, which it merges into wip[0]. The
* number of files in wip[0] is equal to the batch size.
* 4. T2 reads the fourth input file, which it merges into wip[1]. wip[1]
* is now full too.
* 5. T2 attempts to place the contents of wip[1] on the done queue
* (wq_done_queue), but it can't, since the batch ID for wip[1] is 1.
* Batch 0 needs to be on the done queue before batch 1 can be added, so
* T2 blocks on wip[1]'s cv.
* 6. T1 attempts to place the contents of wip[0] on the done queue, and
* succeeds, updating wq_lastdonebatch to 0. It clears wip[0], and sets
* its batch ID to 2. T1 then signals wip[1]'s cv to awaken T2.
* 7. T2 wakes up, notices that wq_lastdonebatch is 0, which means that
* batch 1 can now be added. It adds wip[1] to the done queue, clears
* wip[1], and sets its batch ID to 3. It signals wip[0]'s cv, and
* restarts.
*
* The above process continues until all input files have been consumed. At
* this point, a pair of barriers are used to allow a single thread to move
* any partial batches from the wip array to the done array in batch ID order.
* When this is complete, wq_done_queue is moved to wq_queue, and Phase II
* begins.
*
* Locking Semantics (Phase I)
*
* The input queue (wq_queue) and the done queue (wq_done_queue) are
* protected by separate mutexes - wq_queue_lock and wq_done_queue. wip
* array slots are protected by their own mutexes, which must be grabbed
* before releasing the input queue lock. The wip array lock is dropped
* when the thread restarts the loop. If the array slot was full, the
* array lock will be held while the slot contents are added to the done
* queue. The done queue lock is used to protect the wip slot cv's.
*
* The pow number is protected by the queue lock. The master batch ID
* and last completed batch (wq_lastdonebatch) counters are protected *in
* Phase I* by the done queue lock.
*
* Phase II
*
* When Phase II begins, the queue consists of the merged batches from the
* first phase. Assume we have five batches:
*
* Q: a b c d e
*
* Using the same batch ID mechanism we used in Phase I, but without the wip
* array, worker threads remove two entries at a time from the beginning of
* the queue. These two entries are merged, and are added back to the tail
* of the queue, as follows:
*
* Q: a b c d e # start
* Q: c d e ab # a, b removed, merged, added to end
* Q: e ab cd # c, d removed, merged, added to end
* Q: cd eab # e, ab removed, merged, added to end
* Q: cdeab # cd, eab removed, merged, added to end
*
* When one entry remains on the queue, with no merges outstanding, Phase II
* finishes. We pre-determine the stopping point by pre-calculating the
* number of nodes that will appear on the list. In the example above, the
* number (wq_ninqueue) is 9. When ninqueue is 1, we conclude Phase II by
* signaling the main thread via wq_done_cv.
*
* Locking Semantics (Phase II)
*
* The queue (wq_queue), ninqueue, and the master batch ID and last
* completed batch counters are protected by wq_queue_lock. The done
* queue and corresponding lock are unused in Phase II as is the wip array.
*
* Uniquification
*
* We want the CTF data that goes into a given module to be as small as
* possible. For example, we don't want it to contain any type data that may
* be present in another common module. As such, after creating the master
* tdata_t for a given module, we can, if requested by the user, uniquify it
* against the tdata_t from another module (genunix in the case of the SunOS
* kernel). We perform a merge between the tdata_t for this module and the
* tdata_t from genunix. Nodes found in this module that are not present in
* genunix are added to a third tdata_t - the uniquified tdata_t.
*
* Additive Merges
*
* In some cases, for example if we are issuing a new version of a common
* module in a patch, we need to make sure that the CTF data already present
* in that module does not change. Changes to this data would void the CTF
* data in any module that uniquified against the common module. To preserve
* the existing data, we can perform what is known as an additive merge. In
* this case, a final uniquification is performed against the CTF data in the
* previous version of the module. The result will be the placement of new
* and changed data after the existing data, thus preserving the existing type
* ID space.
*
* Saving the result
*
* When the merges are complete, the resulting tdata_t is placed into the
* output file, replacing the .SUNW_ctf section (if any) already in that file.
*
* The person who changes the merging thread code in this file without updating
* this comment will not live to see the stock hit five.
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <assert.h>
#include <synch.h>
#include <signal.h>
#include <libgen.h>
#include <string.h>
#include <errno.h>
#include <alloca.h>
#include "ctf_headers.h"
#include "ctftools.h"
#include "ctfmerge.h"
#include "traverse.h"
#include "memory.h"
#include "fifo.h"
#include "barrier.h"
#define MERGE_PHASE1_BATCH_SIZE 8
#define MERGE_PHASE1_MAX_SLOTS 5
#define MERGE_INPUT_THROTTLE_LEN 10
const char *progname;
static int dynsym;
int debug_level = DEBUG_LEVEL;
void
usage(void)
{
"Usage: %s [-fstv] -l label | -L labelenv -o outfile file ...\n"
" %s [-fstv] -l label | -L labelenv -o outfile -d uniqfile\n"
" %*s [-D uniqlabel] file ...\n"
" %s [-fstv] -l label | -L labelenv -o outfile -w withfile "
"file ...\n"
" %s -c srcfile destfile\n"
"\n"
" Note: if -L labelenv is specified and labelenv is not set in\n"
" the environment, a default value is used.\n",
}
static void
bigheap(void)
{
int sizes;
struct memcntl_mha mha;
/*
* First, get the available pagesizes.
*/
return;
return;
return;
sizes--;
/* set big to the largest allowed page size */
/*
* The largest page size is not a power of two for some
* inexplicable reason; return.
*/
return;
}
/*
* Now, align our break to the largest page size.
*/
return;
/*
* set the preferred page size for the heap
*/
}
static void
{
int startslot, i;
/*
* wip slots are cleared out only when maxbatchsz td's have been merged
* into them. We're not guaranteed that the number of files we're
* merging is a multiple of maxbatchsz, so there will be some partial
* groups in the wip array. Move them to the done queue in batch ID
* order, starting with the slot containing the next batch that would
* have been placed on the done queue, followed by the others.
* One thread will be doing this while the others wait at the barrier
* back in worker_thread(), so we don't need to worry about pesky things
* like locks.
*/
startslot = i;
break;
}
}
} else
}
}
}
static void
{
int num;
/*
* We're going to continually merge the first two entries on the queue,
* placing the result on the end, until there's nothing left to merge.
* At that point, everything will have been merged into one. The
* initial value of ninqueue needs to be equal to the total number of
* entries that will show up on the queue, both at the start of the
* phase and as generated by merges during the phase.
*/
while (num != 1) {
}
/*
* Move the done queue to the work queue. We won't be using the done
* queue in phase 2.
*/
}
static void
{
wq->wq_lastdonebatch++;
/* reset the slot for next use */
}
static void
{
} else {
slot->wip_nmerged++;
}
}
static void
{
int wipslotnum, pownum;
for (;;) {
/* on to phase 2 ... */
return;
}
&wq->wq_queue_lock);
}
/* there's work to be done! */
/* merge it into the right slot */
}
}
static void
{
int batchid;
for (;;) {
pthread_self());
}
return;
}
&wq->wq_queue_lock);
continue;
}
/* there's work to be done! */
/*
* merging is complete. place at the tail of the queue in
* proper order.
*/
&wq->wq_queue_lock);
}
wq->wq_ninqueue);
}
}
/*
* Main loop for worker threads.
*/
static void
{
}
}
/*
* Pass a tdata_t tree, built from an input file, off to the work queue for
* consumption by worker threads.
*/
static int
{
}
return (1);
}
/*
* This program is intended to be invoked from a Makefile, as part of the build.
* As such, in the event of a failure or user-initiated interrupt (^C), we need
* to ensure that a subsequent re-make will cause ctfmerge to be executed again.
* Unfortunately, ctfmerge will usually be invoked directly after (and as part
* of the same Makefile rule as) a link, and will operate on the linked file
* in place. If we merely exit upon receipt of a SIGINT, a subsequent make
* will notice that the *linked* file is newer than the object files, and thus
* will not reinvoke ctfmerge. The only way to ensure that a subsequent make
* reinvokes ctfmerge, is to remove the file to which we are adding CTF
* data (confusingly named the output file). This means that the link will need
* to happen again, but links are generally fast, and we can't allow the merge
* to be skipped.
*
* Another possibility would be to block SIGINT entirely - to always run to
* completion. The run time of ctfmerge can, however, be measured in minutes
* in some cases, so this is not a valid option.
*/
static void
handle_sig(int sig)
{
}
static void
terminate_cleanup(void)
{
return;
if (dounlink) {
}
}
static void
{
destfile);
}
}
static void
{
if (getenv("CTFMERGE_MAX_SLOTS"))
else
if (getenv("CTFMERGE_PHASE1_BATCH_SIZE"))
else
wq->wq_maxbatchsz);
if (getenv("CTFMERGE_INPUT_THROTTLE"))
else
wq->wq_nthreads);
wq->wq_next_batchid = 0;
for (i = 0; i < nslots; i++) {
}
wq->wq_nextpownum = 0;
wq->wq_alldone = 0;
wq->wq_nomorefiles = 0;
}
static void
{
int i;
sigemptyset(&sets);
for (i = 0; i < wq->wq_nthreads; i++) {
(void *(*)(void *))worker_thread, wq);
}
}
static void
{
int i;
for (i = 0; i < wq->wq_nthreads; i++) {
}
}
static int
{
}
/*
* Core work queue structure; passed to worker threads on thread creation
* as the main point of coordination. Allocate as a static structure; we
* could have put this into a local variable in main, but passing a pointer
* into your stack to another thread is fragile at best and leads to some
* hard-to-debug failure modes.
*/
static workqueue_t wq;
int
{
int write_fuzzy_match = 0;
int require_ctf = 0;
if (getenv("CTFMERGE_DEBUG_LEVEL"))
err = 0;
switch (c) {
case 'c':
docopy = 1;
break;
case 'd':
/* Uniquify against `uniqfile' */
break;
case 'D':
/* Uniquify against label `uniqlabel' in `uniqfile' */
break;
case 'f':
break;
case 'l':
/* Label merged types with `label' */
break;
case 'L':
/* Label merged types with getenv(`label`) */
break;
case 'o':
/* Place merged types in CTF section in `outfile' */
break;
case 't':
/* Insist *all* object files built from C have CTF */
require_ctf = 1;
break;
case 'v':
/* More debugging information */
verbose = 1;
break;
case 'w':
/* Additive merge with data from `withfile' */
break;
case 's':
/* use the dynsym rather than the symtab */
break;
default:
usage();
exit(2);
}
}
/* Validate arguments */
if (docopy) {
err++;
err++;
} else {
err++;
err++;
err++;
err++;
}
if (err) {
usage();
exit(2);
}
warning("Uniquification file %s couldn't be opened and "
"will be ignored.\n", uniqfile);
}
warning("With file %s couldn't be opened and will be "
"ignored.\n", withfile);
}
/*
* This is ugly, but we don't want to have to have a separate tool
* (yet) just for copying an ELF section with our specific requirements,
* so we shoe-horn a copier into ctfmerge.
*/
if (docopy) {
exit(0);
}
/* Sort the input files and strip out duplicates */
for (i = 0; i < nifiles; i++)
}
/* Make sure they all exist */
terminate("Some input files were inaccessible\n");
/* Prepare for the merge */
start_threads(&wq);
/*
* Start the merge
*
* We're reading everything from each of the object files, so we
* don't need to specify labels.
*/
&wq, require_ctf) == 0) {
/*
* If we're verifying that C files have CTF, it's safe to
* assume that in this case, we're building only from assembly
* inputs.
*/
if (require_ctf)
exit(0);
terminate("No ctf sections found to merge\n");
}
while (wq.wq_alldone == 0)
join_threads(&wq);
/*
* All requested files have been merged, with the resulting tree in
* mstrtd. savetd is the tree that will be placed into the output file.
*
* Regardless of whether we're doing a normal uniquification or an
* additive merge, we need a type tree that has been uniquified
* against uniqfile or withfile, as appropriate.
*
* If we're doing a uniquification, we stuff the resulting tree into
* outfile. Otherwise, we add the tree to the tree already in withfile.
*/
if (verbose || debug_level) {
}
} else
&reftd, require_ctf) == 0) {
terminate("No CTF data found in reference file %s\n",
reffile);
}
terminate("No room for additional types in master\n");
if (withfile) {
/*
* savetd holds the new data to be added to the withfile
*/
} else {
char uniqname[MAXPATHLEN];
}
} else {
/*
* No post processing. Write the merged tree as-is into the
* output file.
*/
}
return (0);
}