fmd_api.c revision f41150baf74bdaf964ddfe42d865d3c2380b3623
#pragma ident "%Z%%M% %I% %E% SMI"

/*
 * Table of configuration file variable types ops-vector pointers. We use this
 * to convert from the property description array specified by the module to an
 * array of fmd_conf_formal_t's. The order of this array must match the order
 * of #define values specified in <fmd_api.h> (i.e. FMD_TYPE_BOOL must be 0).
 * For now, the fmd_conf_list and fmd_conf_path types are not supported as we
 * do not believe modules need them and they would require more complexity.
 */

/*
 * fmd_api_vxerror() provides the engine underlying the fmd_hdl_[v]error() API
 * calls and the fmd_api_[v]error() utility routine defined below. The routine
 * formats the error, optionally associated with a particular errno code 'err',
 * and logs it as an ereport associated with the calling module. Depending on
 * other optional properties, we also emit a message to stderr and to syslog.
 */

/*
 * fmd_api_vxerror() counts as both an error of class EFMD_MODULE
 * as well as an instance of 'err' w.r.t. our internal bean counters.
 */

/*
 * Format the message using vsnprintf(). As usual, if the format has a
 * newline in it, it is printed alone; otherwise strerror() is added.
 */
err = 0;
/* err is not relevant in the message */

/*
 * Create an error event corresponding to the error, insert it into the
 * error log, and dispatch it to the fmd-self-diagnosis engine.
 */

/*
 * Similar to fmd_vdebug(), if the debugging switches are enabled we
 * echo the module name and message to stderr and/or syslog. Unlike
 * fmd_vdebug(), we also print to stderr if foreground mode is enabled.
 * We also print the message if a built-in module is aborting before
 * fmd has detached from its parent (e.g. default transport failure).
 */

/*
 * fmd_api_verror() is a wrapper around fmd_api_vxerror() for API subroutines.
 * It calls fmd_module_unlock() on behalf of its caller, logs the error, and
 * then aborts the API call and the surrounding module entry point by doing an
 * fmd_module_abort(), which longjmps to the place where we entered the module.
 */

/*
 * Common code for fmd_api_module_lock() and fmd_api_transport_impl(). This
 * code verifies that the handle is valid and associated with a proper thread.
 */

/*
 * If our TSD is not present at all, this is either a serious bug or
 * someone has created a thread behind our back and is using fmd's API.
 * We can't call fmd_api_error() because we can't be sure that we can
 * unwind our state back to an enclosing fmd_module_dispatch(), so we
 * must panic instead. This is likely a module design or coding error.
 */
"client handle %p from unknown thread\n", (void *)hdl);
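The message-formatting rule described for fmd_api_vxerror() (a newline-terminated message is printed alone; otherwise the strerror() text is appended) can be sketched as follows. This is a minimal illustration only: format_error() is a hypothetical helper, not an fmd routine, and it uses the standard strerror() rather than whatever lookup fmd uses internally.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of fmd). Formats a message into buf and
 * returns the errno value that remains relevant: 0 if the formatted message
 * already ended in a newline, otherwise 'err' after appending its text.
 */
static int
format_error(char *buf, size_t len, int err, const char *format, ...)
{
	va_list ap;
	size_t n;

	assert(buf != NULL && len > 0);

	va_start(ap, format);
	(void) vsnprintf(buf, len, format, ap);
	va_end(ap);

	n = strlen(buf);
	if (n > 0 && buf[n - 1] == '\n')
		return (0);	/* err is not relevant in the message */

	/* No trailing newline: append ": <error text>\n" (may truncate). */
	(void) snprintf(buf + n, len - n, ": %s\n", strerror(err));
	return (err);
}
```

A caller would pass the errno code alongside the format, for example format_error(buf, sizeof (buf), ENOENT, "open failed"), and use the return value to decide whether the error code should still be counted.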
/*
 * If our TSD refers to the root module and is a door server thread,
 * then it was created asynchronously at the request of a module but
 * is now using the module API as an auxiliary module thread. We reset
 * tp->thr_mod to the module handle so it can act as a module thread.
 */
"client handle %p is not valid\n", (void *)hdl);
"module has experienced an unrecoverable error\n");
/*
 * fmd_api_module_lock() is used as a wrapper around fmd_module_lock() and a
 * common prologue to each fmd_api.c routine. It verifies that the handle is
 * valid and owned by the current server thread, locks the handle, and then
 * verifies that the caller is performing an operation on a registered handle.
 * If any tests fail, the entire API call is aborted by fmd_api_error().
 */
"client handle %p has not been registered\n", (void *)hdl);
/*
 * Utility function for API entry points that accept fmd_case_t's. We cast cp
 * to fmd_case_impl_t and check to make sure the case is owned by the caller.
 */
"case %p is invalid or not owned by caller\n", (void *)cip);
/*
 * Utility function for API entry points that accept fmd_xprt_t's. We cast xp
 * to fmd_transport_t and check to make sure the transport is owned by the
 * caller. Note that we could make this check safer by actually walking mp's
 * transport list, but that requires holding the module lock and this routine
 * needs to be MT-hot w.r.t. auxiliary module threads. Ultimately any loadable
 * module can cause us to crash anyway, so we optimize for scalability over
 * safety here.
 */
"xprt %p is invalid or not owned by caller\n", (void *)xp);
/*
 * fmd_hdl_register() is the one function which cannot use fmd_api_error() to
 * report errors, because that routine causes the module to abort. Failure to
 * register is instead handled by having fmd_hdl_register() return an error to
 * the _fmd_init() function and then detecting no registration when it returns.
 * So we use this routine for fmd_hdl_register() error paths instead.
 */

/* empty function for use with unspecified module entry points */

/*
 * First perform some sanity checks on our input. The API version must
 * be supported by FMD and the handle can only be registered once by
 * the module thread to which we assigned this client handle. The info
 * provided for the handle must be valid and have the minimal settings.
 */

/*
 * Copy the module's ops vector into a local variable to account for
 * changes in the module ABI. Then if any of the optional entry points
 * are NULL, set them to nop so we don't have to check before calling.
 */

/*
 * Make two passes through the property array to initialize the formals
 * to use for processing the module's .conf file. In the first pass,
 * we validate the types and count the number of properties. In the
 * second pass we copy the strings and fill in the appropriate ops.
 */
"property %s uses invalid type %u\n",
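The two-pass walk over the property array described above can be sketched as follows. This is an illustration under stated assumptions, not fmd's code: prop_t, formal_t, PROP_TYPE_MAX, build_formals() and free_formals() are hypothetical names, the array is assumed to be terminated by a NULL name, and every default value is assumed to be non-NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define	PROP_TYPE_MAX	5	/* hypothetical: valid types are 0..5 */

/* Hypothetical stand-ins for the module's property array and the formals. */
typedef struct prop {
	const char *name;	/* a NULL name terminates the array */
	unsigned int type;
	const char *defval;	/* assumed non-NULL in this sketch */
} prop_t;

typedef struct formal {
	char *name;
	unsigned int type;
	char *defval;
} formal_t;

static char *
dup_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p != NULL)
		(void) memcpy(p, s, n);
	return (p);
}

static void
free_formals(formal_t *fp, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		free(fp[i].name);
		free(fp[i].defval);
	}
	free(fp);
}

/* Returns the number of formals stored in *outp, or -1 on failure. */
static int
build_formals(const prop_t *props, formal_t **outp)
{
	const prop_t *pp;
	formal_t *fp;
	int n = 0, i;

	assert(props != NULL && outp != NULL);
	*outp = NULL;

	/* Pass 1: validate the types and count the properties. */
	for (pp = props; pp->name != NULL; pp++) {
		if (pp->type > PROP_TYPE_MAX)
			return (-1);
		n++;
	}
	if (n == 0)
		return (0);

	if ((fp = calloc(n, sizeof (formal_t))) == NULL)
		return (-1);

	/* Pass 2: copy the strings and fill in each formal. */
	for (i = 0; i < n; i++) {
		fp[i].name = dup_string(props[i].name);
		fp[i].defval = dup_string(props[i].defval);
		fp[i].type = props[i].type;
		if (fp[i].name == NULL || fp[i].defval == NULL) {
			free_formals(fp, i + 1);
			return (-1);
		}
	}

	*outp = fp;
	return (n);
}
```

Splitting the work this way means an invalid type is rejected before anything is allocated, so the validation error path has nothing to clean up.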
/*
 * If this module came from an on-disk file, compute the name of the
 * corresponding .conf file and parse properties from it if it exists.
 */

/*
 * Look up the list of the libdiagcode dictionaries associated with the
 * module. If none were specified, use the value from daemon's config.
 * We only fail if the module specified an explicit dictionary.
 */

/*
 * Make a copy of the handle information and store it in mod_info. We
 * do not need to bother copying fmdi_props since they're already read.
 */

/*
 * Allocate an FMRI representing this module. We'll use this later
 * if the module decides to publish any events (e.g. list.suspects).
 */

/*
 * Any subscriptions specified in the conf file are now stored in the
 * corresponding property. Add all of these to the dispatch queue.
 */

/*
 * Unlock the module and restore any pre-existing module checkpoint.
 * If the checkpoint is missing or corrupt, we just keep going.
 */

/*
 * If an auxiliary thread exists for the specified module at unregistration
 * time, send it an asynchronous cancellation to force it to exit and then
 * join with it (we expect this to either succeed quickly or return ESRCH).
 * Once this is complete we can destroy the associated fmd_thread_t data.
 */

/*
 * If any transports are still open, they have send threads that are
 * using the module handle: shut them down and join with these threads.
 */

/*
 * If any auxiliary threads exist, they may be using our module handle,
 * and therefore could cause a fault as soon as we start destroying it.
 * Module writers should clean up any threads before unregistering: we
 * forcibly cancel any remaining auxiliary threads before proceeding.
 */

/*
 * Delete any cases associated with the module (UNSOLVED, SOLVED, or
 * CLOSE_WAIT) as if fmdo_close() has finished processing them.
 */

/*
 * Update the dictionary property in order to preserve the list of
 * pathnames and expand any % tokens in the path. Then retrieve the
 * new dictionary names from cpa_argv[] and open them one at a time.
 */
"failed to open dictionary %s for module %s",
"topo handle: %p\n", (void *)thp);
"bytes exceeds module memory limit (%llu)\n",
"property %s is not of int32 type\n",
name);
"property %s is not defined\n",
name);
"property %s is not of int64 type\n",
name);
"property %s is not defined\n",
name);
"property %s is not of string type\n",
name);
"property %s is not defined\n",
name);
"invalid flags 0x%x passed to fmd_stat_create\n",
flags);
"case is already solved or closed\n", cip->ci_uuid);
"case is already solved or closed\n", cip->ci_uuid);
"failed to add events from serd engine '%s'", name);
"%s: case is already solved or closed\n", cip->ci_uuid);
"%s: suspect event is missing a class\n", cip->ci_uuid);
/*
 * This function returns TRUE IFF all the ereports relating to a case are from
 * a PCI/PCIe device. If true, the rc_detector variable will be returned in DEV
 */
"[PCIE] Getting detector failed \n");

/* Find the RC PATH, this only works for dev scheme ereports */
"[PCIE] Could not get full RC path %s\n", rcpath);
/*
 * Path comparison function used in fmd_case_pci_undiagnosable
 */

/*
 * Populates an unknown pci defect/fault with a list of suspects. This is
 * temporary code until a generic way to do this for all "UNDIAG FAULTS"
 */

/* Only get dev scheme paths */
"[PCIE UNDIAG] Path is NULL");
for (i = 0; i < tbl_sz; i++) {
/*
 * Utility function for fmd_buf_* routines. If a case is specified, use the
 * case's ci_bufs hash; otherwise use the module's global mod_bufs hash.
 */
"cannot create '%s': buffer already exists\n", name);
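The hash selection described for the fmd_buf_* routines amounts to a single conditional. The sketch below is a simplified illustration: buf_hash_t, module_t, case_impl_t and buf_gethash() are hypothetical stand-ins, and only the member names ci_bufs and mod_bufs come from the comment above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for fmd's internal types. */
typedef struct buf_hash {
	int bh_count;
} buf_hash_t;

typedef struct module {
	buf_hash_t mod_bufs;	/* module-global buffers */
} module_t;

typedef struct case_impl {
	buf_hash_t ci_bufs;	/* buffers private to one case */
} case_impl_t;

/* Pick the per-case hash when a case is given, else the module's hash. */
static buf_hash_t *
buf_gethash(module_t *mp, case_impl_t *cip)
{
	assert(mp != NULL);
	return (cip != NULL ? &cip->ci_bufs : &mp->mod_bufs);
}
```

Routing every buffer routine through one selector like this keeps the case-versus-module decision in a single place.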
"associated with %s\n",
name,
cp ?
"case" :
"module");
"write to buf '%s' overflows buf size (%lu > %lu)\n",
"failed to create serd engine '%s': %s\n",
"serd engine '%s' does not exist\n",
name);
"failed to add record to serd engine '%s'",
name);
"serd engine '%s' does not exist\n",
name);
"serd engine '%s' does not exist\n",
name);
"auxiliary thread exceeds module thread limit (%u)\n",
"failed to create auxiliary thread");
"destroy itself (tid %u)\n",
tid);
"destroy an invalid thread (tid %u)\n",
tid);
/*
 * Wait for the specified thread to exit and then join with it. Since
 * the thread may need to make API calls in order to complete its work
 * we must sleep with the module lock unheld, and then reacquire it.
 */

/*
 * Since pthread_join() was called without the module lock held, if
 * multiple callers attempted to destroy the same auxiliary thread
 * simultaneously, one will succeed and the others will get ESRCH.
 * Therefore we silently ignore ESRCH but only allow the caller who
 * successfully joined with the auxiliary thread to destroy it.
 */
"failed to join with auxiliary thread %u\n", tid);
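The handling of the join result described above can be summarized as a three-way decision. The sketch below is illustrative only: join_action_t and join_action() are hypothetical names, and 'err' stands for the value a caller would have obtained from pthread_join() after dropping the module lock.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of a pthread_join() return value. */
typedef enum join_action {
	JOIN_DESTROY,	/* we joined the thread, so we destroy its state */
	JOIN_IGNORE,	/* ESRCH: another caller already joined it */
	JOIN_FAIL	/* any other error is reported */
} join_action_t;

static join_action_t
join_action(int err)
{
	assert(err >= 0);	/* pthread_join() returns 0 or an errno value */

	if (err == 0)
		return (JOIN_DESTROY);
	if (err == ESRCH)
		return (JOIN_IGNORE);
	return (JOIN_FAIL);
}
```

Only the JOIN_DESTROY caller frees the thread's data, so concurrent destroyers cannot free it twice.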
"timer delta %lld is not a valid interval\n",
delta);
"failed to install timer +%lld",
delta);
"id %ld is not a valid timer id\n",
id);
/*
 * If the timer has not fired (t != NULL), remove it from the timer
 * queue. If the timer has fired (t == NULL), we could be in one of
 * two situations: a) we are processing the timer callback or b)
 * the timer event is on the module queue awaiting dispatch. For a),
 * fmd_timerq_remove() will wait for the timer callback function
 * to complete and queue an event for dispatch. For a) and b),
 * we cancel the outstanding timer event from the module's dispatch
 * queue.
 */

/* Try to find the location label for this resource */
"invalid nvlist %p\n", (void *)nvl);
"invalid nvlist %p\n", (void *)nvl);
"fmd_nvl_fmri_present\n");
"invalid nvlist %p\n", (void *)nvl);
"fmd_nvl_fmri_unusable\n");
"invalid nvlist %p\n", (void *)nvl);
"invalid nvlist(s): %p, %p\n", (void *)n1, (void *)n2);
"fmd_nvl_fmri_contains\n");
"invalid nvlist(s): %p, %p\n", (void *)fmri, (void *)auth);
"NULL parameter specified to fmd_event_local\n");
"invalid transport flags 0x%x\n",
flags);
"cannot open write-only transport\n");
"transport exceeds module transport limit (%u)\n",
/*
 * Although this could be supported, it doesn't seem necessary or worth
 * the trouble. For now, just detect this and trigger a module abort.
 * If it is needed, transports should grow reference counts and a new
 * event type will need to be enqueued for the main thread to reap it.
 */
"fmd_xprt_close() cannot be called from fmdo_send()\n");
/*
 * fmd_xprt_recv() must block during startup waiting for fmd to globally
 * clear FMD_XPRT_DSUSPENDED. As such, we can't allow it to be called
 * from a module's _fmd_init() routine, because that would block
 * fmd from completing initial module loading, resulting in a deadlock.
 */
"fmd_xprt_post() cannot be called from _fmd_init()\n");
/*
 * Translate all FMRIs in the specified name-value pair list for the specified
 * FMRI authority, and return a new name-value pair list for the translation.
 * This function is the recursive engine used by fmd_xprt_translate(), below.
 */

/*
 * Count up the number of name-value pairs in 'nvl' and compute the
 * maximum length of a name used in this list for use below.
 */

/*
 * Store a snapshot of the name-value pairs in 'nvl' into nvps[] so
 * that we can iterate over the original pairs in the loop below while
 * performing arbitrary insert and delete operations on 'nvl' itself.
 */

/*
 * Now iterate over the snapshot of the name-value pairs. If we find a
 * value that is of type NVLIST or NVLIST_ARRAY, we translate that
 * object by either calling ourself recursively on it, or calling into
 * fmd_fmri_translate() if the object is an FMRI. We then rip out the
 * original name-value pair and replace it with the translated one.
 */
continue; /* array is zero-sized; skip it */

/*
 * If the first array nvlist element looks like an FMRI
 * then assume the other elements are FMRIs as well.
 * If any b[j]'s can't be translated, then EINVAL will
 * be returned from nvlist_add_nvlist_array() below.
 */
"no authority defined for transport %p\n", (void *)xp);
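The count, snapshot, then mutate pattern described for the translation engine can be illustrated without libnvpair. The sketch below uses a plain singly linked list of integers as a stand-in for the name-value pair list; node_t and every function here are hypothetical, and "translation" is simulated by replacing each even value with ten times that value.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
	int val;
	struct node *next;
} node_t;

/* Insert a new node at the head of the list; returns NULL on failure. */
static node_t *
list_push(node_t **headp, int val)
{
	node_t *np = malloc(sizeof (node_t));

	if (np == NULL)
		return (NULL);
	np->val = val;
	np->next = *headp;
	*headp = np;
	return (np);
}

/* Unlink and free the given node if it is on the list. */
static void
list_remove(node_t **headp, node_t *np)
{
	node_t **pp;

	for (pp = headp; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == np) {
			*pp = np->next;
			free(np);
			return;
		}
	}
}

static void
list_free(node_t **headp)
{
	while (*headp != NULL)
		list_remove(headp, *headp);
}

/* Replace every even value with value * 10; returns 0 or -1. */
static int
list_translate(node_t **headp)
{
	node_t **snap, *np;
	size_t n = 0, i = 0;

	/* Count the elements so the snapshot array can be sized. */
	for (np = *headp; np != NULL; np = np->next)
		n++;
	if (n == 0)
		return (0);
	if ((snap = malloc(n * sizeof (node_t *))) == NULL)
		return (-1);

	/* Snapshot the original elements before mutating the list. */
	for (np = *headp; np != NULL; np = np->next)
		snap[i++] = np;

	/* Walk the snapshot; the list itself may change freely. */
	for (i = 0; i < n; i++) {
		int val = snap[i]->val;

		if (val % 2 != 0)
			continue;
		list_remove(headp, snap[i]);	/* snap[i] is not used again */
		if (list_push(headp, val * 10) == NULL) {
			free(snap);
			return (-1);
		}
	}

	free(snap);
	return (0);
}
```

Because the loop walks the snapshot rather than the live list, the replacement elements inserted during the walk are never revisited, and removing an element cannot invalidate the iteration.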