utilities.c revision 20a1ae8aa548e5c0874f0cb213a5f242fe315a59
/*
 * Copyright 2006 Sun Microsystems, Inc. All rights reserved.
 * Use is subject to license terms.
 */

/* Copyright (c) 1983, 1984, 1985, 1986, 1987, 1988, 1989 AT&T */
/* All Rights Reserved */

/*
 * Copyright (c) 1980, 1986, 1990 The Regents of the University of California.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms are permitted
 * provided that: (1) source distributions retain this entire copyright
 * notice and comment, and (2) distributions including binaries display
 * the following acknowledgement: ``This product includes software
 * developed by the University of California, Berkeley and its contributors''
 * in the documentation or other materials provided with the distribution
 * and in all advertising materials mentioning features or use of this
 * software. Neither the name of the University nor the names of its
 * contributors may be used to endorse or promote products derived
 * from this software without specific prior written permission.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 */

#pragma ident "%Z%%M% %I% %E% SMI"
(
void)
printf(
"bad file type for acl I=%d: 0%o\n",
pfatal(
"INTERNAL ERROR: GOT TO reply() in preen mode");
 * We don't know what's going on, so don't potentially
 * make things worse by having errexit() write stuff
"\n%s: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.\n",
/* LINTED pointer difference won't overflow */
 * Malloc buffers and set up cache.
 * We're discarding the entire chain, so this isn't
 * technically necessary. However, it doesn't hurt
 * and lint's data flow analysis is much happier
 * (this prevents it from thinking there's a chance
 * of our using memory elsewhere after it's been released).
 * Manage a cache of directory blocks.
 * We're at the same logical level as getblk(), so if there
 * are any errors, we'll let our caller handle them.
 * Move the buffer to head of linked list if it isn't
 * It's not our buf, so if there are errors, let whoever
 * acquired it deal with the actual problem.
 * We're flushing the superblock, so make sure all the
 * ancillary bits go out as well.
pfatal(
"CANNOT %s: DISK BLOCK %lld: %s",
if (
reply(
"CONTINUE") == 0) {
* Were we using a backup superblock? if (
preen ||
reply(
"UPDATE STANDARD SUPERBLOCK") ==
1) {
 * Note that we only count cache-related reads.
 * Anything that called fsck_bread() or getblk()
 * directly is explicitly not cached, so they're not
(
void)
printf(
"cache missed %lld of %lld reads (%lld%%)\n",
* In our universe, nothing exists before the superblock, so * just pretend it's always zeros. This is the complement of * bwrite()'s ignoring write requests into that space. "WARNING: fsck_bread() passed blkno < %d (%lld)\n",
pwarn(
"THE FOLLOWING SECTORS COULD NOT BE READ:");
"WARNING: Attempt to write illegal blkno %lld on %s\n",
pwarn(
"THE FOLLOWING SECTORS COULD NOT BE WRITTEN:");
* Allocates the specified number of contiguous fragments. * It's arguable whether we should just fail, or instead * error out here. Since we should only ever be asked for * a single fragment or an entire block (i.e., sblock.fs_frag), * we'll fail out because anything else means somebody * changed code without considering all of the ramifications. errexit(
"allocblk() asked for %d frags. " "Legal range is 1 to %d",
* For each filesystem block, look at every possible starting * offset within the block such that we can get the number of * contiguous fragments that we need. This is a drastically * simplified version of the kernel's mapsearch() and alloc*(). * It's also correspondingly slower. * Is first fragment of candidate run available? * Are the rest of them available? * No, skip the known-unusable run. * Found what we need, so claim them. "allocblk: selected %d (in block %d), frags %d, size %d\n",
* Free a previously allocated block (
void)
printf(
"debug: freeing %d fragments starting at %d\n",
* Nothing in the return status has any relevance to how * we're using pass4check(), so just ignore it. * Fill NAMEBUF with a path starting in CURDIR for INO. Assumes * that the given buffer is at least MAXPATHLEN + 1 characters. (
void)
printf(
"debug: getpathname(curdir %d, ino %d)\n",
 * In the case of extended attributes, our
 * parent won't necessarily be a directory, so just
 * return what we've found with a prefix indicating
 * that it's an XATTR. Presumably our caller will
 * know what's going on and do something useful, like
 * work out the path of the parent and then combine
 * Can't use strcpy(), etc, because we've probably
 * already got some name information in the buffer and
 * the usual trailing \0 would lose it.
 * If curdir == ino, need to get a handle on .. so we
 * can search it for ino's name. Otherwise, just search
 * the given directory for ino. Repeat until out of space
 * or a full path has been built.
 * To get this far, id_parent must have the inode
 * number for `..' in it. By definition, that's got
 * to be a directory, so search it for the inode of
 * Prepend to what we've accumulated so far. If
 * there's not enough room for even one more path element
 * (of the worst-case length), then bail out.
 * Corner case for a looped-to-itself directory.
 * Climb one level of the hierarchy. In other words,
 * the current .. becomes the inode to search for and
 * its parent becomes the directory to search in.
 * If we hit a discontinuity in the hierarchy, indicate it by
 * prefixing the path so far with `?'. Otherwise, the first
 * character will be `/' as a side-effect of the *--cp above.
 * The special case is to handle the situation where we're
 * trying to look something up in UFSROOTINO, but didn't find
 * The invariants being used for buffer integrity are:
 * - namebuf[] is terminated with \0 before anything else
 * - cp is always <= the last element of namebuf[]
 * - the new path element is always stored at the
 *   beginning of namebuf[], and is no more than MAXNAMLEN-1
 * - cp is decremented by the number of characters in
 * - if, after the above accounting for the new element's
 *   size, there is no longer enough room at the beginning of
 *   namebuf[] for a full-sized path element and a slash,
 *   terminate the loop. cp is in the range
 *   &namebuf[0]..&namebuf[MAXNAMLEN - 1]
/* LINTED per the above discussion */
 * When preening, allow a single quit to signal
 * a special exit after filesystem checks complete
 * so that reboot sequence may be interrupted.
(
void)
printf(
"returning to single-user after filesystem check\n");
* determine whether an inode should be fixed. if (
reply(
"SALVAGE") == 0) {
 * An unexpected inconsistency occurred.
 * Die if preening, otherwise just print message and continue.
"%s: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.\n",
* We're exiting, it doesn't really matter that our * caller doesn't get to call va_end(). * Pwarn just prints a message when not preening, * or a warning (preceded by filename) when preening. * Like sprintf(), except the buffer is dynamically allocated * and returned, instead of being passed in. A pointer to the * buffer is stored in *RET, and FMT is the usual format string. * The number of characters in *RET (excluding the trailing \0, * to be consistent with the other *printf() routines) is returned. * Solaris doesn't have asprintf(3C) yet, unfortunately. errexit(
"Out of memory in asprintf\n");
* So we can take advantage of kernel routines in ufs_subr.c. (
void)
printf(
"INTERNAL INCONSISTENCY:");
* Check to see if unraw version of name is already mounted. * Updates devstr with the device name if devstr is not NULL * and str_size is positive. * It's mounted. With or without write access? errexit(
"fsck: memory allocation failure: %s",
* Check to see if name corresponds to an entry in vfstab, and that the entry * does not have option ro. (
void)
printf(
"WARNING: inconsistencies detected on %s filesystem %s\n",
 * Carefully and transparently update the clean flag.
 * `iscorrupt' has to be in its final state before this is called.
 * set fsclean to its appropriate value
 * If ufs log is not okay, note that we need to clear it.
 * if necessary, update fs_clean and fs_state
 * There can be two discrepancies here. A) The superblock
 * shows no largefiles but we found some while scanning.
 * B) The superblock indicates the presence of largefiles,
 * but none are present. Note that if preening, the superblock
"** largefile count=%d, fs.fs_flags=%x, flags_ok %d\n",
* If fs is unchanged, do nothing. "updateclean(unchanged): unlock(LOCKFS_ULOCK) failed\n");
* if user allows, update superblock state "superblock: flags 0x%x logbno %d clean %d reclaim %d state 0x%x\n",
"calculated: flags 0x%x logbno %d clean %d reclaim %d state 0x%x\n",
(
reply(
"FILE SYSTEM STATE IN SUPERBLOCK IS WRONG; FIX") == 0))
* if superblock can't be written, return * Read private copy of superblock, update clean flag, and write it. (
void)
printf(
"COULD NOT SEEK TO SUPERBLOCK AT %lld: %s\n",
(
void)
printf(
"COULD NOT SEEK TO SUPERBLOCK AT %lld: %s\n",
* If we had to use -b to grab an alternate superblock, then we * likely had to do so because of unacceptable differences between * the main and alternate superblocks. So, we had better update * the alternate superblock as well, or we'll just fail again * the next time we attempt to run fsck! "updateclean(changed): unlock(LOCKFS_ULOCK) failed\n");
(
void)
printf(
"COULD NOT %s SUPERBLOCK AT %d: %s\n",
(
void)
printf(
"COULD NOT %s SUPERBLOCK AT %d: EOF\n",
(
void)
printf(
"SHORT %s SUPERBLOCK AT %d: %u out of %u bytes\n",
if (
mountp ==
NULL)
/* theoretically a can't-happen */
 * From here on, must `goto out' to avoid memory leakage.
 * lint believes that the ioctl() (or any other function
 * taking lfp as an arg) could free lfp. This is not the
 * Given a name which is known to be a directory, see if it appears
 * in the vfstab. If so, return the entry's block (special) device
 * Given a name which is known to be a directory, see if it appears
 * in the mnttab. If so, return the entry's block (special) device
 * Search for mount point and/or special device in the given file.
 * The first matching entry is returned.
 * If an entry is found and str_size is greater than zero, then
 * up to str_size bytes of the special device name from the entry
/* LINTED ``assigned value never used'' */ \
errexit(
"do_errorlock(%s, %d): unallocated elock_combuf\n",
errexit(
"Couldn't alloc memory for temp. lock status buffer\n");
errexit(
"do_errorlock(%s, %d): lockfs status unallocated\n",
* Note that if it is error-locked, we won't get an * error back if we try to error-lock it again. "%s [pid:%d fsck start:%02d/%02d/%02d %02d:%02d:%02d",
"%s, done:%02d/%02d/%02d %02d:%02d:%02d]",
pwarn(
"do_errorlock: unlock failed: %s\n",
pwarn(
"Another fsck active?\n");
iscorrupt = 0;
/* don't go away mad, just go away */ pwarn(
"do_errorlock(lock_type:%d, %s) failed: %s\n",
* Shadow inode support. To register a shadow with a client is to note * that an inode (the client) refers to the shadow. errexit(
"newshadowclient: cannot malloc shadow client");
errexit(
"newshadowclient: cannot malloc client array");
* Already have a record for this shadow? * It's a new shadow, add it to the list errexit(
"registershadowclient: cannot malloc");
* Locate and discard a shadow. * Do we have a record for this shadow? * First, pull it off the list, since we know there * shouldn't be any future references to this one. * Discard all memory used to track clients of a shadow. * Allocate more buffer as need arises but allocate one at a time. * This is done to make sure that fsck does not exit with error if it * needs more buffer to complete its task. * We length-limit in both unrawname() and rawname() to avoid * overflowing our arrays or those of our naive, trusting callers. * Not reporting under debug, as the allocation isn't * reported by getfullblkname. The idea is that we * Not reporting under debug, as the allocation isn't * reported by getfullblkname. The idea is that we * Make sure that a cg header looks at least moderately reasonable. * We want to be able to trust the contents enough to be able to use * the standard accessor macros. So, besides looking at the obvious * such as the magic number, we verify that the offset field values * are properly aligned and not too big or small. * Returns a NULL pointer if the cg is sane enough for our needs, else * a dynamically-allocated string describing all of its faults. /* lint doesn't think realloc() understands NULLs */ \
errexit(
"Out of memory in cg_sanity"); \
"BAD CG MAGIC NUMBER (0x%x should be 0x%x)\n",
"WRONG CG NUMBER (%d should be %d)\n",
"BLOCK TOTALS OFFSET %d NOT FOUR-BYTE ALIGNED\n",
"FREE BLOCK POSITIONS TABLE OFFSET %d NOT TWO-BYTE ALIGNED\n",
"IMPOSSIBLE NUMBER OF CYLINDERS IN GROUP (%d is less than 1)\n",
"IMPOSSIBLE NUMBER OF CYLINDERS IN GROUP (%d is greater than %d)\n",
"INCORRECT NUMBER OF INODES IN GROUP (%d should be %d)\n",
"INCORRECT NUMBER OF DATA BLOCKS IN GROUP (%d should be %d)\n",
"IMPOSSIBLE BLOCK ALLOCATION ROTOR POSITION " "(%d should be at least 0 and less than %d)\n",
"IMPOSSIBLE FRAGMENT ALLOCATION ROTOR POSITION " "(%d should be at least 0 and less than %d)\n",
"IMPOSSIBLE INODE ALLOCATION ROTOR POSITION " "(%d should be at least 0 and less than %d)\n",
"INCORRECT BLOCK TOTALS OFFSET (%d should be %d)\n",
"BAD FREE BLOCK POSITIONS TABLE OFFSET (%d should %d)\n",
"INCORRECT USED INODE MAP OFFSET (%d should be %d)\n",
"INCORRECT FREE FRAGMENT MAP OFFSET (%d should be %d)\n",
"END OF HEADER POSITION INCORRECT (%d should be %d)\n",
* This is taken from mkfs, and is what is used to come up with the * original values for a struct cg. This implies that, since these * are all constants, recalculating them now should give us the same * thing as what's on disk. /* LINTED pointer difference won't overflow */ * Corrects all fields in the cg that can be done with the available * This is not used by the kernel, so it's pretty * harmless if it's wrong. * For the rotors, any position's valid, so pick the one we know * For btotoff and boff, if they're misaligned they won't * match the expected values, so we're catching both cases * here. Of course, if any of these are off, it seems likely * that the tables really won't be where we calculate they * We know there was at least one correctable problem, * or else we wouldn't have been called. So instead of * marking the buffer dirty N times above, just do it * Read errors will return zeros, which will cause us * to do nothing harmful, so don't need to handle it. * Does it look like a log allocation table? /* LINTED pointer cast is aligned */ for (j = 0; j <
nfno; ++j, ++
fno) {
* Invoke the callback first, so that pass1 can * mark the log blocks in-use. Then, if any * subsequent pass over the log shows us that a * block got freed (say, it was also claimed by * an inode that we cleared), we can safely declare * Simple initializer for inodesc structures, so users of only a few * fields don't have to worry about getting the right defaults for * Most fields should be zero, just hit the special cases. * Compare routine for tsearch(C) to use on ino_t instances. pfatal(
"SETTING DIRTY FLAG IN READ_ONLY MODE\n");
 * Needed because calcsb() needs to use mkfs to work out what the
 * superblock should be, and mkfs insists on being told how many
 * Error handling assumes we're never called while preening.
 * XXX This should be extracted into a ../ufslib.{c,h},
 * in the same spirit as ../../fslib.{c,h}. Once that is
 * done, both fsck and newfs should be modified to link
 * get_device_size() determines the actual size of the
 * device, and also the disk's attributes, such as geometry.
pwarn(
"%s: Unable to read Disk geometry",
disk);
* Adjust maxcontig by the device's maxtransfer. If maxtransfer * information is not available, default to the min of a MB and * If we cannot get the maxphys value, default * to ufs_maxmaxphys (MB). * Figure out how big the partition we're dealing with is. /* it might be an EFI label */ * Since both attempts to read the label failed, we're * going to fall back to a brute force approach to * determining the device's size: see how far out we can * perform reads on the device. pwarn(
"%s: unknown error %d accessing VTOC",
 * In the vtoc struct, p_size is a 32-bit signed quantity.
 * In the dk_gpt struct (efi's version of the vtoc), p_size
 * is an unsigned 64-bit quantity. By casting the vtoc's
 * psize to an unsigned 32-bit quantity, it will be copied
 * to 'slicesize' (an unsigned 64-bit diskaddr_t) without
 * brute_force_get_device_size
 * Determine the size of the device by seeing how far we can
 * read. Doing an llseek( , , SEEK_END) would probably work
 * in most cases, but we've seen at least one third-party driver
 * which doesn't correctly support the SEEK_END option when the
 * device is greater than a terabyte.
 * First, see if we can read the device at all, just to
 * eliminate errors that have nothing to do with the
return (0); /* can't determine size */
 * Now, go sequentially through the multiples of 4TB
 * to find the first read that fails (this isn't strictly
 * the most efficient way to find the actual size if the
 * size really could be anything between 0 and 2**64 bytes.
 * We expect the sizes to be less than 16 TB for some time,
 * so why do a bunch of reads that are larger than that?
 * However, this algorithm *will* work for sizes of greater
 * than 16 TB. We're just not optimizing for those sizes.)
 * XXX lint uses 32-bit arithmetic for doing flow analysis.
 * We're using > 32-bit constants here. Therefore, its flow
 * analysis is wrong. For the time being, ignore complaints
 * from it about the body of the for() being unreached.
 * XXX Same lint flow analysis problem as above.
 * We now know that the size of the device is less than
 * min_fail and greater than or equal to max_succeed. Now
 * keep splitting the difference until the actual size in
 * sectors is known. We also know that the difference
 * between max_succeed and min_fail at this time is
 * 4 * SECTORS_PER_TERABYTE, which is a power of two, which
 * simplifies the math below.
/* the size is the last successfully read sector offset plus one */
 * Adds the given inode to the orphaned-directories list, limbo_dirs.
 * Assumes that the caller has set INCLEAR in the inode's statemap[]
 * With INCLEAR set, the inode will get ignored by passes 2 and 3,
 * meaning it's effectively an orphan. It needs to be noted now, so
 * it will be remembered in pass 4.
errexit(
"add_orphan_dir: out of memory");
* Remove an inode from the orphaned-directories list, presumably * because it's been cleared. * log_setsum() and log_checksum() are equivalent to lufs.c:setsum()
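The brute_force_get_device_size comments earlier in this file describe a two-phase probe: step through multiples of 4TB until a read fails, then keep splitting the interval between the last successful and the first failing offset. The sketch below illustrates that search only and is not the routine from this file. The device read is replaced by a caller-supplied callback, the names `sketch_device_size`, `sketch_probe_t`, `sketch_fake_probe` and `SKETCH_STEP` are hypothetical, and 512-byte sectors are assumed. It also assumes the probe eventually fails, so the stepping loop terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 4 TB expressed in 512-byte sectors (assumed sector size): 2^42 / 2^9. */
#define	SKETCH_STEP	(UINT64_C(1) << 33)

/* Returns nonzero if the given sector can be read (hypothetical hook). */
typedef int (*sketch_probe_t)(uint64_t sector, void *arg);

static uint64_t
sketch_device_size(sketch_probe_t can_read, void *arg)
{
	uint64_t max_succeed, min_fail, mid;

	assert(can_read != NULL);
	if (!can_read(0, arg))
		return (0);	/* can't determine size */

	/* Phase 1: walk the multiples of 4TB to the first failing read. */
	max_succeed = 0;
	min_fail = SKETCH_STEP;
	while (can_read(min_fail, arg)) {
		max_succeed = min_fail;
		min_fail += SKETCH_STEP;
	}

	/* Phase 2: split the difference until the two bounds are adjacent. */
	while (min_fail - max_succeed > 1) {
		mid = max_succeed + (min_fail - max_succeed) / 2;
		if (can_read(mid, arg))
			max_succeed = mid;
		else
			min_fail = mid;
	}

	/* The size is the last successfully read sector offset plus one. */
	return (max_succeed + 1);
}

/* Test double: pretends the device holds *arg sectors. */
static int
sketch_fake_probe(uint64_t sector, void *arg)
{
	return (sector < *(const uint64_t *)arg);
}
```

Because the starting interval is a power of two wide, the second phase needs at most 33 probes in this sketch.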