/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each file.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */

/*
 * Copyright (c) 2005, 2012, Oracle and/or its affiliates. All rights reserved.
 */

/*
 * Functions to convert between a list of vdevs and an nvlist representing the
 * configuration.  Each entry in the list can be one of:
 *
 *	disk=(path=..., devid=...)
 *
 * While the underlying implementation supports it, group vdevs cannot contain
 * other group vdevs.  All userland verification of devices is contained within
 * this file.  If successful, the nvlist returned can be passed directly to the
 * kernel; we've done as much verification as possible in userland.
 *
 * Hot spares are a special case, and passed down as an array of disk vdevs, at
 * the same level as the root of the vdev tree.
 *
 * Functions exported by this file:
 *
 * 'zpool_make_root_vdev' - the function performs several passes:
 *
 *	1. Construct the vdev specification.  Performs syntax validation and
 *	   makes sure each device is valid.
 *	2. Check for devices in use.  Using libdiskmgt, make sure that no
 *	   devices are also in use.  Some can be overridden using the 'force'
 *	   flag, others cannot.
 *	3. Check for replication errors if the 'force' flag is not specified.
 *	   Validates that the replication level is consistent across the
 *	   entire pool.
 *	4. Label any whole disks with an EFI label.
 */
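As an illustration of the vdev-list grammar described above, the following is a hypothetical sketch (not the libzfs implementation) of how an argument list such as "mirror c1t0d0 c2t0d0 mirror c3t0d0 c4t0d0" groups into top-level vdevs: a grouping keyword opens a new top-level vdev that absorbs the bare names after it, while a bare name seen before any grouping keyword is a single-disk top-level vdev. The function names here are illustrative only.

```c
#include <string.h>

/* Return nonzero if 's' is a grouping keyword in the vdev grammar. */
static int
is_grouping_kw(const char *s)
{
	return (strcmp(s, "mirror") == 0 || strncmp(s, "raidz", 5) == 0 ||
	    strcmp(s, "spare") == 0 || strcmp(s, "log") == 0);
}

/* Count the top-level vdevs described by argv[]. */
static int
count_toplevels(int argc, char *argv[])
{
	int toplevels = 0, in_group = 0;

	for (int i = 0; i < argc; i++) {
		if (is_grouping_kw(argv[i])) {
			/* 'log' is a flag, not a real grouping device */
			if (strcmp(argv[i], "log") != 0)
				toplevels++;
			in_group = 1;
		} else if (!in_group) {
			toplevels++;	/* plain disk at top level */
		}
	}
	return (toplevels);
}
```

Note that once a grouping keyword appears, subsequent bare names join the most recent group rather than forming new top-level vdevs, matching the "subsequent arguments are its leaves" rule later in this file.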
/*
 * 'zpool_split_root_vdev' - the function performs several passes:
 *
 *	1. Construct the vdev specification.  Performs syntax validation and
 *	   makes sure each device is valid.
 *	2. Check whether the target device is eligible for splitting.
 *	3. Initiate the actual split.
 */

/*
 * For any given vdev specification, we can have multiple errors.  The
 * vdev_error() function keeps track of whether we have seen an error yet, and
 * prints out a header if it's the first error we've seen.
 */
	    "use -f to override the following errors:"));
2N/A "the following errors must be manually repaired:"));
2N/A "vdev verification failed"));
/*
 * ... /dev/dsk.  Don't bother printing an error message in this case.
 */

/*
 * Validate a device, passing the bulk of the work off to libdiskmgt.
 */

/*
 * If we're given a whole disk, ignore overlapping slices since we're
 * about to label it anyway.
 */

	/* dm_isoverlapping returned -1 */

	/* libdiskmgt's devcache only handles physical drives */

/*
 * Validate a whole disk.  Iterate over all slices on the disk and make sure
 * that none is in use by calling check_slice().
 */

/*
 * Get the drive associated with this disk.  This should never fail,
 * because we already have an alias handle open for the device.
 */

/*
 * It is possible that the user has specified a removable media drive,
 * and the media is not present.
 */

/*
 * Iterate over all slices and report any errors.  We don't care about
 * overlapping slices because we are using the whole disk.
 */

/*
 * Validate a device.
 */

/*
 * For whole disks, libdiskmgt does not include the leading dev path.
 */

/*
 * Check that a file is valid.  All we can do in this case is check that it's
 * not in use by another pool, and not in use by swap.
 */
	    "by swap. Please see swap(1M).\n"),
	    file);
/*
 * Allow hot spares to be shared between pools.
 */
	    "for ZFS pool '%s'. Please see "
	    "'%s'. Please see zpool(1M)\n"),
/*
 * By "whole disk" we mean an entire physical disk (something we can
 * label, toggle the write cache on, etc.) as opposed to the full
 * capacity of a pseudo-device such as lofi or did.  We act as if we
 * are labeling the disk, which should be a pretty good test of whether
 * it's a viable device or not.  Returns B_TRUE if it is and B_FALSE
 * if not.
 */

/*
 * Create a leaf vdev.  Determine if this is a file or a device.  If it's a
 * device, fill in the device id to make a complete nvlist.  Valid forms for a
 * leaf vdev are:
 *
 *	/xxx		Full path to file
 *
 * It is sufficient to only record the failure cause as we fail immediately,
 * and the caller will issue the entire error message with zfs_error().
 */

/*
 * If 'arg' is a symlink, read the link and try to convert it into
 * a whole-disk ctd name.  This is done to support references to a
 * /devices path.  If the di_cro interfaces can convert the link value
 * into a disk ctd name, then we proceed with that ctd name.
 *
 * NOTE: To insulate us from new cro fields, we use the string-based
 * interfaces.
 */

/*
 * Determine what type of vdev this is, and put the full path into
 * 'path'.  We detect whether this is a device or file afterwards by
 * checking the st_mode of the file.
 */

/*
 * Complete device or file path.  Exact type is determined by
 * examining the file descriptor afterwards.
 */

/*
 * This may be a short path for a device, or it could be total
 * gibberish.  Check to see if it's a known device in
 * /dev/dsk/.  As part of this check, see if we've been given
 * an entire disk (minus the slice number).
 */

/*
 * If we got ENOENT, then the user gave us
 * gibberish, so try to direct them with a
 * reasonable error message.  Otherwise,
 * regurgitate strerror() since it's the best we
 * can do.
 */
	    "cannot open '%s': no such device in %s\n"
	    "must be a full path or shorthand device "
	    "cannot open '%s': %s"),
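The shorthand-device handling above can be sketched as follows. This is an illustrative helper, not the libzfs code: names that are already absolute paths are used directly, while shorthand names are looked up under a search directory (assumed here to be "/dev/dsk").

```c
#include <stdio.h>

#define DISK_ROOT	"/dev/dsk"	/* assumed shorthand search directory */

/*
 * Build the candidate full path for a device argument.  An argument
 * beginning with '/' is treated as a complete device or file path;
 * anything else is tried as a shorthand name under DISK_ROOT.
 */
static void
resolve_shorthand(const char *arg, char *path, size_t len)
{
	if (arg[0] == '/')
		(void) snprintf(path, len, "%s", arg);
	else
		(void) snprintf(path, len, "%s/%s", DISK_ROOT, arg);
}
```

The caller would then stat() or open() the resulting path; an ENOENT failure on a shorthand name is what produces the "no such device in /dev/dsk" message above.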
/*
 * Determine whether this is a device or a file.
 */
	    "must be a block device or regular file"),
	    path);
/*
 * Finally, we have the complete device or file, and we know that it is
 * acceptable to use.  Construct the nvlist to describe this vdev.  All
 * vdevs have a 'path' element, and devices also have a 'devid' element.
 */

/*
 * For a whole disk, defer getting its devid until after labeling it.
 */

/*
 * Get the devid for the device.
 */

/*
 * Go through and verify the replication level of the pool is consistent.
 * Performs the following checks:
 *
 *	For the new spec, verifies that devices in mirrors and raidz are the
 *	same size.
 *
 *	If the current configuration already has inconsistent replication
 *	levels, ignore any other potential problems in the new spec.
 *
 *	Otherwise, make sure that the current spec (if there is one) and the
 *	new spec have consistent replication levels.
 */

/*
 * Check one top-level vdev and its children to ensure we don't mix devices
 * and files, and that the leaf vdevs are of approximately the same size.
 */

/*
 * This is a mirror or RAID-Z vdev.  Go through and make
 * sure the contents are all the same (files vs. disks),
 * keeping track of the number of elements in the process.
 *
 * We also check that the size of each vdev (if it can
 * be determined) is the same.
 */

/*
 * The 'reported' variable indicates that we've
 * already reported an error for this spec, so don't
 * bother doing it again.
 */

/*
 * If this is a replacing or spare vdev, then
 * get the real first child of the vdev.
 */

/*
 * If devices are mixed with files, report it as an error.
 */
	    "mismatched replication level: %s contains both "

/*
 * According to stat(2), the value of 'st_size' is undefined
 * for block devices and character devices.  But there is no
 * effective way to determine the real size in userland.
 * Instead, we'll take advantage of an implementation detail of
 * spec_size().  If the device is currently open, then we
 * (should) return a valid size.
 */
/*
 * If we still don't get a valid size (indicated by a size of 0
 * or MAXOFFSET_T), then ignore this device altogether.
 */

/*
 * Also make sure that devices and slices have a consistent
 * size.  If they differ by a significant amount then report
 * an error.
 */
	    "%s contains devices of different sizes\n"),
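The "approximately the same size" check above can be sketched as follows. This is a hypothetical illustration: the 5% tolerance and the function name are assumptions, not the threshold libzfs actually uses, and a size of 0 stands in for "could not be determined".

```c
#include <stdint.h>

/*
 * Return 1 if all known leaf sizes are within roughly 5% of each
 * other, 0 otherwise.  Devices whose size could not be determined
 * (recorded here as 0) are ignored, mirroring the comment above.
 */
static int
sizes_consistent(const uint64_t *sizes, int n)
{
	uint64_t min = 0, max = 0;

	for (int i = 0; i < n; i++) {
		if (sizes[i] == 0)	/* unknown size: skip this device */
			continue;
		if (min == 0 || sizes[i] < min)
			min = sizes[i];
		if (sizes[i] > max)
			max = sizes[i];
	}
	if (min == 0)
		return (1);	/* no known sizes to compare */
	return (max - min <= max / 20);	/* within ~5% */
}
```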
	/* success if we haven't reported an error */

/*
 * Check the replication levels in the existing pool's config and report
 * an error if there is a mismatch.
 */
	    "mismatched replication level: both %s "
	    "and %s vdevs are present\n"),
	    "mismatched replication level: both %llu "
	    "and %llu device parity %s "
	    "vdevs are present\n"),
	    "mismatched replication level: both "
	    "%llu-way and %llu-way %s vdevs are "

/*
 * Check the replication levels between the existing config and the
 * config that the user wants to have.  This differs from compare_rep_levels()
 * only in the strings we output.
 */
	    "mismatched replication level: pool uses %s and new "
	    "mismatched replication level: pool uses %llu device "
	    "parity and new vdev uses %llu\n"),
	    "mismatched replication level: pool uses %llu-way %s "
	    "and new vdev uses %llu-way %s\n"),
/*
 * Given a list of toplevel vdevs, determine if the configuration is
 * self-consistent.  If the config is inconsistent, return false.  If 'fatal'
 * is set, then an error message will be displayed for each self-inconsistent
 * vdev.
 */

/*
 * For separate logs we ignore the top level vdev replication checks.
 */

	/* This is a 'file' or 'disk' vdev. */

	/* If this is a raidz device, update the parity. */

/*
 * At this point, we have the replication of the last toplevel
 * vdev in 'rep'.  Compare it to the pool or the slog.
 */

/*
 * Check the replication level of the vdev spec against the current pool.
 * Calls valid_replication() to make sure the new spec is self-consistent.
 * If the pool has a consistent replication level, then we ignore any errors.
 * Otherwise, report any difference between the two.
 */

/*
 * If we have a current pool configuration, check to see if it's
 * self-consistent.  If not, simply return success.
 */

/*
 * For spares there may be no children, and therefore no
 * replication level to check.
 */

/*
 * Get the replication level of the new vdev spec, reporting any
 * inconsistencies found.
 */

/*
 * Check to see if the new vdev spec matches the replication level of
 * the current pool.
 */

/*
 * Go through and find any whole disks in the vdev specification, labelling
 * them as appropriate.  When constructing the vdev spec, we were unable to
 * open this device in order to provide a devid.  Now that we have labelled
 * the disk and know that slice 0 is valid, we can construct the devid now.
 *
 * If the disk was already labeled with an EFI label, we will have gotten the
 * devid already (because we were able to open the whole disk).  Otherwise,
 * we need to get the devid after we label the disk.
 */

/*
 * We have a disk device.  Get the path to the device
 * and see if it's a whole disk by appending the backup
 * slice and stat()ing the device.
 */

	/* Fill in the devid, now that we've labeled the disk. */
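The replication-level comparisons behind the "mismatched replication level" messages above can be modeled with a small struct. This is a hypothetical sketch: the type and field names are illustrative, not the ones libzfs uses; the three mismatch cases correspond to the three message formats (type, parity, and N-way width).

```c
#include <string.h>

/* Minimal model of a top-level vdev's replication level. */
typedef struct rep_level {
	const char *rl_type;		/* e.g. "mirror", "raidz", "disk" */
	unsigned long long rl_parity;	/* raidz parity, 0 otherwise */
	unsigned long long rl_children;	/* leaf count ("N-way") */
} rep_level_t;

/*
 * Return 0 if the two levels match, or a nonzero code for the first
 * mismatch found: 1 = vdev type, 2 = device parity, 3 = width.
 */
static int
rep_level_cmp(const rep_level_t *a, const rep_level_t *b)
{
	if (strcmp(a->rl_type, b->rl_type) != 0)
		return (1);	/* "both %s and %s vdevs are present" */
	if (a->rl_parity != b->rl_parity)
		return (2);	/* "%llu and %llu device parity" */
	if (a->rl_children != b->rl_children)
		return (3);	/* "%llu-way and %llu-way" */
	return (0);
}
```

The same comparison serves both cases described above: checking the new spec against itself (each top-level vdev against the previous one) and checking it against the existing pool's level; only the error strings differ.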
/*
 * Update the path to refer to the slice.  The presence of
 * the 'whole_disk' field indicates to the CLI that we should
 * chop off the slice number when displaying the device in
 * question.
 */

/*
 * Determine if the given path is a hot spare within the given configuration.
 */

/*
 * Go through and find any devices that are in use.  We rely on libdiskmgt
 * for the majority of this task.
 */

/*
 * As a generic check, we look to see if this is a replace of a
 * hot spare within the same pool.  If so, we allow it
 * regardless of what libdiskmgt or zpool_in_use() says.
 */

	} else if (*p == '0') {
		return (NULL);	/* no zero prefixes allowed */
	}

/*
 * Construct a syntactically valid vdev specification,
 * and ensure that all devices and files exist and can be opened.
 *
 * It is sufficient to only record the failure cause as we fail immediately,
 * and the caller will issue the entire error message with zfs_error().
 */
	    "invalid vdev specification: "));
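The zero-prefix rule in the surviving code fragment above governs how a parity suffix after "raidz" is parsed. A self-contained sketch (illustrative function name, not the libzfs helper) under the assumption that "raidz" with no suffix means single parity:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the digits following "raidz" as a parity level.  A leading
 * '0' is rejected, so "raidz2" yields 2 but "raidz02" is invalid.
 * Returns -1 for a malformed name.
 */
static int
parse_raidz_parity(const char *name)
{
	const char *p;

	if (strncmp(name, "raidz", 5) != 0)
		return (-1);
	p = name + 5;
	if (*p == '\0')
		return (1);	/* plain "raidz" means single parity */
	if (*p == '0')
		return (-1);	/* no zero prefixes allowed */
	for (const char *q = p; *q != '\0'; q++) {
		if (!isdigit((unsigned char)*q))
			return (-1);
	}
	return (atoi(p));
}
```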
/*
 * If it's a mirror or raidz, the subsequent arguments are
 * its leaves -- until we encounter the next mirror or raidz.
 */
	    "%s'%s' can be specified only "
	    "%s'%s' can be specified only "

/*
 * A log is not a real grouping device.
 * We just set is_log and continue.
 */
	    "%s'%s' can be specified only "
2N/A "%s%s requires at least %d devices"),
2N/A "%s%s supports no more then %d devices"),
2N/A * We have a device. Pass off to make_leaf_vdev() to 2N/A * construct the appropriate nvlist describing the vdev. 2N/A "%sat least one toplevel vdev must be specified"),
msg);
2N/A * Finally, create nvroot and add all top-level vdevs to it. 2N/A /* avoid any tricks in the spec */ 2N/A "invalid vdev specification: cannot use " 2N/A "'%s' as a device for splitting"),
2N/A /* zpool_vdev_split() does the error messaging here */ 2N/A * Get and validate the contents of the given vdev specification. This ensures 2N/A * that the nvlist returned is well-formed, that all the devices exist, and that 2N/A * they are not currently in use by any other known consumer. The 'poolconfig' 2N/A * parameter is the current configuration of the pool when adding devices 2N/A * existing pool, and is used to perform additional checks, such as changing the 2N/A * replication level of the pool. It can be 'NULL' to indicate that this is a 2N/A * new pool. The 'force' flag controls whether devices should be forcefully 2N/A * added, even if they appear in use. 2N/A "Unable to build pool from specified devices"));
2N/A * Construct the vdev specification. If this is successful, we know 2N/A * that we have a valid specification, and that all devices can be 2N/A * Validate each device to make sure that its not shared with another 2N/A * subsystem. We do this even if 'force' is set, because there are some 2N/A * uses (such as a dedicated dump device) that even '-f' cannot 2N/A * Check the replication level of the given vdevs and report any errors 2N/A * found. We include the existing pool spec, if any, as we need to 2N/A * catch changes against the existing replication level. 2N/A * Run through the vdev specification and label any whole disks found.