ztest.c revision be9000cc677e0a8d04e5be45c61d7370fc8c7b54
/*
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each file.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 */

/*
 * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
 * Copyright (c) 2012 by Delphix. All rights reserved.
 * Copyright 2011 Nexenta Systems, Inc. All rights reserved.
 * Copyright (c) 2013 Steven Hartland. All rights reserved.
 */

/*
 * The objective of this program is to provide a DMU/ZAP/SPA stress test
 * that runs entirely in userland, is easy to use, and easy to extend.
 *
 * The overall design of the ztest program is as follows:
 *
 * (1) For each major functional area (e.g. adding vdevs to a pool,
 *     creating and destroying datasets, reading and writing objects, etc)
 *     we have a simple routine to test that functionality.  These
 *     individual routines do not have to do anything "stressful".
 *
 * (2) We turn these simple functionality tests into a stress test by
 *     running them all in parallel, with as many threads as desired,
 *     and spread across as many datasets, objects, and vdevs as desired.
 *
 * (3) While all this is happening, we inject faults into the pool to
 *     verify that self-healing data really works.
 *
 * (4) Every time we open a dataset, we change its checksum and compression
 *     functions.  Thus even individual objects vary from block to block
 *     in which checksum they use and whether they're compressed.
 */
/*
 * (5) To verify that we never lose on-disk consistency after a crash,
 *     we run the entire test in a child of the main process.
 *     At random times, the child self-immolates with a SIGKILL.
 *     This is the software equivalent of pulling the power cord.
 *     The parent then runs the test again, using the existing
 *     storage pool, as many times as desired.  If backwards compatibility
 *     testing is enabled ztest will sometimes run the "older" version
 *     of ztest after a SIGKILL.
 *
 * (6) To verify that we don't have future leaks or temporal incursions,
 *     many of the functional tests record the transaction group number
 *     as part of their data.  When reading old data, they verify that
 *     the transaction group number is less than the current, open txg.
 *     If you add a new test, please do this if applicable.
 *
 * When run with no arguments, ztest runs for about five minutes and
 * produces no output if successful.  To get a little bit of information,
 * specify -V.  To get more information, specify -VV, and so on.
 *
 * To turn this into an overnight stress test, use -T to specify run time.
 *
 * You can ask for more vdevs [-v], datasets [-d], or threads [-t]
 * to increase the pool capacity, fanout, and overall stress level.
 *
 * Use the -k option to set the desired frequency of kills.
 *
 * When ztest invokes itself it passes all relevant information through a
 * temporary file which is mmap-ed in the child process.  This allows shared
 * memory to survive the exec syscall.  The ztest_shared_hdr_t struct is always
 * stored at offset 0 of this file and contains information on the size and
 * number of shared structures in the file.
 */
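The kill-and-restart scheme in (5) can be sketched as a small harness. This is a minimal illustration, not the real ztest entry points: `run_one_pass()` is a hypothetical stand-in for one ztest pass, and the real program also re-maps the shared state file before each pass.

```c
/*
 * Sketch of ztest's crash-consistency loop.  run_one_pass() is a
 * hypothetical stand-in for one pass of pool I/O; the child may
 * self-immolate with SIGKILL at a random time, and the parent just
 * runs another pass against the same on-disk state.
 */
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

static void
run_one_pass(void)
{
	/* ... a real pass would exercise the pool here ... */
	exit(0);
}

int
crash_test(int passes)
{
	for (int i = 0; i < passes; i++) {
		pid_t pid = fork();
		if (pid == -1)
			return (-1);
		if (pid == 0) {			/* child */
			if (rand() % 2)		/* software power cord pull */
				(void) kill(getpid(), SIGKILL);
			run_one_pass();
		}
		int status;
		if (waitpid(pid, &status, 0) == -1)
			return (-1);
		/* dying of SIGKILL is expected; any other signal is a bug */
		if (WIFSIGNALED(status) && WTERMSIG(status) != SIGKILL)
			return (-1);
	}
	return (0);
}
```

The parent deliberately treats a SIGKILL death as success; only an unexpected signal (e.g. SIGSEGV) fails the run.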
/*
 * The information stored in this file must remain backwards compatible with
 * older versions of ztest so that ztest can invoke them during backwards
 * compatibility testing (-B).
 */

/*
 * XXX -- fix zfs range locks to be generic so we can use them here.
 */

/*
 * Note: these aren't static because we want dladdr() to work.
 */

/*
 * The following struct is used to hold a list of uncalled commit callbacks.
 * The callbacks are ordered by txg number.
 */

/*
 * Stuff we need to share writably between parent and child.
 */

/*
 * The ztest_name_lock protects the pool and dataset namespace used by
 * the individual tests.  To modify the namespace, consumers must grab
 * this lock as writer.  Grabbing the lock as reader will ensure that the
 * namespace does not change while the lock is held.
 */

/* Global commit callback list */

/*
 * These libumem hooks provide a reasonable set of defaults for the allocator's
 * debugging facilities.
 */
const char *
_umem_debug_init(void)
{
	return ("default,verbose");	/* $UMEM_DEBUG setting */
}

const char *
_umem_logging_init(void)
{
	return ("fail,contents");	/* $UMEM_LOGGING setting */
}

	"\t[-v vdevs (default: %llu)]\n"
	"\t[-s size_of_each_vdev (default: %s)]\n"
	"\t[-a alignment_shift (default: %d)] use 0 for random\n"
	"\t[-m mirror_copies (default: %d)]\n"
	"\t[-r raidz_disks (default: %d)]\n"
	"\t[-R raidz_parity (default: %d)]\n"
	"\t[-d datasets (default: %d)]\n"
	"\t[-t threads (default: %d)]\n"
	"\t[-g gang_block_threshold (default: %s)]\n"
	"\t[-i init_count (default: %d)] initialize pool i times\n"
	"\t[-k kill_percentage (default: %llu%%)]\n"
	"\t[-p pool_name (default: %s)]\n"
	"\t[-f dir (default: %s)] file directory for vdev files\n"
	"\t[-V] verbose (use multiple times for ever more blather)\n"
	"\t[-E] use existing pool instead of creating new one\n"
	"\t[-T time (default: %llu sec)] total run time\n"
	"\t[-F freezeloops (default: %llu)] max loops in spa_freeze()\n"
	"\t[-P passtime (default: %llu sec)] time per pass\n"
	"\t[-B alt_ztest (default: <none>)] alternate ztest path\n"

	while ((opt = getopt(argc, argv,
	    "v:s:a:m:r:R:d:t:g:i:k:p:f:VET:P:hF:B:")) != EOF) {
	/*
	 * 'cmd' should be of the form "<anything>/usr/bin/<isa>/ztest".
	 * We want to extract <isa> to determine if we should use
	 * the 32-bit or the 64-bit binary.
	 */

    int log, int r, int m, int t)

	for (c = 0; c < t; c++) {
/*
 * Find a random spa version.  Returns a random spa version in the
 * range [initial_version, SPA_VERSION_FEATURES].
 */

	(void) printf("%s %s = %s at '%s'\n",
	/*
	 * Attempt to assign tx to some transaction group.
	 */

	char *name = (void *)(lr + 1);		/* name follows lr */

	char *name = (void *)(lr + 1);		/* name follows lr */

	    sizeof (*lr) - sizeof (lr_t));

	    sizeof (*lr) - sizeof (lr_t));

	    sizeof (*lr) - sizeof (lr_t));

	char *name = (void *)(lr + 1);		/* name follows lr */

	char *name = (void *)(lr + 1);		/* name follows lr */

	void *data = lr + 1;			/* data follows lr */

	/* If it's a dmu_sync() block, write the whole block */

	/*
	 * Usually, verify the old data before writing new data --
	 * but not always, because we also want to verify correct
	 * behavior when the data was not recently read into cache.
	 */

	/*
	 * Writes can appear to be newer than the bonus buffer because
	 * the ztest_get_data() callback does a dmu_read() of the
	 * open-context data, which may be different than the data
	 * as it was when the write was generated.
	 */

	/* ... so that all of the usual ASSERTs will work. */

	/*
	 * Randomly change the size and increment the generation.
	 */

	/*
	 * Verify that the current bonus buffer is not newer than our txg.
	 */
	NULL,			/* 0 no such transaction type */
	NULL,			/* TX_CREATE_ACL */
	NULL,			/* TX_CREATE_ATTR */
	NULL,			/* TX_CREATE_ACL_ATTR */
	NULL,			/* TX_MKDIR_ATTR */
	NULL,			/* TX_MKDIR_ACL_ATTR */

	if (buf != NULL) {	/* immediate write */

/*
 * Lookup a bunch of objects.  Returns the number of objects not found.
 */
	for (int i = 0; i < count; i++, od++) {

	for (int i = 0; i < count; i++, od++) {

	lr->lr_foid = 0;	/* 0 to allocate, > 0 to claim */

	for (int i = count - 1; i >= 0; i--, od--) {

	/*
	 * Pick an i/o type at random, biased toward writing block tags.
	 */

	/*
	 * Induce fletcher2 collisions to ensure that
	 * zio_ddt_collision() detects and resolves them
	 * when using fletcher2-verify for deduplication.
	 */

	/*
	 * Initialize an object description template.
	 */

	/*
	 * Lookup or create the objects for a test using the od template.
	 * If the objects do not all exist, or if 'remove' is specified,
	 * remove any existing objects and create new ones.  Otherwise,
	 * use the existing objects.
	 */

	/*
	 * Remember the committed values in zd, which is in parent/child
	 * shared memory.  If we die, the next iteration of ztest_run()
	 * will verify that the log really does contain this record.
	 */

/*
 * This function is designed to simulate the operations that occur during a
 * mount/unmount operation.  We hold the dataset across these operations in an
 * attempt to expose any implicit assumptions about ZIL management.
 */

	/*
	 * We grab the zd_dirobj_lock to ensure that no other thread is
	 * updating the zil (i.e. adding in-memory log records) and the
	 * zd_zilog_lock to block any I/O.
	 */

/*
 * Verify that we can't destroy an active pool, create an existing pool,
 * or create a pool with a bad vdev spec.
 */

	/*
	 * Attempt to create using a bad file.
	 */

	/*
	 * Attempt to create using a bad mirror.
	 */

	/*
	 * Attempt to create an existing pool.  It shouldn't matter
	 * what's in the nvroot; we should fail with EEXIST.
	 */

	/*
	 * Clean up from previous runs.
	 */

	/*
	 * If we're configuring a RAIDZ device then make sure that the
	 * initial version is capable of supporting that feature.
	 */

	/*
	 * Create a pool with a spa version that can be upgraded.  Pick
	 * a value between initial_version and SPA_VERSION_BEFORE_FEATURES.
	 */
	(void) printf("upgrading spa version from %llu to %llu\n",

/*
 * Find the first available hole which can be used as a top-level.
 */

/*
 * Verify that vdev_add() works as expected.
 */

	/*
	 * If we have slogs then remove them 1/4 of the time.
	 */

	/*
	 * Grab the guid from the head of the log class rotor.
	 */

	/*
	 * We have to grab the zs_name_lock as writer to
	 * prevent a race between removing a slog (dmu_objset_find)
	 * and destroying a dataset.  Removing the slog will
	 * grab a reference on the dataset which may cause
	 * dmu_objset_destroy() to fail with EBUSY thus
	 * leaving the dataset in an inconsistent state.
	 */

	/*
	 * Make 1/4 of the devices be log devices.
	 */

/*
 * Verify that adding/removing aux devices (l2arc, hot spare) works as expected.
 */

	/*
	 * Pick a random device to remove.
	 */

	/*
	 * Find an unused device we can add.
	 */

	/*
	 * Remove an existing device.  Sometimes, dirty its
	 * vdev state first to make sure we handle removal
	 * of devices that have pending state changes.
	 */

/*
 * split a pool if it has mirror tlvdevs
 */

	/* ensure we have a useable config; mirrors of raidz aren't supported */

	/* clean up the old pool, if any */

	/* generate a config from the existing config */

	/* OK, create a config that can be used to split */
		(void) printf("successful split - results:\n");

/*
 * Verify that we can attach and detach devices.
 */

	/*
	 * Decide whether to do an attach or a replace.
	 */

	/*
	 * Pick a random top-level vdev.
	 */

	/*
	 * Pick a random leaf within it.
	 */

	/*
	 * If we're already doing an attach or replace, oldvd may be a
	 * mirror vdev -- in which case, pick a random child.
	 */

	/*
	 * If oldvd has siblings, then half of the time, detach it.
	 */

	/*
	 * For the new vdev, choose with equal probability between the two
	 * standard paths (ending in either 'a' or 'b') or a random hot spare.
	 */

	/*
	 * Make newsize a little bigger or smaller than oldsize.
	 * If it's smaller, the attach should fail.
	 * If it's larger, and we're doing a replace,
	 * we should get dynamic LUN growth when we're done.
	 */

	/*
	 * If pvd is not a mirror or root, the attach should fail with ENOTSUP,
	 * unless it's a replace; in that case any non-replacing parent is OK.
	 *
	 * If newvd is already part of the pool, it should fail with EBUSY.
	 *
	 * If newvd is too small, it should fail with EOVERFLOW.
	 */

	/*
	 * Build the nvlist describing newpath.
	 */

	/*
	 * If our parent was the replacing vdev, but the replace completed,
	 * then instead of failing with ENOTSUP we may either succeed,
	 * fail with ENODEV, or fail with EOVERFLOW.
	 */

	/*
	 * If someone grew the LUN, the replacement may be too small.
	 */

	/* XXX workaround 6690467 */
"attach (%s %llu, %s %llu, %d) " "returned %d, expected %d",
* Callback function which expands the physical size of the vdev. (
	(void) printf("%s grew from %lu to %lu bytes\n",

/*
 * Callback function which expands a given vdev by calling vdev_online().
 */

	/* Calling vdev_online will initialize the new metaslabs */

	/*
	 * If vdev_online returned an error or the underlying vdev_open
	 * failed then we abort the expand.  The only way to know that
	 * vdev_open fails is by checking the returned newstate.
	 */
		(void) printf("Unable to expand vdev, state %llu, "

	/*
	 * Since we dropped the lock we need to ensure that we're
	 * still talking to the original vdev.  It's possible this
	 * vdev may have been detached or replaced in the meantime.
	 */

		(void) printf("vdev configuration has changed, "
		    "guid %llu, state %llu, expected gen %llu, "

/*
 * Traverse the vdev tree calling the supplied function.
 * We continue to walk the tree until we either have walked all
 * children or we receive a non-NULL return from the callback.
 * If a NULL callback is passed, then we just return back the first
 * leaf vdev we encounter.
 */

/*
 * Verify that dynamic LUN growth works as expected.
 */

	/*
	 * Determine the size of the first leaf vdev associated with
	 * our top-level device.
	 */

	/*
	 * We only try to expand the vdev if it's healthy, less than 4x its
	 * original size, and it has a valid psize.
	 */
		(void) printf("Expanding LUN %s from %lu to %lu\n",

	/*
	 * Growing the vdev is a two step process:
	 *	1). expand the physical size (i.e. relabel)
	 *	2). online the vdev to create the new metaslabs
	 */
		(void) printf("Could not expand LUN because "
		    "the vdev configuration changed.\n");

	/*
	 * Expanding the LUN will update the config asynchronously,
	 * thus we must wait for the async thread to complete any
	 * pending tasks before proceeding.
	 */

		(void) printf("Could not verify LUN expansion due to "
		    "intervening vdev offline or remove.\n");

	/*
	 * Make sure we were able to grow the vdev.
	 */
		fatal(0, "LUN expansion failed: ms_count %llu <= %llu\n",

	/*
	 * Make sure we were able to grow the pool.
	 */
		fatal(0, "LUN expansion failed: class_space %llu <= %llu\n",

		(void) printf("%s grew from %s to %s\n",

/*
 * Verify that dmu_objset_{create,destroy,open,close} work as expected.
 */

	/*
	 * Create the objects common to all ztest datasets.
	 */
		(void) printf("Setting dataset %s to sync always\n", dsname);

	/*
	 * Verify that the dataset contains a directory object.
	 */

	/* We could have crashed in the middle of destroying it */

	/*
	 * If this dataset exists from a previous run, process its replay log
	 * half of the time.  If we don't replay it, then dmu_objset_destroy()
	 * (invoked from ztest_objset_destroy_cb()) should just throw it away.
	 */

	/*
	 * There may be an old instance of the dataset we're about to
	 * create lying around from a previous run.  If so, destroy it
	 * and all of its snapshots.
	 */

	/*
	 * Verify that the destroyed dataset is no longer in the namespace.
	 */

	/*
	 * Verify that we can create a new dataset.
	 */

	/*
	 * Open the intent log for it.
	 */

	/*
	 * Put some objects in there, do a little I/O to them,
	 * and randomly take a couple of snapshots along the way.
	 */
	for (int i = 0; i < iters; i++) {

	/*
	 * Verify that we cannot create an existing dataset.
	 */

	/*
	 * Verify that we can hold an objset that is also owned.
	 */

	/*
	 * Verify that we cannot own an objset that is already owned.
	 */

/*
 * Verify that dmu_snapshot_{create,destroy,open,close} work as expected.
 */

/*
 * Cleanup non-standard snapshots and clones.
 */

/*
 * Verify dsl_dataset_promote handles EBUSY
 */

/*
 * Verify that dmu_object_{alloc,free} work as expected.
 */

	/*
	 * Destroy the previous batch of objects, create a new batch,
	 * and do some I/O on the new objects.
	 */

/*
 * Verify that dmu_{read,write} work as expected.
 *
 * This test uses two objects, packobj and bigobj, that are always
 * updated together (i.e. in the same tx) so that their contents are
 * in sync and can be compared.  Their contents relate to each other
 * in a simple way: packobj is a dense array of 'bufwad' structures,
 * while bigobj is a sparse array of the same bufwads.  Specifically,
 * for any index n, there are three bufwads that should be identical:
 *
 *	packobj, at offset n * sizeof (bufwad_t)
 *	bigobj, at the head of the nth chunk
 *	bigobj, at the tail of the nth chunk
 *
 * The chunk size is arbitrary.  It doesn't have to be a power of two,
 * and it doesn't have any relation to the object blocksize.
 * The only requirement is that it can hold at least two bufwads.
 *
 * Normally, we write the bufwad to each of these locations.
 * However, free_percent of the time we instead write zeroes to
 * packobj and perform a dmu_free_range() on bigobj.  By comparing
 * bigobj to packobj, we can verify that the DMU is correctly
 * tracking which parts of an object are allocated and free,
 * and that the contents of the allocated blocks are correct.
 */

	/*
	 * Read the directory info.  If it's the first time, set things up.
	 */

	/*
	 * Prefetch a random chunk of the big object.
	 * Our aim here is to get some async reads in flight
	 * for blocks that we may free below; the DMU should
	 * handle this race correctly.
	 */

	/*
	 * Pick a random index and compute the offsets into packobj and bigobj.
	 */
	/*
	 * free_percent of the time, free a range of bigobj rather than
	 * overwriting it.
	 */

	/*
	 * Read the current contents of our objects.
	 */

	/*
	 * Get a tx for the mods to both packobj and bigobj.
	 */

	/*
	 * For each index from n to n + s, verify that the existing bufwad
	 * in packobj matches the bufwads at the head and tail of the
	 * corresponding chunk in bigobj.  Then update all three bufwads
	 * with the new values we want to write out.
	 */
	for (i = 0; i < s; i++) {
			fatal(0, "future leak: got %llx, open txg is %llx",

			fatal(0, "wrong index: got %llx, wanted %llx+%llx",

	/*
	 * We've verified all the old bufwads, and made new ones.
	 */

		(void) printf("freeing offset %llx size %llx"

		(void) printf("writing offset %llx size %llx"

	/*
	 * Sanity check the stuff we just wrote.
	 */

	/*
	 * For each index from n to n + s, verify that the existing bufwad
	 * in packobj matches the bufwads at the head and tail of the
	 * corresponding chunk in bigobj.  Then update all three bufwads
	 * with the new values we want to write out.
	 */
	for (i = 0; i < s; i++) {

			fatal(0, "future leak: got %llx, open txg is %llx",

			fatal(0, "wrong index: got %llx, wanted %llx+%llx",

/*
 * This test uses two objects, packobj and bigobj, that are always
 * updated together (i.e. in the same tx) so that their contents are
 * in sync and can be compared.  Their contents relate to each other
 * in a simple way: packobj is a dense array of 'bufwad' structures,
 * while bigobj is a sparse array of the same bufwads.  Specifically,
 * for any index n, there are three bufwads that should be identical:
 *
 *	packobj, at offset n * sizeof (bufwad_t)
 *	bigobj, at the head of the nth chunk
 *	bigobj, at the tail of the nth chunk
 *
 * The chunk size is set equal to bigobj block size so that
 * dmu_assign_arcbuf() can be tested for object updates.
 */

	/*
	 * Read the directory info.  If it's the first time, set things up.
	 */

	/*
	 * Pick a random index and compute the offsets into packobj and bigobj.
	 */

	/*
	 * Iteration 0 test zcopy for DB_UNCACHED dbufs.
	 * Iteration 1 test zcopy to already referenced dbufs.
	 * Iteration 2 test zcopy to dirty dbuf in the same txg.
	 * Iteration 3 test zcopy to dbuf dirty in previous txg.
	 * Iteration 4 test zcopy when dbuf is no longer dirty.
	 * Iteration 5 test zcopy when it can't be done.
	 * Iteration 6 one more zcopy write.
	 */
	for (i = 0; i < 7; i++) {

		/*
		 * In iteration 5 (i == 5) use arcbufs
		 * that don't match bigobj blksz to test
		 * dmu_assign_arcbuf() when it can't directly
		 * assign an arcbuf to a dbuf.
		 */
		for (j = 0; j < s; j++) {

		/*
		 * Get a tx for the mods to both packobj and bigobj.
		 */
		for (j = 0; j < s; j++) {

		/*
		 * 50% of the time don't read objects in the 1st iteration to
		 * test dmu_assign_arcbuf() for the case when there're no
		 * existing dbufs for the specified offsets.
		 */

		/*
		 * We've verified all the old bufwads, and made new ones.
		 */

			(void) printf("writing offset %llx size %llx"

		/*
		 * Sanity check the stuff we just wrote.
		 */

/*
 * Have multiple threads write to large offsets in an object
 * to verify that parallel writes to an object -- even to the
 * same blocks within the object -- don't cause any trouble.
 */

/*
 * Verify that zap_{create,destroy,add,remove,update} work as expected.
 */

	/*
	 * Generate a known hash collision, and verify that
	 * we can lookup and remove both entries.
	 */
	for (i = 0; i < 2; i++) {

	for (i = 0; i < 2; i++) {

	for (i = 0; i < 2; i++) {

	/*
	 * Generate a bunch of random entries.
	 */

	/*
	 * If these zap entries already exist, validate their contents.
	 */
	for (i = 0; i < ints; i++) {

	/*
	 * Atomically update two entries in our zap object.
	 * The first is named txg_%llu, and contains the txg
	 * in which the property was last updated.  The second
	 * is named prop_%llu, and the nth element of its value
	 * should be txg + object + n.
	 */
	for (i = 0; i < ints; i++)

	/*
	 * Remove a random pair of entries.
	 */

/*
 * Testcase to test the upgrading of a microzap to fatzap.
 */

	/*
	 * Add entries to this ZAP and make sure it spills over
	 * and gets upgraded to a fatzap.  Also, since we are adding
	 * 2050 entries we should see ptrtbl growth and leaf-block split.
	 */
	for (int i = 0; i < 2050; i++) {

	/*
	 * Generate a random name of the form 'xxx.....' where each
	 * x is a random printable character and the dots are dots.
	 * There are 94 such characters, and the name length goes from
	 * 6 to 20, so there are 94^3 * 15 = 12,458,760 possible names.
	 */

	/*
	 * Select an operation: length, lookup, add, update, remove.
	 */
"name '%s' != val '%s' len %d",
/* This is the actual commit callback function */ fatal(0,
		fatal(0, "commit callback of txg %" PRIu64 " called prematurely"

	/*
	 * The private callback data should be destroyed here, but
	 * since we are going to check the zcd_called field after
	 * dmu_tx_abort(), we will destroy it there.
	 */

	/* Was this callback added to the global callback list? */

	/* Remove our callback from the list */

	/* Allocate and initialize callback data structure */

/*
 * If a number of txgs equal to this threshold have been created after a commit
 * callback has been registered but not called, then we assume there is an
 * implementation bug.
 */

	/* Every once in a while, abort the transaction on purpose */

	/*
	 * It's not a strict requirement to call the registered
	 * callbacks from inside dmu_tx_abort(), but that's what
	 * is supposed to happen in the current implementation
	 * so we will check for that.
	 */
	for (i = 0; i < 2; i++) {

	for (i = 0; i < 2; i++) {

	/*
	 * Read existing data to make sure there isn't a future leak.
	 */

	/*
	 * Since commit callbacks don't have any ordering requirement and since
	 * it is theoretically possible for a commit callback to be called
	 * after an arbitrary amount of time has elapsed since its txg has been
	 * synced, it is difficult to reliably determine whether a commit
	 * callback hasn't been called due to high load or due to a flawed
	 * implementation.
	 *
	 * In practice, we will assume that if after a certain number of txgs a
	 * commit callback hasn't been called, then most likely there's an
	 * implementation bug.
	 */

		fatal(0, "Commit callback threshold exceeded, oldest txg: %"

	/*
	 * Let's find the place to insert our callbacks.
	 *
	 * Even though the list is ordered by txg, it is possible for the
	 * insertion point to not be the end because our txg may already be
	 * quiescing at this point and other callbacks in the open txg
	 * (from other objsets) may have sneaked in.
	 */

	/* Add the 3 callbacks to the list */
	for (i = 0; i < 3; i++) {
	/*
	 * Clean up from any previous run.
	 */

	/*
	 * Create snapshot, clone it, mark snap for deferred destroy,
	 * destroy clone, verify snap was also destroyed.
	 */

		fatal(0, "dsl_destroy_snapshot(%s, B_TRUE) = %d",

	/*
	 * Create snapshot, add temporary hold, verify that we can't
	 * destroy a held snapshot, mark for deferred destroy,
	 * release hold, verify snapshot was destroyed.
	 */

		fatal(0, "dsl_destroy_snapshot(%s, B_FALSE) = %d",

		fatal(0, "dsl_destroy_snapshot(%s, B_TRUE) = %d",

/*
 * Inject random faults into the on-disk data.
 */

	/*
	 * Grab the name lock as reader.  There are some operations
	 * which don't like to have their vdevs changed while
	 * they are in progress (i.e. spa_change_guid).  Those
	 * operations will have grabbed the name lock as writer.
	 */

	/*
	 * We need SCL_STATE here because we're going to look at vd0->vdev_tsd.
	 */

	/*
	 * Inject errors on a normal data device or slog device.
	 */

		/*
		 * Generate paths to the first leaf in this top-level vdev,
		 * and to the random leaf we selected.  We'll induce transient
		 * faults on the first leaf, and we'll write random garbage
		 * to the randomly chosen leaf.
		 */

		/*
		 * If the top-level vdev needs to be resilvered
		 * then we only allow faults on the device that is
		 * being replaced.
		 */

		/*
		 * Make vd0 explicitly claim to be unreadable,
		 * or unwriteable, or reach behind its back
		 * and close the underlying fd.  We can do this if
		 * maxfaults == 0 because we'll fail and reexecute,
		 * and we can do it if maxfaults >= 2 because we'll
		 * have enough redundancy.  If maxfaults == 1, the
		 * combination of this with injection of random data
		 * corruption below exceeds the pool's fault tolerance.
		 */

	/*
	 * Inject errors on an l2cache device.
	 */

	/*
	 * If we can tolerate two or more faults, or we're dealing
	 * with a slog device, randomly online/offline vd0.
	 */

		/*
		 * We have to grab the zs_name_lock as writer to
		 * prevent a race between offlining a slog and
		 * destroying a dataset.  Offlining the slog will
		 * grab a reference on the dataset which may cause
		 * dmu_objset_destroy() to fail with EBUSY thus
		 * leaving the dataset in an inconsistent state.
		 */

		/*
		 * Ideally we would like to be able to randomly
		 * call vdev_[on|off]line without holding locks
		 * to force unpredictable failures but the side
		 * effects of vdev_[on|off]line prevent us from
		 * doing so.  We grab the ztest_vdev_lock here to
		 * prevent a race between injection testing and
		 * vdev removal.
		 */

	/*
	 * We have at least single-fault tolerance, so inject data corruption.
	 */
		if (fd == -1)	/* we hit a gap in the device namespace */

			fatal(1, "can't inject bad word at 0x%llx in %s",

		(void) printf("injected bad word into %s,"

/*
 * Verify that DDT repair works as expected.
 */

	/*
	 * Take the name lock as writer to prevent anyone else from changing
	 * the pool and dataset properties we need to maintain during this test.
	 */

	/*
	 * Write all the copies of our block.
	 */
	for (int i = 0; i < copies; i++) {

	/*
	 * Find out what block we got.
	 */

	/*
	 * Damage the block.  Dedup-ditto will save us when we read it later.
	 */

	(void) poll(NULL, 0, 100);	/* wait a moment, then force a restart */

/*
 * Change the guid for the pool.
 */

		(void) printf("Changed guid old %llu -> %llu\n",

/*
 * Rename the pool to a different name and then rename it back.
 */

	/*
	 * Try to open it under the old name, which shouldn't exist.
	 */

	/*
	 * Open it under the new name and make sure it's still the same spa_t.
	 */

	/*
	 * Rename it back to the original.
	 */

	/*
	 * Make sure it can still be opened.
	 */

/*
 * Verify pool integrity by running zdb.
 */

	/*
	 * Clean up from previous runs.
	 */

	/*
	 * Get the pool's configuration and guid.
	 */

	/*
	 * Import it under the new name.
	 */

	/*
	 * Try to import it again -- should fail with EEXIST.
	 */

	/*
	 * Try to import it under a different name -- should fail with EEXIST.
	 */

	/*
	 * Verify that the pool is no longer visible under the old name.
	 */

	/*
	 * Verify that we can open and close the pool using the new name.
	 */
	(void) printf("resuming from suspended state\n");

	/*
	 * If the pool is suspended then fail immediately.  Otherwise,
	 * check to see if the pool is making any progress.  If
	 * vdev_deadman() discovers that there hasn't been any recent
	 * I/Os then it will end up aborting the tests.
	 */

		fatal(0, "aborting test after %llu seconds because "
		    "pool has transitioned to a suspended state.",

	(void) printf("ztest has been running for %lld seconds\n",

		(void) printf("%6.2f sec in %s\n",

	/*
	 * See if it's time to force a crash.
	 */

	/*
	 * If we're getting ENOSPC with some regularity, stop.
	 */

	/*
	 * Pick a random function to execute.
	 */

		(void) printf("Destroying %s to free up space\n", name);

	/*
	 * Cleanup any non-standard clones and snapshots.  In general,
	 * ztest thread t operates on dataset (t % zopt_datasets),
	 * so there may be more than one thing to clean up.
	 */

	/*
	 * ZTEST_DIROBJ is the object directory for the entire dataset.
	 * Therefore, the number of objects in use should equal the
	 * number of ZTEST_DIROBJ entries, +1 for ZTEST_DIROBJ itself.
	 * If not, we have an object leak.
	 *
	 * Note that we can only check this in ztest_dataset_open(),
	 * when the open-context and syncing-context values agree.
	 * That's because zap_count() returns the open-context value,
	 * while dmu_objset_space() returns the rootbp fill count.
	 */

		fatal(0, "missing log records: claimed %llu < committed %llu",

		(void) printf("%s replay %llu blocks, %llu records, seq %llu\n",

		fatal(0, "missing log records: replayed %llu < committed %llu",

	/*
	 * Kick off threads to run tests on all datasets in parallel.
	 */

	/*
	 * We don't expect the pool to suspend unless maxfaults == 0,
	 * in which case ztest_fault_inject() temporarily takes away
	 * the only valid replica.
	 */

	/*
	 * Create a thread to periodically resume suspended I/O.
	 */

	/*
	 * Create a deadman thread to abort() if we hang.
	 */

	/*
	 * Verify that we can safely inquire about any object,
	 * whether it's allocated or not.  To make it interesting,
	 * we probe a 5-wide window around each power of two.
	 * This hits all edge cases, including zero and the max.
	 */
	for (int t = 0; t < 64; t++) {
		for (int d = -5; d <= 5; d++) {

	/*
	 * If we got any ENOSPC errors on the previous run, destroy something.
	 */

		(void) printf("starting main threads...\n");

	/*
	 * Kick off all the tests that run in parallel.
	 */

	/*
	 * Wait for all of the tests to complete.  We go in reverse order
	 * so we don't close datasets while threads are still using them.
	 */

	/* Kill the resume thread */

	/*
	 * Right before closing the pool, kick off a bunch of async I/O;
	 * spa_close() should wait for it to complete.
	 */

/*
 * Verify that we can loop over all pools.
 */

/*
 * Verify that we can export the pool and reimport it under a
 * different name.
 */
		(void) printf("testing spa_freeze()...\n");

	/*
	 * Force the first log block to be transactionally allocated.
	 * We have to do this before we freeze the pool -- otherwise
	 * the log chain won't be anchored.
	 */

	/*
	 * Freeze the pool.  This stops spa_sync() from doing anything,
	 * so that the only way to record changes from now on is the ZIL.
	 */

	/*
	 * Run tests that generate log records but don't alter the pool config
	 * or depend on DSL sync tasks (snapshots, objset create/destroy, etc).
	 * We do a txg_wait_synced() after each iteration to force the txg
	 * to increase well beyond the last synced value in the uberblock.
	 * The ZIL should be OK with that.
	 */

	/*
	 * Commit all of the changes we just generated.
	 */

	/*
	 * Close our dataset and close the pool.
	 */

	/*
	 * Open and close the pool and dataset to induce log replay.
	 */

	    "%llud%02lluh%02llum%02llus", d, h, m, s);
/*
 * Create a storage pool with the given name and initial vdev size.
 * Then test spa_freeze() functionality.
 */

	/*
	 * Create the storage pool.
	 */

	if (pid == 0) {	/* child */

	/*
	 * Blow away any existing copy of zpool.cache.
	 */

	/*
	 * Create and initialize our storage pool.
	 */

		(void) printf("ztest_init(), pass %d\n", i);

	/* Override location of zpool.cache */

		(void) printf("%llu vdevs, %d datasets, %d threads,"

		(void) printf("Executing older ztest for "

	/*
	 * Run the tests in a loop.  These tests include fault injection
	 * to verify that self-healing data works, and forced crashes
	 * to verify that we never lose on-disk consistency.
	 */

	/*
	 * Initialize the workload counters for each function.
	 */

	/* Set the allocation switch size */

			(void) printf("Executing newer ztest: %s\n",

			(void) printf("Executing older ztest: %s\n",

			(void) printf("Pass %3d, %8s, %3llu ENOSPC, "
			    "%4.1f%% of %5s used, %3.0f%% done, %8s to go\n",

		(void) printf("\nWorkload summary:\n\n");
		(void) printf("%7s %9s %s\n", "Calls", "Time", "Function");
		(void) printf("%7s %9s %s\n", "-----", "----", "--------");
			(void) printf("%7llu %9s %s\n",

	/*
	 * It's possible that we killed a child during a rename test,
	 * in which case we'll have a 'ztest_tmp' pool lying around
	 * instead of 'ztest'.  Do a blind rename in case this happened.
	 */

	(void) printf("%d killed, %d completed, %.0f%% kill rate\n",