nfs4_client.c revision 84d68d8e929eb898bba40f17b9966212f1a66de8
/*
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each file.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 */

/*
 * Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

/*
 * Copyright (c) 1983,1984,1985,1986,1987,1988,1989 AT&T.
 * All Rights Reserved
 */

#pragma ident	"%Z%%M%	%I%	%E% SMI"

/*
 * Arguments to page-flush thread.
 */

/* temporary: panic if v_type is inconsistent with r_attr va_type */

/*
 * Attributes caching:
 *
 * Attributes are cached in the rnode in struct vattr form.
 * There is a time associated with the cached attributes (r_time_attr_inval)
 * which tells whether the attributes are valid.  The time is initialized
 * to the difference between current time and the modify time of the vnode
 * when new attributes are cached.  This allows the attributes for
 * files that have changed recently to be timed out sooner than for files
 * that have not changed for a long time.  There are minimum and maximum
 * timeout values that can be set per mount point.
 */

/*
 * If a cache purge is in progress, wait for it to finish.
 *
 * The current thread must not be in the middle of a
 * ..._start_op()/..._end_op() region; otherwise there could be a deadlock
 * between this thread, a recovery thread, and the page flush thread.
 */

/*
 * Validate caches by checking cached attributes.  If they have timed out,
 * then get new attributes from the server.  As a side effect, cache
 * invalidation is done if the attributes have changed.
 *
 * If the attributes have not timed out and if there is a cache
 * invalidation being done by some other thread, then wait until that
 * thread has completed the cache invalidation.
 */

/*
 * Fill in attribute from the cache.
 * If valid, then return 0 to indicate that no error occurred,
 * otherwise return 1 to indicate that an error occurred.
 */

/* Cached attributes are valid */

/*
 * If the returned error is ESTALE, flush all caches.  The
 * nfs4_purge_caches() call is synchronous because all the pages were
 * invalidated by the nfs4_invalidate_pages() call.
 */

/* Ensure that the ..._end_op() call has been done */

/*
 * Purge all of the various NFS `data' caches.  If "asyncpg" is TRUE, the
 * page purge is done asynchronously.
 */

/*
 * Purge the DNLC for any entries which refer to this file.
 */
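The adaptive timeout described above can be sketched as a standalone model. This is a hypothetical illustration, not the kernel code itself; the names `attr_timeout`, `acmin`, and `acmax` are illustrative stand-ins for the per-mount minimum/maximum (cf. acregmin/acregmax), and times are in seconds for simplicity:

```c
#include <assert.h>

/*
 * Hypothetical sketch: the cache-validity window grows with the time
 * since the file last changed, clamped to per-mount min/max values.
 */
typedef long long secs_t;

static secs_t
attr_timeout(secs_t now, secs_t mtime, secs_t acmin, secs_t acmax)
{
	secs_t delta = now - mtime;	/* time since last detected change */

	if (delta < acmin)
		delta = acmin;		/* enforce per-mount minimum */
	if (delta > acmax)
		delta = acmax;		/* enforce per-mount maximum */
	return (delta);
}
```

A recently modified file (small delta) is clamped up to the minimum, so it is revalidated soon; a long-unchanged file saturates at the maximum.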
/*
 * Clear any readdir state bits and purge the readlink response cache.
 * Purge the pathconf cache too.
 */

/*
 * Flush the page cache.  If the current thread is the page flush
 * thread, don't initiate a new page flush.  There's no need for
 * it, and doing it correctly is hard.
 */

/*
 * We don't hold r_statelock while creating the
 * thread, in case the call blocks.  So we use a
 * flag to indicate that a page flush thread is
 * active.
 */

/*
 * Flush the readdir response cache.
 */

/*
 * Invalidate all pages for the given file, after writing back the dirty
 * pages.
 */

/*
 * Page flush thread.
 */

/* remember which thread we are, so we don't deadlock ourselves */

/*
 * Purge the readdir cache of all entries which are not currently
 * being filled.
 */

/*
 * Set attributes cache for given vnode using virtual attributes.  There is
 * no cache validation, but if the attributes are deemed to be stale, they
 * are ignored.  This corresponds to nfs3_attrcache().
 *
 * Set the timeout value on the attribute cache and fill it
 * with the passed in attributes.
 */

/*
 * Use the passed in virtual attributes to check to see whether the
 * data and metadata caches are valid, cache the new attributes, and
 * then do the cache invalidation if required.
 *
 * The cache validation and caching of the new attributes is done
 * atomically via the use of the mutex, r_statelock.  If required,
 * the cache invalidation is done atomically w.r.t. the cache
 * validation and caching of the attributes via the pseudo lock,
 * r_serial.
 *
 * This routine is used to do cache validation and attributes caching
 * for operations with a single set of post operation attributes.
 */

/* Is curthread the recovery thread? */

/*
 * If we're the recovery thread, then purge current attrs
 * and bail out to avoid potential deadlock between another
 * thread caching attrs (r_serial thread), recov thread,
 * and an async writer thread.
 */
/*
 * If there is a page flush thread, the current thread needs to
 * bail out, to prevent a possible deadlock between the current
 * thread, the recovery thread, and the page flush thread.  Expire the
 * attribute cache, so that any attributes the current thread was
 * going to set are not lost.
 */

/*
 * Attributes have been cached since these attributes were
 * made, so don't act on them.
 */

/*
 * Only directory modifying callers pass non-NULL cinfo.
 */

/*
 * If the cache timeout either doesn't exist or hasn't expired,
 * and the dir didn't change on the server before the dirmod op,
 * and the dir didn't change after the dirmod op but before the getattr,
 * then there's a chance that the client's cached data for
 * this object is current (not stale).  No immediate cache
 * flush is required.
 */

/*
 * ... the cached attrs cannot be blindly trusted.  For this case,
 * we tell nfs4_attrcache_va to cache the attrs but also
 * establish an absolute maximum cache timeout.  When
 * the timeout is reached, caches will be flushed.
 */

/*
 * We're not sure exactly what changed, but we know
 * what to do: flush all caches for the dir.  Remove the ...
 *
 * a) timeout expired.  flush all caches.
 * b) r_change != cinfo.before.  flush all caches.
 * c) r_change == cinfo.before, but cinfo.after !=
 *    post-op getattr(change).  flush all caches.
 * d) post-op getattr(change) not provided by server.
 */

/*
 * If we're the recov thread, then force async nfs4_purge_caches
 * to avoid potential deadlock.
 */

/*
 * Set attributes cache for given vnode using virtual attributes.
 *
 * Set the timeout value on the attribute cache and fill it
 * with the passed in attributes.
 *
 * The caller must be holding r_statelock.
 */

/* Switch to master before checking v_flag */

/*
 * Only establish a new cache timeout (if requested).  Never
 * extend a timeout.  Never clear a timeout.  Clearing a timeout
 * is done by nfs4_update_dircaches (ancestor in our call chain).
 */

/*
 * Delta is the number of nanoseconds that we will
 * cache the attributes of the file.  It is based on
 * the number of nanoseconds since the last time that
 * we detected a change.  The assumption is that files
 * that changed recently are likely to change again.
 * There is a minimum and a maximum for regular files
 * and for directories, which is enforced though.
 *
 * Using the time since last change was detected
 * eliminates direct comparison or calculation
 * using mixed client and server times.  NFS does
 * not make any assumptions regarding the client
 * and server clocks being synchronized.
 */

/*
 * The attributes that were returned may be valid and can
 * be used, but they may not be allowed to be cached.
 * Reset the timers to cause immediate invalidation and
 * clear r_change so no VERIFY operations will succeed.
 */

/*
 * If mounted_on_fileid was returned AND the object is a stub,
 * then set the object's va_nodeid to the mounted over fid.
 */

/*
 * Just set it to 0 for now.  Eventually it would be
 * better to set it to a hashed version of FH.  This
 * would probably be good enough to provide a unique
 * fileid.
 */

/*
 * We don't need to carry mounted_on_fileid in the
 * rnode as long as the client never requests fileid
 * without also requesting mounted_on_fileid.
 */

/*
 * Check to see if there are valid pathconf bits to
 * cache.
 */

/*
 * Update the size of the file if there is no cached data or if
 * the cached data is clean and there is no data being written
 * out.
 */

/*
 * Get attributes over-the-wire and update attributes cache
 * if no error occurred in the over-the-wire operation.
 * Return 0 if successful, otherwise error.
 */

/* Save the original mount point security flavor */

/*
 * When doing a getattr on a node that is a stub for a crossed
 * mount point, keep the original secinfo flavor for
 * the current file system, not the crossed one.
 */

/*
 * Generate a compound to get attributes over-the-wire.
 *
 * Unlike NFS versions 2 and 3, where getattr returns all the
 * attributes, NFS version 4 returns only the ones explicitly
 * asked for.  This creates problems, as some system functions
 * (e.g. cache check) require certain attributes, and if the
 * cached node lacks some attributes such as uid/gid, it can
 * affect system utilities (e.g. "ls") that rely on the information
 * being there.  This can lead to anything from system crashes to
 * corrupted information processed by user apps.
 *
 * So to ensure that all bases are covered, request at least
 * the AT_ALL attribute mask.
 */

/*
 * Return either cached or remote attributes.  If we get remote
 * attributes, use them to check and invalidate caches, then cache the
 * new attributes.
 *
 * If we've got cached attributes, we're done; otherwise go
 * to the server to get attributes, which will update the cache
 * in the process.
 */

/* Cached attributes are valid */

/* Return the client's view of file size */

/* Return the client's view of file size */

	    "nfs4_attr_otw: %s call, rp %s", needrecov ? "recov" : "first",

	    "nfs4_attr_otw: initiating recovery\n"));
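The flush decision enumerated in cases (a)-(d) above can be sketched as a small predicate. This is a hypothetical standalone model; the parameter names (`timeout_expired`, `have_postop_change`, etc.) are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch of cases (a)-(d): returns nonzero when all caches
 * for the directory must be flushed, 0 when the cached data may still
 * be current.
 */
static int
dir_caches_stale(int timeout_expired, uint64_t r_change,
    uint64_t cinfo_before, uint64_t cinfo_after,
    int have_postop_change, uint64_t postop_change)
{
	if (timeout_expired)			/* case (a) */
		return (1);
	if (r_change != cinfo_before)		/* case (b) */
		return (1);
	if (!have_postop_change)		/* case (d) */
		return (1);
	if (cinfo_after != postop_change)	/* case (c) */
		return (1);
	return (0);	/* dir unchanged around the dirmod op */
}
```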
/*
 * Asynchronous I/O parameters.  nfs_async_threads is the high-water mark
 * for the demand-based allocation of async threads per-mount.  The
 * nfs_async_timeout is the amount of time a thread will live after it
 * becomes idle, unless new I/O requests are received before the thread
 * dies.  See nfs4_async_putpage and nfs4_async_start.
 */

/*
 * Cross-zone thread creation and NFS access is disallowed, yet fsflush()
 * and pageout(), running in the global zone, have legitimate reasons to do
 * VOP_PUTPAGE(B_ASYNC) on other zones' NFS mounts.  We avoid the problem by
 * use of a per-mount "asynchronous requests manager thread" which is
 * signaled by the various asynchronous work routines when there is
 * asynchronous work to be done.  It is responsible for creating new
 * worker threads if necessary, and notifying existing worker threads
 * that there is work to be done.
 *
 * In other words, it will "take the specifications from the customers and
 * give them to the engineers."
 *
 * Worker threads die off of their own accord if they are no longer
 * needed.
 *
 * This thread is killed when the zone is going away or the filesystem
 * is being unmounted.
 */

/*
 * We want to stash the max number of threads that this mount was
 * allowed so we can use it later when the variable is set to zero as
 * part of the zone/mount going away.
 *
 * We want to be able to create at least one thread to handle
 * asynchronous inactive calls.
 */

/*
 * We don't want to wait for mi_max_threads to go to zero, since that
 * happens as part of a failed unmount, but this thread should only
 * exit when the mount is really going away.
 *
 * Once MI4_ASYNC_MGR_STOP is set, no more async operations will be
 * attempted: the various _async_*() functions know to do things
 * inline if mi_max_threads == 0.  Henceforth we just drain out the
 * remaining requests.
 *
 * Note that we still create zthreads even if we notice the zone is
 * shutting down (MI4_ASYNC_MGR_STOP is set); this may cause the zone
 * shutdown sequence to take slightly longer in some cases, but
 * doesn't violate the protocol, as all threads will exit as soon as
 * they're done processing the remaining requests.
 */

/*
 * Paranoia: If the mount started out having
 * (mi->mi_max_threads == 0), and the value was
 * later changed (via a debugger or somesuch),
 * we could be confused since we will think we
 * can't create any threads, and the calling
 * code (which looks at the current value of
 * mi->mi_max_threads, now non-zero) thinks we
 * can.
 *
 * So, because we're paranoid, we create threads
 * up to the maximum of the original and the
 * current value.  This means that future
 * (debugger-induced) alterations of
 * mi->mi_max_threads are ignored for our
 * purposes, but who told them they could change
 * random values on a live kernel anyhow?
 */

	    "nfs4_async_manager exiting for vfs %p\n", (void *)mi->mi_vfsp));
/*
 * Let everyone know we're done.
 */

/*
 * Wake up the inactive thread.
 */

/*
 * Wake up anyone sitting in nfs4_async_manager_stop()
 */

/*
 * There is no explicit call to mutex_exit(&mi->mi_async_lock)
 * since CALLB_CPR_EXIT is actually responsible for releasing
 * 'mi_async_lock'.
 */

/*
 * Signal (and wait for) the async manager thread to clean up and go away.
 */

/*
 * Wait for the async manager thread to die.
 */

/*
 * If addr falls in a different segment, don't bother doing readahead.
 */

/*
 * If we can't allocate a request structure, punt on the readahead.
 */

/*
 * If a lock operation is pending, don't initiate any new
 * readaheads.  Otherwise, bump r_count to indicate the new
 * readahead.
 */

/*
 * If asyncio has been disabled, don't bother readahead.
 */

/*
 * Link request structure into the async list and
 * wakeup async thread to do the i/o.
 */

/*
 * The async queues for each mounted file system are arranged as a
 * set of queues, one for each async i/o type.  Requests are taken
 * from the queues in a round-robin fashion.  A number of consecutive
 * requests are taken from each queue before moving on to the next
 * queue.  This functionality may allow the NFS Version 2 server to do
 * write clustering, even if the client is mixing writes and reads,
 * because it will take multiple write requests from the queue
 * before processing any of the other async i/o types.
 *
 * XXX The nfs4_async_start thread is unsafe in the light of the present
 * model defined by cpr to suspend the system.  Specifically, over-the-
 * wire calls are cpr-unsafe.  The thread should be reevaluated in
 * case of future updates to the cpr model.
 */

/*
 * Dynamic initialization of nfs_async_timeout to allow nfs to be
 * built in an implementation independent manner.
 */

/*
 * Find the next queue containing an entry.  We start
 * at the current queue pointer and then round robin
 * through all of them until we either find a non-empty
 * queue or have looked through all of them.
 */

/*
 * If we didn't find an entry, then block until woken up
 * again and then look through the queues again.
 */
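The round-robin scan described above can be sketched in isolation. This is a hypothetical model: `NQUEUES` and the `qlen[]` array are illustrative stand-ins for the per-type async request queues, not the kernel's data structures:

```c
#include <assert.h>

/*
 * Hypothetical sketch: starting at the current queue index, find the
 * next queue with work queued, wrapping around; -1 means every queue
 * is empty and the caller should block until woken up.
 */
#define	NQUEUES	6

static int
next_nonempty_queue(const int qlen[NQUEUES], int start)
{
	int i;

	for (i = 0; i < NQUEUES; i++) {
		int q = (start + i) % NQUEUES;
		if (qlen[q] > 0)
			return (q);	/* found a non-empty queue */
	}
	return (-1);	/* all queues empty */
}
```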
/*
 * Exiting is considered to be safe for CPR as well.
 */

/*
 * Wakeup thread waiting to unmount the file
 * system only if all async threads are inactive.
 *
 * If we've timed-out and there's nothing to do,
 * then get rid of this thread.
 */

/*
 * Remove the request from the async queue and then
 * update the current async request queue pointer.  If
 * the current queue is empty or we have removed enough
 * consecutive entries from it, then reset the counter
 * for this queue and then move the current pointer to
 * the next queue.
 */

/*
 * Obtain arguments from the async request structure.
 */

/*
 * Now, release the vnode and free the credentials
 * structure.
 */

/*
 * Reacquire the mutex because it will be needed above.
 */

/*
 * nfs4_inactive_thread - look for vnodes that need over-the-wire calls as
 * part of their inactivation.
 */

/*
 * We don't want to exit until the async manager is done
 * with its work; hence the check for mi_manager_thread.
 *
 * The async manager thread will cv_broadcast() on
 * mi_inact_req_cv when it's done, at which point we'll
 * wake up and exit.
 */

/*
 * There is no explicit call to mutex_exit(&mi->mi_async_lock) since
 * CALLB_CPR_EXIT is actually responsible for releasing 'mi_async_lock'.
 */

	    "nfs4_inactive_thread exiting for vfs %p\n", (void *)vfsp));
/*
 * Wait for all outstanding putpage operations and the inactive thread to
 * complete; nfs4_async_stop_sig() without interruptibility.
 */

/*
 * Wait for all outstanding async operations to complete and for
 * worker threads to exit.
 */

/*
 * Wait for the inactive thread to finish doing what it's doing.  It
 * won't exit until the last reference to the vfs_t goes away.
 */

/*
 * Wait for all outstanding putpage operations and the inactive thread to
 * complete.  If a signal is delivered we will abort and return non-zero;
 * otherwise return 0.  Since this routine is called from nfs4_unmount,
 * we need to make it interruptible.
 */

/*
 * Wait for all outstanding putpage operations to complete and for
 * worker threads to exit.
 */

/*
 * Wait for the inactive thread to finish doing what it's doing.  It
 * won't exit until the last reference to the vfs_t goes away.
 */

/*
 * If we can't allocate a request structure, do the putpage
 * operation synchronously in this thread's context.
 */

/*
 * If asyncio has been disabled, then make a synchronous request.
 * This check is done a second time in case async io was disabled
 * while this thread was blocked waiting for memory pressure to
 * reduce or for the queue to drain.
 */

/*
 * Link request structure into the async list and
 * wakeup async thread to do the i/o.
 */

/*
 * ... or we have run out of memory or we're attempting to
 * unmount, we refuse to do a sync write, because this may hang;
 * we just re-mark the page as dirty and punt on the page.
 */

/*
 * Make sure B_FORCE isn't set.  We can re-mark the
 * pages as dirty and unlock the pages in one swoop by
 * passing in B_ERROR to pvn_write_done().  However,
 * we should make sure B_FORCE isn't set - we don't
 * want the page tossed before it gets written out.
 */

/*
 * We'll get here only if (nfs_zone() != mi->mi_zone),
 * which means that this was a cross-zone sync putpage.
 *
 * We pass in B_ERROR to pvn_write_done() to re-mark the pages
 * as dirty and unlock them.
 *
 * We don't want to clear B_FORCE here as the caller presumably
 * knows what they're doing if they set it.
 */
/*
 * If we can't allocate a request structure, do the pageio
 * request synchronously in this thread's context.
 */

/*
 * If asyncio has been disabled, then make a synchronous request.
 * This check is done a second time in case async io was disabled
 * while this thread was blocked waiting for memory pressure to
 * reduce or for the queue to drain.
 */

/*
 * Link request structure into the async list and
 * wakeup async thread to do the i/o.
 */

/*
 * If we can't do it ASYNC, for reads we do nothing (but cleanup
 * the page list), for writes we do it synchronously, except for
 * ... we refuse to do a sync write, because this may hang;
 * re-mark the page as dirty and punt on the page.
 */

/*
 * Make sure B_FORCE isn't set.  We can re-mark the
 * pages as dirty and unlock the pages in one swoop by
 * passing in B_ERROR to pvn_write_done().  However,
 * we should make sure B_FORCE isn't set - we don't
 * want the page tossed before it gets written out.
 */

/*
 * So this was a cross-zone sync pageio.  We pass in B_ERROR
 * to pvn_write_done() to re-mark the pages as dirty and unlock
 * them.
 *
 * We don't want to clear B_FORCE here as the caller presumably
 * knows what they're doing if they set it.
 */

/*
 * If we can't allocate a request structure, skip the readdir.
 */

/*
 * If asyncio has been disabled, then skip this request.
 */

/*
 * Link request structure into the async list and
 * wakeup async thread to do the i/o.
 */

/*
 * Indicate that no one is trying to fill this entry and
 * it still needs to be filled.
 */

/*
 * If we can't allocate a request structure, do the commit
 * operation synchronously in this thread's context.
 */

/*
 * If asyncio has been disabled, then make a synchronous request.
 * This check is done a second time in case async io was disabled
 * while this thread was blocked waiting for memory pressure to
 * reduce or for the queue to drain.
 */

/*
 * Link request structure into the async list and
 * wakeup async thread to do the i/o.
 */

/*
 * nfs4_async_inactive - hand off a VOP_INACTIVE call to a thread.  The
 * reference to the vnode is handed over to the thread; the caller should
 * no longer refer to the vnode.
 *
 * Unlike most of the async routines, this handoff is needed for
 * correctness reasons, not just performance.  So doing operations in the
 * context of the current thread is not an option.
 */

/*
 * Note that we don't check mi->mi_max_threads here, since we
 * *need* to get rid of this vnode regardless of whether someone
 * set nfs4_max_threads to zero in /etc/system.
 *
 * The manager thread knows about this and is willing to create
 * at least one thread to accommodate us.
 */

/*
 * We just need to free up the memory associated with the
 * vnode, which can be safely done from within the current
 * context.
 */

/*
 * No need to explicitly throw away any cached pages.  The
 * eventual r4inactive() will attempt a synchronous
 * VOP_PUTPAGE() which will immediately fail since the request
 * is coming from the wrong zone, and then will proceed to call
 * nfs4_invalidate_pages() which will clean things up for us.
 */

/*
 * Throw away the delegation here so rp4_addfree()'s attempt to
 * return any existing delegations becomes a no-op.
 */

/*
 * We want to talk to the inactive thread.
 */

/*
 * Enqueue the vnode and wake up either the special thread (empty
 * list) or an async thread.
 */

/*
 * Move bytes in at most PAGESIZE chunks.  We must avoid
 * spanning pages in uiomove() because page faults may cause
 * the cache to be invalidated out from under us.  The r_size is not
 * updated until after the uiomove.  If we push the last page of a
 * file before r_size is correct, we will lose the data written past
 * the current (and invalid) r_size.
 */

/*
 * n is the number of bytes required to satisfy the request
 * or the number of bytes to fill out the page.
 */

/*
 * Check to see if we can skip reading in the page
 * and just allocate the memory.  We can do this
 * if we are going to rewrite the entire mapping
 * or if we are going to write to or beyond the current
 * end of file from the beginning of the mapping.
 *
 * The read of r_size is now protected by r_statelock.
 */
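The computation of "n" described above (rest of the request, or just enough to fill out the current page) can be sketched as a standalone helper. This is a hypothetical illustration with an illustrative name, assuming the page size is a power of two:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: bytes to copy this iteration, chosen so that the
 * copy never spans a page boundary.
 */
static size_t
chunk_len(unsigned long long off, size_t resid, size_t pagesize)
{
	/* bytes from off to the end of the page containing off */
	size_t to_page_end = pagesize - (size_t)(off & (pagesize - 1));

	return (resid < to_page_end ? resid : to_page_end);
}
```

Looping with `off += n; resid -= n;` then moves the data in at most page-sized, page-aligned steps.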
/*
 * When pgcreated is nonzero the caller has already done
 * a segmap_getmapflt with forcefault 0 and S_WRITE.  With
 * segkpm this means we already have at least one page
 * created and mapped at base.
 */

/*
 * The last argument tells segmap_pagecreate() to
 * always lock the page, as opposed to sometimes
 * returning with the page locked.  This way we avoid a
 * fault on the ensuing uiomove(), but also
 * more importantly (to fix bug 1094402) we can
 * call segmap_fault() to unlock the page in all
 * cases.  An alternative would be to modify
 * segmap_pagecreate() to tell us when it is
 * locking a page, but that's a fairly major
 * interface change.
 */

/*
 * The number of bytes of data in the last page cannot
 * be accurately determined while the page is being
 * uiomove'd to and the size of the file being updated.
 * Thus, inform threads which need to know accurately
 * how much data is in the last page of the file.  They
 * will not do the i/o immediately, but will arrange for
 * the i/o to happen later when this modify operation
 * will have finished.
 */

/*
 * Copy data.  If new pages are created, part of
 * the page that is not written will be initialized
 * with zeros.
 */

/*
 * r_size is the maximum number of
 * bytes known to be in the file.
 * Make sure it is at least as high as the
 * first unwritten byte pointed to by uio_loffset.
 */

/* n = # of bytes written */

/*
 * If we created pages w/o initializing them completely,
 * we need to zero the part that wasn't set up.
 * This happens on most EOF write cases and if
 * we had some sort of error during the uiomove.
 */

/*
 * Caller is responsible for this page,
 * it was not created in this loop.
 */

/*
 * For bug 1094402: segmap_pagecreate locks
 * page.  Unlock it.  This also unlocks the
 * pages allocated by page_create_va() in
 * segmap_pagecreate().
 */

/*
 * If R4OUTOFSPACE is set, then all writes turn into B_INVAL
 * writes.  B_FORCE is set to force the VM system to actually
 * invalidate the pages, even if the i/o failed.  The pages
 * need to get invalidated because they can't be written out
 * because there isn't any space left on either the server's
 * file system or in the user's disk quota.  The B_FREE bit
 * is cleared to avoid confusion as to whether this is a
 * request to place the page on the freelist or to destroy
 * it.
 */

/*
 * If doing a full file synchronous operation, then clear
 * the R4DIRTY bit.  If a page gets dirtied while the flush
 * is happening, then R4DIRTY will get set again.  The
 * R4DIRTY bit must get cleared before the flush so that
 * we don't lose this information.
 *
 * If there are no full file async write operations
 * pending and the R4DIRTY bit is set, clear it.
 */

/*
 * Search the entire vp list for pages >= off, and flush
 * the dirty pages.
 */

/*
 * If an error occurred and the file was marked as dirty
 * before and we aren't forcibly invalidating pages, then
 * reset the R4DIRTY flag.
 */

/*
 * Do a range from [off...off + len) looking for pages
 * to deal with.
 */

/*
 * If we are not invalidating, synchronously
 * freeing or writing pages, use the routine
 * page_lookup_nowait() to prevent reclaiming
 * them from the free list.
 */

/*
 * "io_off" and "io_len" are returned as
 * the range of pages we actually wrote.
 * This allows us to skip ahead more quickly
 * since several pages may've been dealt
 * with by this iteration of the loop.
 */

/* this is a read-only kstat.  Bail out on a write */

/*
 * We don't want to wait here as kstat_chain_lock could be held by
 * dounmount().  dounmount() takes vfs_reflock before the chain lock
 * and thus could lead to a deadlock.
 */

/*
 * The sv_secdata holds the flavor the client specifies.
 * If the client uses default and a security negotiation
 * occurs, sv_currsec will point to the current flavor
 * selected from the server flavor list.
 * sv_currsec is NULL if no security negotiation takes place.
 */

/*
 * PSARC 2001/697 Contract Private Interface
 * All nfs kstats are under SunMC contract.
 * Please refer to the PSARC listed above and contact
 * SunMC before making any changes!
 *
 * Changes must be reviewed by Solaris File Sharing.
 * Changes must be communicated to contract-2001-697@sun.com.
 */

/*
 * In case of forced unmount, do not print any messages
 * since it can flood the console with error messages.
 */

/*
 * If the mount point is dead, not recoverable, do not
 * print error messages that can flood the console.
 */

/*
 * No use in flooding the console with ENOSPC
 * messages from the same file system.
 */

	    "^File: userid=%d, groupid=%d\n",
	    "^User: userid=%d, groupid=%d\n",
	    "nfs_bio: cred is%s kcred\n",
/*
 * Return non-zero if the given file can be safely memory mapped.  Locks
 * are safe if whole-file (length and offset are both zero).
 */

/*
 * Review all the locks for the vnode, both ones that have been
 * acquired and ones that are pending.  We assume that
 * flk_active_locks_for_vp() has merged any locks that can be
 * merged (so that if a process has the entire file locked, it is
 * represented as a single lock).
 *
 * Note that we can't bail out of the loop if we find a non-safe
 * lock, because we have to free all the elements in the llp list.
 * We might be able to speed up this code slightly by not looking
 * at each lock's l_start and l_len fields once we've found a
 * non-safe lock.
 */

	    "nfs4_safemap: unsafe active lock (%" PRId64
	    safe ? "safe" : "unsafe"));
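The whole-file criterion stated above (safe if length and offset are both zero) reduces to a one-line predicate. A hypothetical sketch with an illustrative name:

```c
#include <assert.h>

/*
 * Hypothetical sketch: a lock is safe for memory mapping only if it
 * covers the whole file, i.e. both offset and length are zero.
 */
static int
lock_is_safe(long long l_start, long long l_len)
{
	return (l_start == 0 && l_len == 0);
}
```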
/*
 * Return whether there is a lost LOCK or LOCKU queued up for the given
 * file that would make an mmap request unsafe.  cf. nfs4_safemap().
 */

			continue;	/* different file */

/*
 * If the vnode has a lock that makes it unsafe to cache the file, mark it
 * as non cachable (set the VNOCACHE bit).
 */

/*
 * The cached attributes of the file are stale after acquiring
 * the lock on the file.  They were updated when the file was
 * opened, but not updated when the lock was acquired.  Therefore the
 * cached attributes are invalidated after the lock is obtained.
 */

/*
 * Callback routine to tell all NFSv4 mounts in the zone to start tearing
 * down state and killing off threads.
 */

	    "nfs4_mi_shutdown zone %d\n", zoneid));

	    "nfs4_mi_shutdown stopping vfs %p\n", (void *)mi->mi_vfsp));
/*
 * Purge the DNLC for this filesystem.
 */

/*
 * Tell existing async worker threads to exit.
 */

/*
 * Set the appropriate flags, signal and wait for both the
 * async manager and the inactive thread to exit when they're
 * done with their current work.
 */

/*
 * Wait for the inactive thread to exit.
 */

/*
 * Wait for the recovery thread to complete, that is, it will
 * signal when it is done using the "mi" structure and about
 * to exit.
 */

/*
 * We're done when every mi has been done or the list is empty.
 */

/*
 * This one is done, remove it from the list.
 */

/*
 * Release the hold on vfs and mi done to prevent a race with zone
 * shutdown.  This releases the hold in nfs4_mi_zonelist_add.
 */

/*
 * Tell each renew thread in the zone to exit.
 */

/*
 * We add another hold onto the nfs4_server_t
 * because this will make sure that the nfs4_server_t
 * stays around until nfs4_callback_fini_zone destroys
 * the zone.  This way, the renew thread can
 * unconditionally release its holds on the
 * nfs4_server_t.
 */

	    "nfs4_mi_destroy zone %d\n", zoneid));

	/* Still waiting for VFS_FREEVFS() */

/*
 * Add an NFS mount to the per-zone list of NFS mounts.
 */

/*
 * hold added to eliminate race with zone shutdown - this will be
 * released in mi_shutdown
 */

/*
 * Remove an NFS mount from the per-zone list of NFS mounts.
 */

	/* if this mi is marked dead, then the zone already released it */

	/* release the holds put on in zonelist_add(). */

/*
 * We can be called asynchronously by VFS_FREEVFS() after the zone
 * shutdown.
 */

/*
 * Destroy the oo hash lists and mutexes for the cred hash table.
 */

	/* Destroy any remaining open owners on the list */

/*
 * Empty and destroy the freed open owner list.
 */

/*
 * Add a CPR callback so that we can update the client
 * lease after a suspend and resume.
 */

/*
 * Initialise the reference count of the notsupp xattr cache vnode to 1
 * so that it never goes away (VOP_INACTIVE isn't called on it).
 */

/*
 * We get called for Suspend and Resume events.
 * For the suspend case we simply don't care!
 * When we get to here we are in the process of
 * resuming the system from a previous suspend.
 */

	    "nfs4_renew_lease_thread: acting on sp 0x%p", (void *)sp));

	/* sp->s_lease_time is set via a GETATTR */

	    "nfs4_renew_lease_thread: no renew : thread "
	    "nfs4_renew_lease_thread: no renew : "
	    "state_ref_count %d, lease_valid %d",
	    "nfs4_renew_lease_thread: no renew: "
	    "nfs4_renew_lease_thread: tmp_time %ld, "
	    "nfs4_renew_lease_thread: valid lease: sleep for %ld "
	    "nfs4_renew_lease_thread: valid lease: time left %ld :"
	    "sp last_renewal_time %ld, nfs4_client_resumed %ld, "

/*
 * Issue RENEW op since we haven't renewed the lease ...
 */

/*
 * Need to re-acquire sp's lock; nfs4renew() ...
 */

/*
 * See if someone changed s_thread_exit while we gave
 * up s_lock.
 */

/*
 * Check to see if we implicitly renewed while
 * we waited for a reply for our RENEW call.
 */

	/* no implicit renew came */

	    "implicit renewal before reply "
	    "from server for RENEW"));

	    "renew_thread: nfs4renew returned error"

	    "nfs4_renew_lease_thread: thread exiting"));

	    "nfs4_renew_lease_thread: waiting for outstanding "
	    "otw calls to finish for sp 0x%p, current "
	    "s_otw_call_count %d", (void *)sp,

	    "nfs4_renew_lease_thread: renew thread exit officially"));
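The "time left" arithmetic that the renew thread's debug messages refer to can be sketched in isolation. This is a hypothetical model with illustrative names and units (seconds), not the kernel's actual computation:

```c
#include <assert.h>

/*
 * Hypothetical sketch: how much of the lease period remains, given the
 * last renewal time and the lease period (sp->s_lease_time, set via a
 * GETATTR per the comment above).
 */
static long
lease_time_left(long now, long last_renewal_time, long s_lease_time)
{
	long left = (last_renewal_time + s_lease_time) - now;

	return (left > 0 ? left : 0);	/* 0 means the lease has expired */
}
```

A renew thread would sleep for some fraction of this value and issue a RENEW before it reaches zero.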
/*
 * Send out a RENEW op to the server.
 * Assumes sp is locked down.
 */

	/* Check to see if we're dealing with a marked-dead sp */

	/* Make sure mi hasn't changed on us */

	/* Must drop sp's lock to avoid a recursive mutex enter */

	/* used to figure out RTT for sp */

	    "nfs4renew: %s call, sp 0x%p", needrecov ? "recov" : "first",

/*
 * If the server returns CB_PATH_DOWN, it has renewed
 * the lease and informed us that the callback path is
 * down.  Since the lease is renewed, just return 0 and
 * let the renew thread proceed as normal.
 */

	    "nfs4renew: initiating recovery\n"));

	/* fall through for res.status case */

/*
 * XXX need to try every mntinfo4 in sp->mntinfo4_list
 * to renew the lease on that server.
 */

	/* this locks down sp if it is found */

/*
 * Bump the number of OPEN files (ie: those with state) so we know if this
 * nfs4_server has any state to maintain a lease for or not.
 *
 * Also, marks the nfs4_server's lease valid if it hasn't been done so
 * already.
 */

	    "nfs4_inc_state_ref_count: state_ref_count now %d",
/*
 * If this call caused the lease to be marked valid and/or
 * took the state_ref_count from 0 to 1, then start the timer.
 */

	/* update the number of open files for mi */

	/* this locks down sp if it is found */

/*
 * Decrement the number of OPEN files (ie: those with state) so we know if
 * this nfs4_server has any state to maintain a lease for or not.
 */

	    "nfs4_dec_state_ref_count: state ref count now %d",

	    "nfs4_dec_state_ref_count: mi open files %d, v4 flags 0x%x",

	/* We don't have to hold the mi_lock to test mi_flags */

	    "nfs4_dec_state_ref_count: remove mntinfo4 %p since "
	    "we have closed the last open file", (void *)mi));
/*
 * Return non-zero if the given nfs4_server_t is going through recovery.
 */

/*
 * Compare two shared filehandle objects.  Returns -1, 0, or +1, if the
 * first is less than, equal to, or greater than the second.
 */

/*
 * Create a table for shared filehandle objects.
 */

/*
 * Return a shared filehandle object for the given filehandle.  The caller
 * is responsible for eventually calling sfh4_rele().
 */

	/*
	 * We allocate the largest possible filehandle size because it's
	 * not that big, and it saves us from possibly having to resize the
	 * buffer later.
	 */

	/* free our speculative allocs */

/*
 * Return a shared filehandle object for the given filehandle.  The caller
 * is responsible for eventually calling sfh4_rele().
 *
 * If there's already an object for the given filehandle, bump the
 * reference count and return it.  Otherwise, create a new object
 * and add it to the AVL tree.
 */
	    "sfh4_get: found existing %p, new refcnt=%d",

/*
 * Get a reference to the given shared filehandle object.
 */
	    (CE_NOTE, "sfh4_hold %p, new refcnt=%d",
/*
 * Release a reference to the given shared filehandle object and null out
 * the given pointer.
 */
	    "sfh4_rele %p, new refcnt=%d",

	/*
	 * Possibly the last reference, so get the lock for the table in
	 * case it's time to remove the object from the table.
	 */
	    "sfh4_rele %p, new refcnt=%d",
	    "sfh4_rele %p, last ref", (void *)sfh));
/*
 * Update the filehandle for the given shared filehandle object.
 *
 * The basic plan is to remove the shared filehandle object from
 * the table, update it to have the new filehandle, then reinsert
 * it.
 *
 * XXX If there is already a shared filehandle object with the new
 * filehandle, we're in trouble, because the rnode code assumes
 * that there is only one shared filehandle object for a given
 * filehandle.  So issue a warning (for read-write mounts only)
 * and don't try to re-insert the given object into the table.
 * Hopefully the given object will quickly go away and everyone
 * will use the new object.
 */
	    "duplicate filehandle detected");
/*
 * Copy out the current filehandle for the given shared filehandle object.
 */

/*
 * Print out the filehandle for the given shared filehandle object.
 */

/*
 * Compare 2 fnames.  Returns -1 if the first is "less" than the second, 0
 * if they're the same, +1 if the first is "greater" than the second.  The
 * caller (or whoever's calling the AVL package) is responsible for
 * handling locking issues.
 */

	/*
	 * The AVL package wants +/-1, not arbitrary positive or negative
	 * integers.
	 */

/*
 * Get or create an fname with the given name, as a child of the given
 * fname.  The caller is responsible for eventually releasing the reference
 * (fn_rele()).  parent may be NULL.
 *
 * If there's already an fname registered with the given name, bump
 * its reference count and return it.  Otherwise, create a new one
 * and add it to the parent's AVL tree.
 */
	    "fn_get %p:%s, a new nfs4_fname_t!",

	    "fn_hold %p:%s, new refcnt=%d",

/*
 * Decrement the reference count of the given fname, and destroy it if its
 * reference count goes to zero.  Nulls out the given pointer.
 */
	    "fn_rele %p:%s, new refcnt=%d",
	    "fn_rele %p:%s, last reference, deleting...",
	/*
	 * Recursively fn_rele the parent.
	 * Use goto instead of a recursive call to avoid stack overflow.
	 */

/*
 * Returns the single component name of the given fname, in a MAXNAMELEN
 * string buffer, which the caller is responsible for freeing.  Note that
 * the name may become invalid as a result of fn_move().
 */

/*
 * This function, used only by fn_path, constructs
 * a new string which looks like "prepend" + "/" + "current"
 * by allocating a new string and freeing the old one.
 */

	/*
	 * Prime the pump, allocate just the
	 * space for prepend and return that.
	 */

	/*
	 * Allocate the space for a new string.
	 * +1 +1 is for the "/" and the NULL
	 * byte at the end of it all.
	 */

/*
 * Returns the path name (starting from the fs root) for the given fname.
 * The caller is responsible for freeing.  Note that the path may be or
 * become invalid as a result of fn_move().
 */

	/* walk up the tree constructing the pathname. */

		/*
		 * Add fn_name in front of the current path.
		 */

/*
 * Return a reference to the parent of the given fname, which the caller is
 * responsible for eventually releasing.
 */

/*
 * Update fnp so that its parent is newparent and its name is newname.
 */

	/*
	 * This assert exists to catch the client trying to rename
	 * a dir to be a child of itself.  This happened at a recent
	 * bakeoff against a 3rd party (broken) server which allowed
	 * the rename to succeed.  If it trips it means that:
	 *	a) the code in nfs4rename that detects this case is broken
	 *	b) the server is broken (since it allowed the bogus rename)
	 *
	 * For non-DEBUG kernels, prepare for a recursive mutex_enter
	 * panic below from:
	 *	mutex_enter(&newparent->fn_lock);
	 */

	/*
	 * Remove fnp from its current parent, change its name, then add it
	 * to newparent.
	 */

		/*
		 * This could be due to a file that was unlinked while
		 * open, or perhaps the rnode is in the free list.  Remove
		 * it from newparent and let it go away on its own.  The
		 * contorted code is to deal with lock order issues.
		 */

/*
 * Return non-zero if the type information makes sense for the given vnode.
 */