/*
 * See the file LICENSE for redistribution information.
 *
 * Copyright (c) 1996, 1997, 1998
 *      Sleepycat Software. All rights reserved.
 */
#endif /* not lint */

/*
 * Mpool sync function.
 */

/*
 * We try to write the buffers in page order: it should reduce seeks
 * by the underlying filesystem and possibly reduce the actual number
 * of writes. We don't want to hold the region lock while we write
 * the buffers, so only hold the lock while we create a list. Get a
 * good-size block of memory to hold buffer pointers, we don't want
 */

/*
 * If the application is asking about a previous call to memp_sync(),
 * and we haven't found any buffers that the application holding the
 * pin couldn't write, return yes or no based on the current count.
 * Note, if the application is asking about an LSN *smaller* than one
 * we've already handled or are currently handling, then we return a
 * result based on the count for the larger LSN.
 */

/* Else, it's a new checkpoint. */

/*
 * Save the LSN. We know that it's a new LSN or larger than the one
 * for which we were already doing a checkpoint. (BTW, I don't expect
 * to see multiple LSNs from the same or multiple processes, but You
 * Just Never Know. Responding as if they all called with the largest
 * of the LSNs specified makes everything work.)
 *
 * We don't currently use the LSN we save. We could potentially save
 * the last-written LSN in each buffer header and use it to determine
 * which buffers need to be written. The problem with this is that it's
 * sizeof(LSN) more bytes of buffer header. We currently write all the
 * dirty buffers instead.
 *
 * Walk the list of shared memory segments clearing the count of
 * buffers waiting to be written.
 */

/*
 * Walk the list of buffers and mark all dirty buffers to be written
 * and all pinned buffers to be potentially written (we can't know if
 * we'll need to write them until the holding process returns them to
 * the cache). We do this in one pass while holding the region locked
 * so that processes can't make new buffers dirty, causing us to never
 * finish. Since the application may have restarted the sync, clear
 * any BH_WRITE flags that appear to be left over from previous calls.
 *
 * We don't want to pin down the entire buffer cache, otherwise we'll
 * starve threads needing new pages. Don't pin down more than 80% of
 *
 * Keep a count of the total number of buffers we need to write in
 * MPOOL->lsn_cnt, and for each file, in MPOOLFILE->lsn_cnt.
 */

/*
 * If the buffer isn't in use, we should be able to
 * write it immediately, so increment the reference
 * count to lock it and its contents down, and then
 * save a reference to it.
 *
 * If we've run out of space to store buffer references,
 * we're screwed. We don't want to realloc the array
 * while holding a region lock, so we set the flag to
 * force the checkpoint to be done again, from scratch,
 *
 * If we've pinned down too much of the cache, stop and
 * set a flag to force the checkpoint to be tried again
 */

/* If there are no buffers we can write immediately, we're done. */

/* Sort the buffers we're going to write. */

/* Walk the array, writing buffers. */

/*
 * It's possible for a thread to have gotten the buffer since
 * we listed it for writing. If the reference count is still
 * 1, we're the only ones using the buffer, so go ahead and write.
 * If it's >1, then skip the buffer and assume that it will be
 * written when it's returned to the cache.
 */

/* Write the buffer. */

/* Release the buffer. */

/* If there's an error, release the rest of the buffers. */

/*
 * Any process syncing the shared memory buffer pool
 * had better be able to write to any underlying file.
 * Be understanding, but firm, on this point.
 */
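/*
 * Illustration only, not part of the original source: a minimal sketch of
 * the two-pass scheme described above. In the first pass (which would run
 * with the region locked) dirty and pinned buffers are scheduled and the
 * idle dirty ones are pinned into an array; in the second pass the array is
 * sorted and each buffer whose reference count is still 1 is written.
 * struct buf, buf_cmp(), collect() and write_sorted() are hypothetical
 * names, the "scheduled" field stands in for BH_WRITE, and no locking or
 * I/O is actually performed.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified buffer header; not the real BH layout. */
struct buf {
	int file_id;		/* stand-in for the owning file */
	unsigned long pgno;	/* page number within the file */
	int ref;		/* reference (pin) count */
	int dirty;		/* nonzero if the page must be written */
	int scheduled;		/* stand-in for the BH_WRITE flag */
	int written;		/* set here in place of real I/O */
};

/* Order by file, then by page number, so writes go out in page order. */
static int
buf_cmp(const void *a, const void *b)
{
	const struct buf *x = *(struct buf *const *)a;
	const struct buf *y = *(struct buf *const *)b;

	if (x->file_id != y->file_id)
		return (x->file_id < y->file_id ? -1 : 1);
	if (x->pgno != y->pgno)
		return (x->pgno < y->pgno ? -1 : 1);
	return (0);		/* also covers identical pointers */
}

/*
 * Pass 1: schedule dirty and pinned buffers, pinning the idle dirty ones
 * into list[]. Returns the number pinned; *retry is set if the list or
 * the 80% pin limit filled up and the checkpoint must be redone.
 */
static size_t
collect(struct buf *pool, size_t npool, struct buf **list, size_t cap,
    int *retry)
{
	size_t i, n = 0;
	size_t max_pin = npool * 8 / 10;	/* the 80% limit */

	assert(cap > 0);
	*retry = 0;
	for (i = 0; i < npool; i++) {
		struct buf *bh = &pool[i];

		if (!bh->dirty && bh->ref == 0) {
			bh->scheduled = 0;	/* clear a leftover flag */
			continue;
		}
		bh->scheduled = 1;
		if (bh->ref != 0)
			continue;	/* pinned: decided when returned */
		++bh->ref;		/* lock the buffer down */
		list[n++] = bh;
		if (n >= cap || n >= max_pin) {
			*retry = 1;
			break;
		}
	}
	return (n);
}

/* Pass 2: sort, write buffers we alone hold, release every pin we took. */
static size_t
write_sorted(struct buf **list, size_t n)
{
	size_t i, nwritten = 0;

	qsort(list, n, sizeof(list[0]), buf_cmp);
	for (i = 0; i < n; i++) {
		struct buf *bh = list[i];

		if (bh->ref == 1) {	/* nobody else took the buffer */
			bh->written = 1;
			bh->dirty = 0;
			bh->scheduled = 0;
			++nwritten;
		}
		assert(bh->ref >= 1);
		--bh->ref;
	}
	return (nwritten);
}
```

/*
 * In this sketch a caller would run collect() under the lock, drop the
 * lock, and only then call write_sorted(), so that no lock is held while
 * the (simulated) writes happen.
 */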
/*
 * MPOOL->lsn_cnt (the total sync count)
 * MPOOLFILE->lsn_cnt (the per-file sync count)
 * BH_WRITE flag (the scheduled-for-writing flag)
 */

/*
 * Mpool file sync function.
 */

/*
 * If this handle doesn't have a file descriptor that's open for
 * writing, or if the file is a temporary, there's no reason to
 */

/*
 * Return a file descriptor for DB 1.85 compatibility locking.
 *
 * PUBLIC: int __mp_xxx_fd __P((DB_MPOOLFILE *, int *));
 */

/*
 * This is a truly spectacular layering violation, intended ONLY to
 * support compatibility for the DB 1.85 DB->fd call.
 *
 * Sync the database file to disk, creating the file as necessary.
 *
 * We skip the MP_READONLY and MP_TEMP tests done by memp_fsync(3).
 * The MP_READONLY test isn't interesting because we will either
 * already have a file descriptor (we opened the database file for
 * reading) or we aren't readonly (we created the database, which
 * requires write privileges). The MP_TEMP test isn't interesting
 * because we want to write to the backing file regardless so that
 * we get a file descriptor to return.
 */

/*
 * Mpool file internal sync function.
 */

/*
 * We try to write the buffers in page order: it should reduce seeks
 * by the underlying filesystem and possibly reduce the actual number
 * of writes. We don't want to hold the region lock while we write
 * the buffers, so only hold the lock while we create a list. Get a
 * good-size block of memory to hold buffer pointers, we don't want
 */

/*
 * Walk the LRU list of buffer headers, and get a list of buffers to
 * write for this MPOOLFILE.
 */

/*
 * If we've run out of space to store buffer references, we're
 * screwed, as we don't want to realloc the array while holding a
 * region lock. Set the incomplete flag -- the only way we
 * can get here is if the file is active in the buffer cache,
 * which is the same thing as finding pinned buffers.
 */

/* Sort the buffers we're going to write. */

/* Walk the array, writing buffers. */

/*
 * It's possible for a thread to have gotten the buffer since
 * we listed it for writing. If the reference count is still
 * 1, we're the only ones using the buffer, so go ahead and write.
 * If it's >1, then skip the buffer.
 */

/* Write the buffer. */

/* Release the buffer. */

/* If there's an error, release the rest of the buffers. */

/*
 * If we didn't write the buffer for some reason, don't return
 */

/*
 * Sync the underlying file as the last thing we do, so that the OS
 * has maximal opportunity to flush buffers before we request it.
 *
 * Don't lock the region around the sync, fsync(2) has no atomicity
 */

/*
 * Keep a specified percentage of the buffers clean.
 */

/*
 * If there are sufficient clean buffers, or no buffers or no dirty
 * buffers, we're done.
 *
 * Using st_page_clean and st_page_dirty is our only choice at the
 * moment, but it's not as correct as we might like in the presence
 * of pools with more than one buffer size, as a free 512-byte buffer
 * isn't the same as a free 8K buffer.
 */

/* Loop until we write a buffer. */

/*
 * We can't write to temporary files -- see the comment in
 */

/*
 * Any process syncing the shared memory buffer pool had better
 * be able to write to any underlying file. Be understanding,
 * but firm, on this point.
 */

/* No more buffers to write. */

/* Sort by file (shared memory pool offset). */

/*
 * Defend against badly written quicksort code calling the comparison
 * function with two identical pointers (e.g., WATCOM C++ (Power++)).
 */
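
/*
 * Illustration only, not part of the original source: a minimal sketch of
 * the "sufficient clean buffers" test described in the comments on keeping
 * a percentage of the buffers clean. It assumes the two counters behave
 * like st_page_clean and st_page_dirty. need_trickle() is a hypothetical
 * name, and the 1..100 range check on pct is an assumption of this sketch.
 * Like the comment it illustrates, it counts buffers, not bytes, so a pool
 * with mixed buffer sizes is only approximated.
 */

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if more dirty buffers should be written to bring the share
 * of clean buffers up to pct percent; false if there are no buffers, no
 * dirty buffers, or enough clean buffers already.
 */
static bool
need_trickle(unsigned long clean, unsigned long dirty, int pct)
{
	unsigned long total = clean + dirty;

	assert(pct >= 1 && pct <= 100);	/* assumed valid range */

	if (total == 0 || dirty == 0)
		return (false);
	return ((clean * 100) / total < (unsigned long)pct);
}
```

/*
 * A caller following the comments above would presumably loop, writing one
 * dirty buffer at a time and re-evaluating the test, until it reports
 * false or there are no more buffers to write.
 */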