 * Copyright 2005 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.

#pragma ident "%Z%%M% %I% %E% SMI"

** The author disclaims copyright to this source code.  In place of
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
*************************************************************************
** This is the implementation of the page cache subsystem or "pager".
**
** The pager is used to access a database disk file.  It implements
** atomic commit and rollback through the use of a journal file that
** is separate from the database file.  The pager also implements file
** locking to prevent two processes from writing the same database
** file simultaneously, or one process from reading the database while
** another is writing.
**
** @(#) $Id: pager.c,v 1.101 2004/02/25 02:20:41 drh Exp $

#include "os.h"         /* Must be first to enable large file support */

** Macros for troubleshooting.  Normally turned off

** The page cache as a whole is always in one of the following
** states:
**
**   SQLITE_UNLOCK       The page cache is not currently reading or
**                       writing the database file.  There is no
**                       data held in memory.  This is the initial
**                       state.
**
**   SQLITE_READLOCK     The page cache is reading the database.
**                       Writing is not permitted.  There can be
**                       multiple readers accessing the same database
**                       file at the same time.
**
**   SQLITE_WRITELOCK    The page cache is writing the database.
**                       Access is exclusive.  No other processes or
**                       threads can be reading or writing while one
**                       process is writing.
**
** The page cache comes up in SQLITE_UNLOCK.  The first time a
** sqlite_page_get() occurs, the state transitions to SQLITE_READLOCK.
** After all pages have been released using sqlite_page_unref(),
** the state transitions back to SQLITE_UNLOCK.  The first time
** that sqlite_page_write() is called, the state transitions to
** SQLITE_WRITELOCK.  (Note that sqlite_page_write() can only be
** called on an outstanding page, which means that the pager must
** be in SQLITE_READLOCK before it transitions to SQLITE_WRITELOCK.)
** The sqlite_page_rollback() and sqlite_page_commit() functions
** transition the state from SQLITE_WRITELOCK back to SQLITE_READLOCK.

** Each in-memory image of a page begins with the following header.
** This header is only visible to this pager module.  The client
** code that calls the pager sees only the data that follows the header.
**
** Client code should call sqlitepager_write() on a page prior to making
** any modifications to that page.  The first time sqlitepager_write()
** is called, the original page contents are written into the rollback
** journal and PgHdr.inJournal and PgHdr.needSync are set.
** Later, once the journal page has made it onto the disk surface,
** PgHdr.needSync is cleared.  The modified page cannot be written back
** into the original database file until the journal pages have been
** synced to disk and PgHdr.needSync has been cleared.
**
** The PgHdr.dirty flag is set when sqlitepager_write() is called and
** is cleared again when the page content is written back to the original
** database file.
  int nRef;                      /* Number of users of this page */
  u8 inCkpt;                     /* TRUE if written to the checkpoint journal */
  u8 dirty;                      /* TRUE if we need to write back changes */
  /* SQLITE_PAGE_SIZE bytes of page data follow this header */
  /* Pager.nExtra bytes of local data follow the page data */

** A macro used for invoking the codec if there is one

** Convert a pointer to a PgHdr into a pointer to its data

** How big to make the hash table used for locating in-memory pages

** Hash a page number

** An open page cache is an instance of the following structure.
  char *zDirectory;           /* Directory holding the database and journal files */
  int dbSize;                 /* Number of pages in the file */
  int ckptSize;               /* Size of database (in pages) at ckpt_begin() */
  int nRec;                   /* Number of pages written to the journal */
  int ckptNRec;               /* Number of records in the checkpoint journal */
  int nExtra;                 /* Add this many bytes to each in-memory page */
  void (*xDestructor)(void*); /* Call this routine when freeing pages */
  int nPage;                  /* Total number of in-memory pages */
  int nRef;                   /* Number of in-memory pages with PgHdr.nRef>0 */
  int mxPage;                 /* Maximum number of pages to hold in cache */
  u8 state;                   /* SQLITE_UNLOCK, _READLOCK or _WRITELOCK */

** These are bits that can be set in Pager.errMask.

** The journal file contains page records in the following
** format.
**
** Actually, this structure is the complete page record for pager
** formats less than 3.  Beginning with format 3, this record is surrounded

** Journal files begin with the following magic string.  The data
** was obtained from /dev/random.  It is used only as a sanity check.
**
** There are three journal formats (so far).  The 1st journal format writes
** 32-bit integers in the byte-order of the host machine.  New
** formats write integers as big-endian.  All new journals use the
** new format, but we have to be able to read an older journal in order
** to rollback journals created by older versions of the library.
**
** The 3rd journal format (added for 2.8.0) adds additional sanity
** checking information to the journal.  If the power fails while the
** journal is being written, semi-random garbage data might appear in
** the journal file after power is restored.  If an attempt is then made
** to roll the journal back, the database could be corrupted.  The additional
** sanity checking data is an attempt to discover the garbage in the
** journal and ignore it.
**
** The sanity checking information for the 3rd journal format consists
** of a 32-bit checksum on each page of data.  The checksum covers both
** the page number and the SQLITE_PAGE_SIZE bytes of data for the page.
** This cksum is initialized to a 32-bit random value that appears in the
** journal file right after the header.  The random initializer is important,
** because garbage data that appears at the end of a journal is likely
** data that was once in other files that have now been deleted.  If the
** garbage data came from an obsolete journal file, the checksums might
** be correct.
** But by initializing the checksum to a random value which
** is different for every journal, we minimize that risk.
  0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, 0xd4,
  0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, 0xd5,
  0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, 0xd6,
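** Illustrative sketch, not part of the pager: one way the three 8-byte
** prefixes above could be used to recognize a journal's format.  The
** array names, the helper journal_format_from_prefix(), and the mapping
** of the prefixes to formats 1, 2 and 3 in the order listed are
** assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The three 8-byte prefixes listed above.  The array names are assumed. */
static const unsigned char aJournalMagic1[8] = {
  0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, 0xd4,
};
static const unsigned char aJournalMagic2[8] = {
  0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, 0xd5,
};
static const unsigned char aJournalMagic3[8] = {
  0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, 0xd6,
};

/*
** Hypothetical helper.  Return 1, 2 or 3 according to which prefix the
** first 8 bytes of the journal match.  Return 0 if fewer than 8 bytes
** are available or if no prefix matches.
*/
static int journal_format_from_prefix(const unsigned char *aPrefix, size_t n){
  if( aPrefix==0 || n<8 ) return 0;
  if( memcmp(aPrefix, aJournalMagic3, 8)==0 ) return 3;
  if( memcmp(aPrefix, aJournalMagic2, 8)==0 ) return 2;
  if( memcmp(aPrefix, aJournalMagic1, 8)==0 ) return 1;
  return 0;
}
```

** The prefixes differ only in their last byte, so a caller could treat
** a return value of 0 as "not a journal" and ignore the file.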
** The following integer determines what format to use when creating
** new primary journal files.  By default we always use format 3.
** When testing, we can set this value to older journal formats in order to
** make sure that newer versions of the library are able to rollback older
** journals.
**
** Note that checkpoint journals always use format 2 and omit the header.

** The size of the header and of each page in the journal varies according
** to which journal format is being used.  The following macros figure out
** the sizes based on format numbers.

** Enable reference count tracking here:
    "REFCNT: %4d addr=0x%08x nRef=%d\n",
    cnt++;   /* Something to set a breakpoint on */

** Read a 32-bit integer from the given file descriptor.  Store the integer
** that is read in *pRes.  Return SQLITE_OK if everything worked, or an
** error code if something goes wrong.
**
** If the journal format is 2 or 3, read a big-endian integer.  If the
** journal format is 1, read an integer in the native byte-order of the
** host machine.

** Write a 32-bit integer into the given file descriptor.  Return SQLITE_OK
** on success or an error code if something goes wrong.
**
** If the journal format is 2 or 3, write the integer as 4 big-endian
** bytes.  If the journal format is 1, write the integer in the native
** byte order.  In normal operation, only formats 2 and 3 are used.
** Journal format 1 is only used for testing.

** Write a 32-bit integer into a page header right before the
** page data.  This will overwrite the PgHdr.pDirty pointer.
**
** The integer is big-endian for formats 2 and 3 and native byte order
** for journal format 1.

** Convert the bits in the pPager->errMask into an appropriate
** return code.

** Add or remove a page from the list of all pages that are in the
** checkpoint journal.
**
** The Pager keeps a separate list of pages that are currently in
** the checkpoint journal.  This helps the sqlitepager_ckpt_commit()
** routine run MUCH faster for the common case where there are many
** pages in memory but only a few are in the checkpoint journal.

** Find a page in the hash table given its page number.  Return
** a pointer to the page or NULL if not found.

** Unlock the database and clear the in-memory cache.  This routine
** sets the state of the pager back to what it was when it was first
** opened.  Any outstanding pages are invalidated and subsequent attempts
** to access those pages will likely result in a coredump.

** When this routine is called, the pager has the journal file open and
** a write lock on the database.
** This routine releases the database write lock and acquires a read
** lock in its place.  The journal file is deleted and closed.
**
** TODO: Consider keeping the journal file open for temporary databases.
** This might give a performance improvement on windows where opening
** a file is an expensive operation.
    /* This can only happen if a process does a BEGIN, then forks and the
    ** child process does the COMMIT.  Because of the semantics of unix
    ** file locking, the unlock will fail.
    */

** Compute and return a checksum for the page of data.
**
** This is not a real checksum.  It is really just the sum of the
** random initial value and the page number.  We considered doing a checksum
** of the database, but that was found to be too slow.

** Read a single page from the journal file opened on file descriptor
** jfd.  Playback this one page.
**
** There are three different journal formats.  The format parameter determines
** which format is used by the journal that is played back.
  /* Sanity checking on the page.  This is more important than I originally
  ** thought.  If a power failure occurs while the journal is being written,
  ** it could cause invalid data to be written into the journal.  We need to
  ** detect this invalid data (with high probability) and ignore it.
  */
  /* Playback the page.  Update the in-memory copy of the page
  ** at the same time, if there is one.
  */
    /* No page should ever be rolled back that is in use, except for page
    ** 1 which is held in use in order to keep the lock on the database
    ** active.  However, such a page may be rolled back as a result of an
    ** internal error resulting in an automatic call to
    ** sqlitepager_rollback(), so we can't assert() it.
    */
    /* assert( pPg->nRef==0 || pPg->pgno==1 ) */

** Playback the journal and thus restore the database file to
** the state it was in before we started making changes.
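**
** Illustrative sketch, not part of the pager: the per-page sanity check
** described above, reduced to its arithmetic.  The checksum is the random
** initial value plus the page number, and formats 2 and 3 store integers
** big-endian.  The helper names and the rejection of page number 0 are
** assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Decode 4 big-endian bytes, the integer encoding of formats 2 and 3. */
static uint32_t get_be32(const unsigned char *a){
  return ((uint32_t)a[0]<<24) | ((uint32_t)a[1]<<16)
       | ((uint32_t)a[2]<<8)  |  (uint32_t)a[3];
}

/*
** The "checksum" described above: the random initial value plus the
** page number.  Unsigned arithmetic wraps modulo 2^32.
*/
static uint32_t page_cksum(uint32_t cksumInit, uint32_t pgno){
  return cksumInit + pgno;
}

/*
** Hypothetical check.  Return 1 if a format-3 page record looks valid:
** the page number is nonzero (an assumption of this sketch) and the
** stored big-endian checksum matches the expected value.  Return 0
** otherwise, in which case the record would be treated as garbage.
*/
static int page_record_ok(
  uint32_t pgno,                /* Page number read from the record */
  uint32_t cksumInit,           /* Initial checksum value from the header */
  const unsigned char *aCksum   /* 4 checksum bytes that follow the data */
){
  if( pgno==0 ) return 0;
  return get_be32(aCksum)==page_cksum(cksumInit, pgno);
}
```

** Because the initial value differs for every journal, a stale record
** left over from some other journal is unlikely to pass this check.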
**
** The journal file format is as follows:
**
**    *  8 byte prefix.  One of the aJournalMagic123 vectors defined
**       above.  The format of the journal file is determined by which
**       of the three prefix vectors is seen.
**    *  4 byte big-endian integer which is the number of valid page records
**       in the journal.  If this value is 0xffffffff, then compute the
**       number of page records from the journal size.  This field appears
**       in format 3 only.
**    *  4 byte big-endian integer which is the initial value for the
**       sanity checksum.  This field appears in format 3 only.
**    *  4 byte integer which is the number of pages to truncate the
**       database to during a rollback.
**    *  Zero or more page instances, each as follows:
**        +  4 byte page number.
**        +  SQLITE_PAGE_SIZE bytes of data.
**        +  4 byte checksum (format 3 only)
**
** When we speak of the journal header, we mean the first 4 bullets above.
** Each entry in the journal is an instance of the 5th bullet.  Note that
** bullets 2 and 3 only appear in format-3 journals.
**
** Call the value from the second bullet "nRec".  nRec is the number of
** valid page entries in the journal.  In most cases, you can compute the
** value of nRec from the size of the journal file.  But if a power
** failure occurred while the journal was being written, it could be the
** case that the size of the journal file had already been increased but
** the extra entries had not yet made it safely to disk.  In such a case,
** the value of nRec computed from the file size would be too large.  For
** that reason, we always use the nRec value in the header.
**
** If the nRec value is 0xffffffff it means that nRec should be computed
** from the file size.  This value is used when the user selects the
** no-sync option for the journal.  A power failure could lead to corruption
** in this case.
** But for things like temporary tables (which will be
** deleted when the power is restored) we don't care.
**
** Journal formats 1 and 2 do not have an nRec value in the header so we
** have to compute nRec from the file size.  This has risks (as described
** above) which is why all persistent tables have been changed to use
** format 3.
**
** If the file opened as the journal file is not a well-formed
** journal file then the database will likely already be
** corrupted, so the PAGER_ERR_CORRUPT bit is set in pPager->errMask
** and SQLITE_CORRUPT is returned.  If it all works, then this routine
** returns SQLITE_OK.
  int nRec;                /* Number of Records in the journal */
  int i;                   /* Loop counter */
  Pgno mxPg = 0;           /* Size of the original file in pages */
  int format;              /* Format of the journal file. */
  /* Figure out how many records are in the journal.  Abort early if
  ** the journal is empty.
  */
  /* If the journal file is too small to contain a complete header,
  ** it must mean that the process that created the journal was just
  ** beginning to write the journal file when it died.  In that case,
  ** the database file should have still been completely unchanged.
  ** Nothing needs to be rolled back.  We can safely ignore this journal.
  */
  /* Read the beginning of the journal and truncate the
  ** database file back to its original size.
  */
    /* Ignore the journal if it is too small to contain a complete
    ** header.  We already did this test once above, but at the prior
    ** test, we did not know the journal format and so we had to assume
    ** the smallest possible header.  Now we know the header is bigger
    ** than the minimum so we test again.
    */
  /* Copy original pages out of the journal and back into the database file.
  */
  /* Pages that have been written to the journal but never synced
  ** were not restored by the loop above.  We have to restore those
  ** pages by reading them back from the original database.
  */

** Playback the checkpoint journal.
**
** This is similar to playing back the transaction journal but with
** a few extra twists.
**
**    (1)  The number of pages in the database file at the start of
**         the checkpoint is stored in pPager->ckptSize, not in the
**         journal file itself.
**
**    (2)  In addition to playing back the checkpoint journal, also
**         playback all pages of the transaction journal beginning
**         at offset pPager->ckptJSize.
  int i;                   /* Loop counter */
  /* Truncate the database back to its original size.
  */
  /* Figure out how many records are in the checkpoint journal.
  */
  /* Copy original pages out of the checkpoint journal and back into the
  ** database file.  Note that the checkpoint journal always uses format
  ** 2 instead of format 3 since it does not need to be concerned with
  ** power failures corrupting the journal and can thus omit the checksums.
  */
  /* Figure out how many pages need to be copied out of the transaction
  ** journal.
  */

** Change the maximum number of in-memory pages that are allowed.
**
** The maximum number is the absolute value of the mxPage parameter.
** If mxPage is negative, the noSync flag is also set.  noSync bypasses
** calls to sqliteOsSync().  The pager runs much faster with noSync on,
** but if the operating system crashes or there is an abrupt power
** failure, the database file might be left in an inconsistent and
** unrepairable state.

** Adjust the robustness of the database to damage due to OS crashes
** or power failures by changing the number of sync()s when writing
** the rollback journal.  There are three levels:
**
**    OFF       sqliteOsSync() is never called.  This is the default
**              for temporary and transient files.
**
**    NORMAL    The journal is synced once before writes begin on the
**              database.  This is normally adequate protection, but
**              it is theoretically possible, though very unlikely,
**              that an inopportune power failure could leave the journal
**              in a state which would cause damage to the database
**              when it is rolled back.
**
**    FULL      The journal is synced twice before writes begin on the
**              database (with some additional information - the nRec field
**              of the journal header - being written in between the two
**              syncs).
**              If we assume that writing a
**              single disk sector is atomic, then this mode provides
**              assurance that the journal will not be corrupted to the
**              point of causing damage to the database during rollback.
**
** Numeric values associated with these states are OFF==1, NORMAL=2,

** Open a temporary file.  Write the name of the file into zName
** (zName must be at least SQLITE_TEMPNAME_SIZE bytes long.)  Write
** the file descriptor into *fd.  Return SQLITE_OK on success or some
** other error code if we fail.
**
** The OS will automatically delete the temporary file when it is
** closed.

** Create a new page cache and put a pointer to the page cache in *ppPager.
** The file to be cached need not exist.  The file is not locked until
** the first call to sqlitepager_get() and is only held open until the
** last page is released using sqlitepager_unref().
**
** If zFilename is NULL then a randomly-named temporary file is created
** and used as the file to be cached.  The file will be deleted
** automatically when it is closed.
  const char *zFilename,   /* Name of the database file to open */
  int mxPage,              /* Max number of in-memory cache pages */
  int nExtra,              /* Extra bytes appended to each in-memory page */

** Set the destructor for this pager.  If not NULL, the destructor is called
** when the reference count on each page reaches zero.  The destructor can
** be used to clean up information in the extra segment appended to each page.
**
** The destructor is not called as a result of sqlitepager_close().
** Destructors are only called by sqlitepager_unref().

** Return the total number of pages in the disk file associated with
** the pager.

** Forward declaration

** Truncate the file to the number of pages specified.

** Shutdown the page cache.  Free all memory and close all files.
**
** If a transaction was in progress when this routine is called, that
** transaction is rolled back.  All outstanding pages are invalidated
** and their memory is freed.  Any attempt to use a page associated
** with this page cache after this function returns will likely
** result in a coredump.
  /* Temp files are automatically deleted by the OS
  ** if( pPager->tempFile ){
  **   sqliteOsDelete(pPager->zFilename);
  ** }
  */

** Return the page number for the given page data.

** Increment the reference count for a page.  If the page is
** currently on the freelist (the reference count is zero) then
** remove it from the freelist.
    /* The page is currently on the freelist.  Remove it. */

** Increment the reference count for a page.  The input pointer is
** a reference to the page data.

** Sync the journal.  In other words, make sure all the pages that have
** been written to the journal have actually reached the surface of the
** disk.  It is not safe to modify the original database file until after
** the journal has been synced.  If the original database is modified before
** the journal is synced and a power failure occurs, the unsynced journal
** data would be lost and we would be unable to completely rollback the
** database changes.
** Database corruption would occur.
**
** This routine also updates the nRec field in the header of the journal.
** (See comments on the pager_playback() routine for additional information.)
** If the sync mode is FULL, two syncs will occur.  First the whole journal
** is synced, then the nRec field is updated, then a second sync occurs.
**
** For temporary databases, we do not care if we are able to rollback
** after a power failure, so no sync occurs.
**
** This routine clears the needSync field of every page currently held in
** memory.
  /* Sync the journal before modifying the main database
  ** (assuming there is a journal and it needs to be synced.)
  */
      /* assert( !pPager->noSync ); // noSync might be set if synchronous
      ** was turned off after the transaction was started.  Ticket #615 */
        /* Make sure the pPager->nRec counter we are keeping agrees
        ** with the nRec computed from the size of the journal file.
        */
        /* Write the nRec value into the journal file header */
    /* Erase the needSync flag from every page.
    */
  /* If the Pager.needSync flag is clear then the PgHdr.needSync
  ** flag must also be clear for all pages.  Verify that this
  ** invariant is true.
  */

** Given a list of pages (connected by the PgHdr.pDirty pointer) write
** every one of those pages out to the database file and mark them all
** as clean.

** Collect every dirty page into a dirty list and
** return a pointer to the head of that list.  All pages are
** collected even if they are still in use.

** A read lock on the disk file is obtained when the first page is acquired.
** This read lock is dropped when the last page is released.
**
** A _get works for any page number greater than 0.  If the database
** file is smaller than the requested page, then no actual disk
** read occurs and the memory image of the page is initialized to
** all zeros.
** The extra data appended to a page is always initialized
** to zeros the first time a page is loaded into memory.
**
** The acquisition might fail for several reasons.  In all cases,
** an appropriate error code is returned and *ppPage is set to NULL.
**
** See also sqlitepager_lookup().  Both this routine and _lookup() attempt
** to find a page in the in-memory cache first.  If the page is not already
** in memory, this routine goes to disk to read it in whereas _lookup()
** just returns 0.  This routine acquires a read-lock the first time it
** has to go to disk, and could also playback an old journal if necessary.
** Since _lookup() never goes to disk, it never has to deal with locks
  /* Make sure we have not hit any critical errors.
  */
  /* If this is the first page accessed, then get a read lock
  ** on the database file.
  */
    /* If a journal file exists, try to play it back.
    */
       /* Get a write lock on the database
       */
           /* This should never happen! */
       /* Open the journal for reading only.  Return SQLITE_BUSY if
       ** we are unable to open the journal file.
       **
       ** The journal file does not need to be locked itself.  The
       ** journal file is never open unless the main database file holds
       ** a write lock, so there is never any chance of two or more
       ** processes opening the journal at the same time.
       */
       /* Playback and delete the journal.  Drop the database write
       ** lock and reacquire the read lock.
       */
    /* Search for page in cache */
    /* The requested page is not in the page cache. */
      /* Create a new page */
      /* Find a page to recycle.  Try to locate a page that does not
      ** require us to do an fsync() on the journal.
      */
      /* If we could not find a page that does not require an fsync()
      ** on the journal file then fsync the journal file.  This is a
      ** very slow operation, so we work hard to avoid it.  But sometimes
      ** it can't be helped.
      */
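** Illustrative sketch, not part of the pager: the recycling policy that
** the two comments above describe.  Prefer a free page that can be reused
** without syncing the journal; only if none exists, fall back to the first
** free page and tell the caller to sync first.  The FreePage type and
** pick_recycle_candidate() are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a page on the free list. */
typedef struct FreePage FreePage;
struct FreePage {
  int needSync;          /* TRUE if the journal must be synced before reuse */
  FreePage *pNextFree;   /* Next page on the free list, or NULL */
};

/*
** Return the page to recycle, or NULL if the free list is empty.
** *pMustSync is set to 1 when the returned page can only be reused
** after the journal has been synced, and to 0 otherwise.
*/
static FreePage *pick_recycle_candidate(FreePage *pFirstFree, int *pMustSync){
  FreePage *p;
  assert( pMustSync!=0 );
  *pMustSync = 0;
  /* First choice: any free page that does not need a journal sync. */
  for(p=pFirstFree; p; p=p->pNextFree){
    if( !p->needSync ) return p;
  }
  /* No such page.  Recycling the head of the list requires a sync. */
  if( pFirstFree ) *pMustSync = 1;
  return pFirstFree;
}
```

** The linear scan trades a little CPU time for avoiding the sync, which
** the comment above calls a very slow operation.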
      /* Write the page to the database file if it is dirty.
      */
      /* If the page we are recycling is marked as alwaysRollback, then
      ** set the global alwaysRollback flag, thus disabling the
      ** sqlite_dont_rollback() optimization for the rest of this transaction.
      ** It is necessary to do this because the page marked alwaysRollback
      ** might be reloaded at a later time but at that point we won't remember
      ** that it was marked alwaysRollback.  This means that all pages must
      ** be marked as alwaysRollback from here on out.
      */
      /* Unlink the old page from the free list and the hash table
      */
    /* The requested page is in the page cache. */

** Acquire a page if it is already in the in-memory cache.  Do
** not read the page from disk.  Return a pointer to the page,
** or 0 if the page is not in cache.
**
** See also sqlitepager_get().  The difference between this routine
** and sqlitepager_get() is that _get() will go to the disk and read
** in the page if the page is not already in cache.  This routine
** returns NULL if the page is not in cache or if a disk I/O error
** has ever happened.
  /* if( pPager->nRef==0 ){

** If the number of references to the page drops to zero, then the
** page is added to the LRU list.  When all references to all pages
** are released, a rollback occurs and the lock on the database is
** released.
  /* Decrement the reference count for this page
  */
  /* When the number of references to a page reaches 0, call the
  ** destructor and add the page to the freelist.
  */
    /* When all pages reach the freelist, drop the read lock from
    ** the database file.
    */

** Create a journal file for pPager.  There should already be a write
** lock on the database file when this routine is called.
**
** Return SQLITE_OK if everything works.  Return an error code and release
** the write lock if anything goes wrong.

** Acquire a write-lock on the database.
** The lock is removed when any of the following happens:
**
**   *  sqlitepager_commit() is called.
**   *  sqlitepager_rollback() is called.
**   *  sqlitepager_close() is called.
**   *  sqlitepager_unref() is called on every outstanding page.
**
** The parameter to this routine is a pointer to any open page of the
** database file.  Nothing changes about the page - it is used merely
** to acquire a pointer to the Pager structure and as proof that there
** is already a read-lock on the database.
**
** A journal file is opened if this is not a temporary file.  For
** temporary files, the opening of the journal file is deferred until
** there is an actual need to write to the journal.
**
** If the database is already write-locked, this routine is a no-op.

** Mark a data page as writeable.  The page is written into the journal
** if it is not there already.  This routine must be called before making
** changes to a page.
**
** The first time this routine is called, the pager creates a new
** journal and acquires a write lock on the database.  If the write
** lock could not be acquired, this routine returns SQLITE_BUSY.  The
** calling routine must check for that return value and be careful not to
** change any page data until this routine returns SQLITE_OK.
**
** If the journal file could not be written because the disk is full,
** then this routine returns SQLITE_FULL and does an immediate rollback.
** All subsequent write attempts also return SQLITE_FULL until
** sqlitepager_commit() or sqlitepager_rollback() is called.
  /* Mark the page as dirty.  If the page has already been written
  ** to the journal then we can return right away.
  */
  /* If we get this far, it means that the page needs to be
  ** written to the transaction journal or the checkpoint journal
  **
  ** First check to see that the transaction journal exists and
  ** create it if it does not.
  */
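** Illustrative sketch, not part of the pager: the order of steps the
** comments above give for a write request, with all file I/O and the
** checkpoint journal left out.  SketchPager, SketchPage, the SK_*
** constants, the lockAvailable flag and sketch_write() are hypothetical
** names for this example.  Appending a page to the journal is modelled
** by incrementing nRec.

```c
#include <assert.h>

enum { SK_UNLOCK, SK_READLOCK, SK_WRITELOCK };   /* pager states */
enum { SK_OK, SK_BUSY };                         /* return codes */

typedef struct {
  int state;          /* One of SK_UNLOCK, SK_READLOCK, SK_WRITELOCK */
  int nRec;           /* Number of pages written to the journal */
  int lockAvailable;  /* Stand-in for "the OS grants the write lock" */
} SketchPager;

typedef struct {
  int inJournal;      /* TRUE if original content is in the journal */
  int needSync;       /* TRUE until the journal has been synced */
  int dirty;          /* TRUE if the page must be written back */
} SketchPage;

static int sketch_write(SketchPager *pPager, SketchPage *pPg){
  /* A page is outstanding, so the pager already holds a read lock. */
  assert( pPager->state!=SK_UNLOCK );

  /* Mark the page dirty; return early if it is already journalled. */
  pPg->dirty = 1;
  if( pPg->inJournal ) return SK_OK;

  /* First write of the transaction: upgrade to a write lock. */
  if( pPager->state==SK_READLOCK ){
    if( !pPager->lockAvailable ) return SK_BUSY;
    pPager->state = SK_WRITELOCK;
  }

  /* Record the original content in the journal before any change. */
  pPager->nRec++;
  pPg->inJournal = 1;
  pPg->needSync = 1;
  return SK_OK;
}
```

** A caller would modify the page only after SK_OK is returned, which
** matches the rule stated above for SQLITE_BUSY.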
  /* The transaction journal now exists and we have a write lock on the
  ** main database file.  Write the current page to the transaction
  ** journal if it is not there already.
  */
  /* If the checkpoint journal is open and the page is not in it,
  ** then write the current page to the checkpoint journal.  Note that
  ** the checkpoint journal always uses the simpler format 2 that lacks
  ** checksums.  The header is also omitted from the checkpoint journal.
  */
  /* Update the database size and return.
  */

** Return TRUE if the page given in the argument was previously passed
** to sqlitepager_write().  In other words, return TRUE if it is ok
** to change the content of the page.

** Replace the content of a single page with the information in the third
** argument.

** A call to this routine tells the pager that it is not necessary to
** write the information on page "pgno" back to the disk, even though
** that page might be marked as dirty.
**
** The overlying software layer calls this routine when all of the data
** on the given page is unused.  The pager marks the page as clean so
** that it does not get written to disk.
**
** Tests show that this optimization, together with the
** sqlitepager_dont_rollback() below, more than doubles the speed
** of large INSERT operations and quadruples the speed of large DELETEs.
**
** When this routine is called, set the alwaysRollback flag to true.
** Subsequent calls to sqlitepager_dont_rollback() for the same page
** will thereafter be ignored.  This is necessary to avoid a problem
** where a page with data is added to the freelist during one part of
** a transaction then removed from the freelist during a later part
** of the same transaction and reused for some other purpose.  When it
** is first added to the freelist, this routine is called.  When reused,
** the dont_rollback() routine is called.
** But because the page contains
** critical data, we still need to be sure it gets rolled back in spite
** of the dont_rollback() call.
      /* If this page is the last page in the file and the file has grown
      ** during the current transaction, then do NOT mark the page as clean.
      ** When the database file grows, we must make sure that the last page
      ** gets written at least once so that the disk file will be the correct
      ** size.  If you do not write this page and the size of the file
      ** on the disk ends up being too small, that can lead to database
      ** corruption during the next transaction.
      */

** A call to this routine tells the pager that if a rollback occurs,
** it is not necessary to restore the data on the given page.  This
** means that the pager does not have to record the given page in the
** rollback journal.

** Commit all changes to the database and release the write lock.
**
** If the commit fails for any reason, a rollback attempt is made
** and an error code is returned.  If the commit worked, SQLITE_OK
** is returned.
    /* Exit early (without doing the time-consuming sqliteOsSync() calls)
    ** if there have been no changes to the database file. */
  /* Jump here if anything goes wrong during the commit process.
  */

** Rollback all changes.  The database falls back to read-only mode.
** All in-memory cache pages revert to their original data contents.
** The journal is deleted.
**
** This routine cannot fail unless some other process is not following
** the correct locking protocol (SQLITE_PROTOCOL) or unless some other
** process is writing trash into the journal file (SQLITE_CORRUPT) or
** unless a prior malloc() failed (SQLITE_NOMEM).  Appropriate error
** codes are returned for all these occasions.  Otherwise,
** SQLITE_OK is returned.

** Return TRUE if the database file is opened read-only.  Return FALSE
** if the database is (in theory) writable.
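
** Illustrative sketch, not part of the pager: the "last page" exception
** to the don't-write optimization described earlier, written as a
** predicate.  dbSize is the current number of pages in the file.  The
** name origDbSize (the size when the transaction began) and the helper
** may_mark_clean() are assumptions made for this example.

```c
#include <assert.h>

/*
** Return 1 if page pgno may be marked clean (and so never written),
** or 0 if it must still be written.  The last page of a file that has
** grown during the current transaction must be written at least once
** so that the file on disk ends up the correct size.
*/
static int may_mark_clean(
  unsigned pgno,         /* Page under consideration (pages start at 1) */
  unsigned dbSize,       /* Current size of the database in pages */
  unsigned origDbSize    /* Size at the start of the transaction (assumed) */
){
  assert( pgno>0 );
  if( pgno==dbSize && dbSize>origDbSize ) return 0;
  return 1;
}
```

** For example, with a file that grew from 8 to 10 pages, page 10 must
** still be written while page 5 may be skipped.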

** This routine is used for testing and analysis only.

** Set the checkpoint.
**
** This routine should be called with the transaction journal already
** open.  A new checkpoint journal is created that can be used to rollback
** changes of a single SQL command within a larger transaction.

** Commit a checkpoint.
    /* sqliteOsTruncate(&pPager->cpfd, 0); */

** Rollback a checkpoint.

** Return the full pathname of the database file.

** Set the codec for this pager

** Print a listing of all referenced pages and their ref count.