maildir-sync.c revision 305465bb1a4c5d90c4b4e2c2790eb05fa4ebc41e
/* Copyright (c) 2004-2009 Dovecot authors, see the included COPYING file */

Here's a description of how we handle Maildir synchronization and
its problems:

We want to be as efficient as we can. The most efficient way to
check if changes have occurred is to stat() the new/ and cur/
directories and the uidlist file - if their mtimes haven't changed,
there are no changes and we don't need to do anything.
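
A minimal sketch of that mtime check, not the actual implementation.
The struct, the function names and the plain "uidlist" file name are
assumptions made up for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical record of the mtimes seen at the previous check. */
struct seen_mtimes {
	time_t new_mtime, cur_mtime, uidlist_mtime;
};

/* Stores the mtime of path in *mtime_r. Returns 0 on success, -1 on error. */
static int get_mtime(const char *path, time_t *mtime_r)
{
	struct stat st;

	assert(path != NULL && mtime_r != NULL);
	if (stat(path, &st) < 0)
		return -1;
	*mtime_r = st.st_mtime;
	return 0;
}

/* Returns 1 if any of the three mtimes differs from *seen, 0 if none
   does, -1 if stat() fails. On success *seen is updated. */
static int maildir_quick_check(const char *dir, struct seen_mtimes *seen)
{
	char path[4096];
	struct seen_mtimes now;
	int changed;

	snprintf(path, sizeof(path), "%s/new", dir);
	if (get_mtime(path, &now.new_mtime) < 0)
		return -1;
	snprintf(path, sizeof(path), "%s/cur", dir);
	if (get_mtime(path, &now.cur_mtime) < 0)
		return -1;
	snprintf(path, sizeof(path), "%s/uidlist", dir);
	if (get_mtime(path, &now.uidlist_mtime) < 0)
		return -1;

	changed = now.new_mtime != seen->new_mtime ||
		now.cur_mtime != seen->cur_mtime ||
		now.uidlist_mtime != seen->uidlist_mtime;
	*seen = now;
	return changed;
}
```

On its own this check has one-second granularity, which is exactly
what the problems listed next are about.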

Problem 1: Multiple changes can happen within a single second -
nothing guarantees that right after we synced, someone else didn't
make a modification within the same second. Such modifications
wouldn't get noticed until a new modification occurred later.

Problem 2: Syncing the cur/ directory is much more costly than syncing
new/. Moving mails from new/ to cur/ always changes the mtime of
cur/, causing us to sync it as well.

Problem 3: We may not be able to move mail from new/ to cur/
because we're out of quota, or simply because we're accessing a
read-only mailbox.

MAILDIR_SYNC_SECS
-----------------

Several checks below use MAILDIR_SYNC_SECS, which should be the maximum
clock drift between all computers accessing the maildir (e.g. via
NFS), rounded up to the next second. Our default is 1 second, since
everyone should be using NTP.

Note that setting it to 0 works only if there's only one computer
accessing the maildir. It's practically impossible to make two
clocks _exactly_ synchronized.

It might be possible to use only the file server's clock by looking at
the atime field, but I don't know how well that would actually work.

cur directory
-------------

We have a dirty_cur_time variable which is set to the cur/ directory's
mtime when it's >= time() - MAILDIR_SYNC_SECS and we _think_ we have
synchronized the directory.

When dirty_cur_time is non-zero, we don't synchronize the cur/
directory until one of these happens:

 a) cur/'s mtime changes
 b) opening a mail fails with ENOENT
 c) time() > dirty_cur_time + MAILDIR_SYNC_SECS

This allows us to modify the maildir multiple times without having
to sync it at every change. The sync will eventually be done to
make sure we didn't miss any external changes.

The dirty_cur_time is set when:

 - we change message flags
 - we expunge messages
 - we move mail from new/ to cur/
 - we sync the cur/ directory and its mtime is >= time() - MAILDIR_SYNC_SECS

It's unset when we do the final syncing, i.e. when the mtime is
older than time() - MAILDIR_SYNC_SECS.
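
The rules above can be sketched as a pair of small functions. This is
an illustration only: struct cur_state and both function names are
hypothetical, and the cases where we modify the maildir ourselves
(flag changes, expunges, moves) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define MAILDIR_SYNC_SECS 1

/* Hypothetical per-mailbox state. */
struct cur_state {
	time_t last_cur_mtime;	/* cur/ mtime seen at the last sync */
	time_t dirty_cur_time;	/* 0 = cur/ is not considered dirty */
};

/* Returns true if cur/ has to be synchronized now. */
static bool cur_needs_sync(const struct cur_state *st, time_t cur_mtime,
			   time_t now, bool open_failed_enoent)
{
	assert(st != NULL);
	if (cur_mtime != st->last_cur_mtime)
		return true;	/* a) cur/'s mtime changed */
	if (open_failed_enoent)
		return true;	/* b) opening a mail failed with ENOENT */
	if (st->dirty_cur_time != 0 &&
	    now > st->dirty_cur_time + MAILDIR_SYNC_SECS)
		return true;	/* c) the dirty period has passed */
	return false;
}

/* Called after cur/ has been synchronized. */
static void cur_sync_finished(struct cur_state *st, time_t cur_mtime,
			      time_t now)
{
	assert(st != NULL);
	st->last_cur_mtime = cur_mtime;
	if (cur_mtime >= now - MAILDIR_SYNC_SECS) {
		/* changes within the same second may have been missed */
		st->dirty_cur_time = cur_mtime;
	} else {
		/* final sync: the mtime is old enough to be trusted */
		st->dirty_cur_time = 0;
	}
}
```

With MAILDIR_SYNC_SECS = 1, a sync done at time 1000 of a directory
whose mtime is 1000 leaves it dirty; the next forced sync happens at
time 1002 or later, and that one clears dirty_cur_time.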

new directory
-------------

If new/'s mtime is >= time() - MAILDIR_SYNC_SECS, always synchronize
it. A dirty_cur_time-like feature might save us a few syncs, but
that might break a client which saves a mail in one connection and
tries to fetch it in another one. The new/ directory is almost always
empty, so syncing it should be very fast anyway. Actually, this
breakage can still happen if we sync only the new/ directory while
another client is also moving mails from it to cur/ - it takes us a
while to see them. That's pretty unlikely to happen, however, and the
only way to fix it would be to always synchronize cur/ after new/.

Normally we move all mails from new/ to cur/ whenever we sync it. If
that's not possible for some reason, we mark the mail with a "probably
exists in new/ directory" flag.

If rename() still fails because of ENOSPC or EDQUOT, we still save
the flag changes in the index with the dirty flag on. When moving the
mail to the cur/ directory, or when we notice it has already been moved
there, we apply the flag changes to the filename, rename it and remove
the dirty flag. If there are dirty flags, this should be retried every
time after an expunge or when closing the mailbox.
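
A rough sketch of that fallback, assuming a single bool stands in for
the dirty flag kept in the index. The enum and function names are
invented for this example:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

enum move_result { MOVE_OK, MOVE_DEFERRED, MOVE_FAILED };

/* Returns true for the errors that mean "no space or quota left". */
static bool is_out_of_space(int err)
{
	return err == ENOSPC || err == EDQUOT;
}

/* Tries to rename new_path to cur_path (cur_path already carries the
   wanted flags). On ENOSPC/EDQUOT the move is deferred and *dirty is
   set so that it can be retried later. */
static enum move_result
move_new_to_cur(const char *new_path, const char *cur_path, bool *dirty)
{
	assert(new_path != NULL && cur_path != NULL && dirty != NULL);
	if (rename(new_path, cur_path) == 0) {
		*dirty = false;
		return MOVE_OK;
	}
	if (is_out_of_space(errno)) {
		/* keep the flag change only in the index for now */
		*dirty = true;
		return MOVE_DEFERRED;
	}
	return MOVE_FAILED;
}
```

A caller would retry every MOVE_DEFERRED mail after an expunge or when
closing the mailbox, as described above.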

uidlist
-------

This file contains UID <-> filename mappings. It's updated only when
new mail arrives, so it may contain filenames that have already been
deleted. Updating is done by getting the uidlist.lock file, writing the
whole uidlist into it and rename()ing it over the old uidlist. This
means there's no need to lock the file for reading.

Whenever the uidlist is rewritten, its mtime must be larger than the old
one's. Use utime() before rename() if needed. Note that inode checking
wouldn't have been sufficient as inode numbers can be reused.
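
The rewrite procedure could look roughly like this. It is a sketch
under simplifying assumptions: the function name is made up, the lock
is a plain O_EXCL create with no stale-lock or NFS handling, and a
short write is treated as a failure.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>
#include <utime.h>

/* Replaces the file at path with data. Returns 0 on success, -1 on
   failure (including when <path>.lock already exists). */
static int uidlist_rewrite(const char *path, const char *data)
{
	char lock_path[4096];
	struct stat old_st, new_st;
	struct utimbuf ut;
	size_t size;
	int fd, n;

	assert(path != NULL && data != NULL);
	size = strlen(data);
	n = snprintf(lock_path, sizeof(lock_path), "%s.lock", path);
	if (n < 0 || (size_t)n >= sizeof(lock_path))
		return -1;

	/* O_EXCL: creation fails if another writer holds the lock */
	fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return -1;
	if (write(fd, data, size) != (ssize_t)size) {
		close(fd);
		goto fail;
	}
	if (close(fd) < 0)
		goto fail;

	/* make sure the new mtime is larger than the old one's */
	if (stat(path, &old_st) == 0) {
		if (stat(lock_path, &new_st) < 0)
			goto fail;
		if (new_st.st_mtime <= old_st.st_mtime) {
			ut.actime = new_st.st_atime;
			ut.modtime = old_st.st_mtime + 1;
			if (utime(lock_path, &ut) < 0)
				goto fail;
		}
	}
	/* rename() is atomic: readers see the old or the new file */
	if (rename(lock_path, path) < 0)
		goto fail;
	return 0;
fail:
	unlink(lock_path);
	return -1;
}
```

Two rewrites within the same second end up with mtimes one second
apart, so a reader comparing mtimes notices the second rewrite.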

This file is usually read the first time you need to know the filename
for a given UID. After that it's not re-read unless new mails arrive
that we don't know about.

broken clients
--------------

Originally the middle identifier in a Maildir filename was specified
only as <process id>_<delivery counter>. That however created a
problem with randomized PIDs, which made it possible for the same
PID to be reused within one second.

So if within one second a mail was delivered, an MUA moved it to cur/
and another mail was delivered by a new process using the same PID as
the first one, we likely ended up overwriting the first mail when
the second mail was moved over it.

Nowadays everyone should be using a more specific identifier,
for example by including microseconds in it, as Dovecot does.
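
As an illustration, a base name with microseconds in the middle
identifier could be generated like this. The exact
<secs>.M<usecs>P<pid>.<host> layout and the function name are
assumptions for this sketch, not something this file defines:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes a base name of the form <secs>.M<usecs>P<pid>.<hostname>
   into buf. Returns 0 on success, -1 if the clock can't be read or
   buf is too small. */
static int maildir_unique_name(char *buf, size_t size, const char *hostname)
{
	struct timeval tv;
	int n;

	assert(buf != NULL && hostname != NULL);
	if (gettimeofday(&tv, NULL) < 0)
		return -1;
	n = snprintf(buf, size, "%ld.M%ldP%ld.%s", (long)tv.tv_sec,
		     (long)tv.tv_usec, (long)getpid(), hostname);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Two deliveries by processes that happen to get the same PID now
collide only if they also land on the same microsecond.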

There's a simple way to prevent this from happening in some cases:
Don't move the mail from new/ to cur/ if its mtime is >= time() -
MAILDIR_SYNC_SECS. The second delivery's link() call then fails
because the file is already in new/, and it will then use a
different filename. There are a few problems with this, however:

 - it requires an extra stat() call, which is unneeded extra I/O
 - another MUA might still move the mail to cur/
 - if the first file's flags are modified by either Dovecot or another
   MUA, it's moved to cur/ (you _could_ just do the dirty-flagging
   but that'd be ugly)

Because this is useful only for very few people and it requires
extra I/O, I decided not to implement this. It should however be
quite easy to do, since we need to be able to deal with files in new/
in any case.

It's also possible to never accidentally overwrite a mail by using
link() + unlink() rather than rename(). This however isn't a very
good idea, as it introduces potential race conditions when multiple
clients are accessing the mailbox:

Trying to move the same mail from new/ to cur/ at the same time:

 a) Client 1 uses a slightly different filename than client 2,
    for example one sets the read-flag on but the other doesn't.
    You have the same mail duplicated now.

 b) Client 3 sees the mail between Client 1's and 2's link() calls
    and changes its flag. You have the same mail duplicated now.

And it gets worse when they're unlink()ing in the cur/ directory:

 c) Client 1 changes the mail's flag and client 2 changes it back
    between 1's link() and unlink(). The mail is now expunged.

 d) If you try to deal with the duplicates by unlink()ing one
    of them, you might end up unlinking both of them.
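
For reference, the two ways of moving a file compare roughly as
follows. Both helper names are made up for this sketch, which only
shows the overwrite behavior; the races listed above still apply to
the link() + unlink() variant.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* rename() atomically replaces dst if it already exists. */
static int move_overwriting(const char *src, const char *dst)
{
	assert(src != NULL && dst != NULL);
	return rename(src, dst);
}

/* link() fails with EEXIST instead of replacing dst. Returns 0 on
   success, -1 on error with errno set. */
static int move_no_overwrite(const char *src, const char *dst)
{
	assert(src != NULL && dst != NULL);
	if (link(src, dst) < 0)
		return -1;
	/* Both names exist until unlink() returns, so another client
	   can see, rename or unlink either of them in between. */
	return unlink(src);
}
```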

So, what should we do then if we notice a duplicate? First of all,
it might not be a duplicate at all; readdir() might have just
returned it twice because it was just renamed. What we should do is
create a completely new base name for it and rename() it to that.
If the call fails with ENOENT, it only means that it wasn't a
duplicate after all.
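
Sketched in code, assuming the caller has already generated the new
base name (the function name and return convention are hypothetical):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Renames a suspected duplicate to a freshly generated name.
   Returns 1 if it was renamed (a real duplicate), 0 if the old name
   no longer exists (not a duplicate after all), -1 on other errors. */
static int resolve_duplicate(const char *dup_path, const char *new_path)
{
	assert(dup_path != NULL && new_path != NULL);
	if (rename(dup_path, new_path) == 0)
		return 1;
	if (errno == ENOENT)
		return 0;
	return -1;
}
```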

/* When rename()ing many files from new/ to cur/, it's possible that the next
   readdir() skips some files. We of course don't wish to lose them, so we
   rescan the new/ directory from the beginning until no files are
   left. This value is just an optimization to avoid checking the directory
   twice needlessly. Usually only NFS is the problem case. 1 is the safest
   bet here, but I guess 5 will do just fine too. */

/* This is mostly to avoid infinite looping when rename() destination already
   exists as the hard link of the file itself. */
	struct maildir_uidlist_sync_ctx *uidlist_sync_ctx;
	struct maildir_index_sync_context *index_sync_ctx;
void maildir_sync_notify(struct maildir_sync_context *ctx)
/* we got here from maildir-save.c. it has no
static struct maildir_sync_context *
return ctx;
fname2);
const char *path;
unsigned int i = 0, move_count = 0;
#ifdef HAVE_DIRFD
if (new_dir) {
errno = 0;
flags = 0;
if (move_new) {
move_count++;
move_count++;
} else if (new_dir) {
if ((i % MAILDIR_SLOW_CHECK_COUNT) == 0)
if (ret <= 0) {
if (ret < 0)
T_BEGIN {
} T_END;
if (ret < 0)
if (errno != 0) {
if (dir_changed) {
const void *data;
if (data_size == 0) {
(undirty || \
if (!*new_changed_r) {
if (!*cur_changed_r) {
if (check_new)
if (check_cur)
return FALSE;
int ret;
if (forced)
if (ret <= 0)
return ret;
if (!cur_changed) {
sync_flags = 0;
if (forced)
if (ret <= 0) {
if (ret == 0) {
if (forced) {
if (ret <= 0) {
unsigned int count = 0;
if (ret < 0)
if (cur_changed) {
if (ret < 0)
if (ret == 0)
*find_uid = 0;
*find_uid = 0;
bool lost_files;
int ret;
T_BEGIN {
} T_END;
if (uid != 0) {
T_BEGIN {
&lost_files);
} T_END;
return ret;
struct mailbox_sync_context *
int ret = 0;
T_BEGIN {
&lost_files);
} T_END;
if (lost_files) {
int ret;
T_BEGIN {
} T_END;