maildir-sync.c revision 0e2686dfe29a18772fa4026bad53e2c7c560403f
/* Copyright (C) 2004 Timo Sirainen */

Here's a description of how we handle Maildir synchronization and
its problems:

We want to be as efficient as we can. The most efficient way to
check if changes have occurred is to stat() the new/ and cur/
directories and the uidlist file - if their mtimes haven't changed,
there are no changes and we don't need to do anything.
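
The mtime check described above could be sketched roughly as follows. This is only an illustration, not the code in this file: struct maildir_mtimes, both function names and the plain "uidlist" file name are assumptions made for the example.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical snapshot of the three mtimes we compare between syncs. */
struct maildir_mtimes {
	time_t new_mtime;
	time_t cur_mtime;
	time_t uidlist_mtime;
};

/* Fill *m using stat(). Returns 0 on success, -1 if new/ or cur/ can't be
   stat()ed. A missing uidlist is recorded as mtime 0. */
static int maildir_read_mtimes(const char *maildir, struct maildir_mtimes *m)
{
	char path[4096];
	struct stat st;

	snprintf(path, sizeof(path), "%s/new", maildir);
	if (stat(path, &st) < 0)
		return -1;
	m->new_mtime = st.st_mtime;

	snprintf(path, sizeof(path), "%s/cur", maildir);
	if (stat(path, &st) < 0)
		return -1;
	m->cur_mtime = st.st_mtime;

	snprintf(path, sizeof(path), "%s/uidlist", maildir);
	m->uidlist_mtime = stat(path, &st) < 0 ? 0 : st.st_mtime;
	return 0;
}

/* Returns non-zero if any of the three mtimes differs from the snapshot. */
static int maildir_mtimes_changed(const struct maildir_mtimes *old,
				  const struct maildir_mtimes *now)
{
	return old->new_mtime != now->new_mtime ||
		old->cur_mtime != now->cur_mtime ||
		old->uidlist_mtime != now->uidlist_mtime;
}
```

A caller would keep the snapshot from the previous sync and skip all directory scanning when maildir_mtimes_changed() returns 0. As the problems listed in this description show, an unchanged mtime is not a full guarantee on its own.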

Problem 1: Multiple changes can happen within a single second -
nothing guarantees that once we have synced, someone else didn't
make a modification right afterwards within the same second. Such
modifications wouldn't get noticed until a new modification occurred
later.

Problem 2: Syncing the cur/ directory is much more costly than
syncing new/. Moving mails from new/ to cur/ always changes the
mtime of cur/, causing us to sync it as well.

Problem 3: We may not be able to move mail from new/ to cur/
because we're out of quota, or simply because we're accessing a
read-only mailbox.

MAILDIR_SYNC_SECS
-----------------

Several checks below use MAILDIR_SYNC_SECS, which should be the
maximum clock drift between all computers accessing the maildir
(e.g. via NFS), rounded up to the next second. Our default is 1
second, since everyone should be using NTP.

Note that setting it to 0 works only if there's only one computer
accessing the maildir. It's practically impossible to make two
clocks _exactly_ synchronized.

It might be possible to use only the file server's clock by looking
at the atime field, but I don't know how well that would actually
work.

cur directory
-------------

We have a dirty_cur_time variable, which is set to the cur/
directory's mtime when it's >= time() - MAILDIR_SYNC_SECS and we
_think_ we have synchronized the directory.

When dirty_cur_time is non-zero, we don't synchronize the cur/
directory until
  a) cur/'s mtime changes
  b) opening a mail fails with ENOENT
  c) time() > dirty_cur_time + MAILDIR_SYNC_SECS

This allows us to modify the maildir multiple times without having
to sync it at every change. The sync will eventually be done to
make sure we didn't miss any external changes.

dirty_cur_time is set when:
  - we change message flags
  - we expunge messages
  - we move mail from new/ to cur/
  - we sync the cur/ directory and its mtime is >= time() -
    MAILDIR_SYNC_SECS

It's unset when we do the final syncing, i.e. when the mtime is
older than time() - MAILDIR_SYNC_SECS.
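
The dirty_cur_time rules above can be condensed into a small decision function. This is a minimal sketch under stated assumptions: struct cur_sync_state and both function names are hypothetical, the clock is passed in as a parameter so the logic can be followed by hand, and only the "set after syncing cur/" case is shown (flag changes, expunges and moves would set dirty_cur_time the same way).

```c
#include <time.h>

#define MAILDIR_SYNC_SECS 1

/* Hypothetical state kept between syncs of cur/. */
struct cur_sync_state {
	time_t last_cur_mtime;	/* cur/ mtime seen at the last sync */
	time_t dirty_cur_time;	/* 0 = cur/ is not considered dirty */
};

/* Returns 1 if cur/ should be synced now, 0 if the sync can be skipped. */
static int cur_needs_sync(const struct cur_sync_state *s, time_t cur_mtime,
			  time_t now, int open_failed_enoent)
{
	if (cur_mtime != s->last_cur_mtime)
		return 1;	/* a) cur/'s mtime changed */
	if (open_failed_enoent)
		return 1;	/* b) opening a mail failed with ENOENT */
	if (s->dirty_cur_time != 0 &&
	    now > s->dirty_cur_time + MAILDIR_SYNC_SECS)
		return 1;	/* c) time for the final sync */
	return 0;
}

/* Called after cur/ has been synced: the directory stays dirty while its
   mtime is still within MAILDIR_SYNC_SECS of the current time. */
static void cur_sync_finished(struct cur_sync_state *s, time_t cur_mtime,
			      time_t now)
{
	s->last_cur_mtime = cur_mtime;
	s->dirty_cur_time =
		cur_mtime >= now - MAILDIR_SYNC_SECS ? cur_mtime : 0;
}
```

For example, after syncing at time 100 with mtime 100 the state is dirty, a check at time 101 skips the sync, and a check at time 102 triggers the final sync that clears dirty_cur_time.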

new directory
-------------

If new/'s mtime is >= time() - MAILDIR_SYNC_SECS, always synchronize
it. A dirty_cur_time-like feature might save us a few syncs, but
that might break a client which saves a mail in one connection and
tries to fetch it in another one. The new/ directory is almost
always empty, so syncing it should be very fast anyway. Actually
such a client can still break if we sync only the new/ directory
while another client is also moving mails from it to cur/ - it takes
us a while to see them. That's pretty unlikely to happen however,
and the only way to fix it would be to always synchronize cur/ after
new/.
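
The rule for new/ is simpler than the one for cur/, and might look like this. The function name and the idea of passing in the previously seen mtime are assumptions made for this sketch.

```c
#include <time.h>

#define MAILDIR_SYNC_SECS 1

/* Returns 1 if new/ should be synced. Unlike cur/, a recent mtime always
   forces a sync, because new/ is cheap to scan. */
static int new_needs_sync(time_t last_new_mtime, time_t new_mtime, time_t now)
{
	if (new_mtime != last_new_mtime)
		return 1;	/* something changed since the last sync */
	return new_mtime >= now - MAILDIR_SYNC_SECS;
}
```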

Normally we move all mails from new/ to cur/ whenever we sync it. If
that's not possible for some reason, we mark the mail with a
"probably exists in new/ directory" flag.

If rename() still fails because of ENOSPC or EDQUOT, we still save
the flag changes in the index with the dirty flag on. When moving
the mail to the cur/ directory, or when we notice it has already
been moved there, we apply the flag changes to the filename, rename
it and remove the dirty flag. If there are dirty flags, this should
be tried every time after expunge or when closing the mailbox.

uidlist
-------

This file contains UID <-> filename mappings. It's updated only when
new mail arrives, so it may contain filenames that have already been
deleted. Updating is done by getting the uidlist.lock file, writing
the whole uidlist into it and rename()ing it over the old uidlist.
This means there's no need to lock the file for reading.
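
The lock-write-rename update can be sketched as below. The function name is hypothetical, and using O_EXCL creation of uidlist.lock as the whole locking scheme is a simplifying assumption: there is no waiting and no stale lock handling here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Replace <dir>/uidlist with the given contents. Returns 0 on success and
   -1 on error (errno is EEXIST if someone else holds the lock). */
static int uidlist_rewrite(const char *dir, const char *contents)
{
	char lock_path[4096], path[4096];
	size_t len = strlen(contents);
	ssize_t ret;
	int fd;

	snprintf(lock_path, sizeof(lock_path), "%s/uidlist.lock", dir);
	snprintf(path, sizeof(path), "%s/uidlist", dir);

	/* O_EXCL makes the creation fail if the lock file already exists */
	fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return -1;

	/* a short write is treated as an error to keep the sketch small */
	ret = write(fd, contents, len);
	if (ret < 0 || (size_t)ret != len || fsync(fd) < 0) {
		close(fd);
		unlink(lock_path);
		return -1;
	}
	/* rename() replaces the old file atomically, so readers see either
	   the old uidlist or the new one, never a partial file */
	if (close(fd) < 0 || rename(lock_path, path) < 0) {
		unlink(lock_path);
		return -1;
	}
	return 0;
}
```

Because the rename() also removes the lock file name, a successful update releases the lock as a side effect.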

Whenever uidlist is rewritten, its mtime must be larger than the old
one's. Use utime() before rename() if needed. Note that inode
checking wouldn't have been sufficient, as inode numbers can be
reused.
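
One way to guarantee the larger mtime is to bump it on the new file before the rename(). The helper below is a hypothetical sketch of that step, assuming the caller has already stat()ed the old uidlist and knows its mtime.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <time.h>
#include <utime.h>

/* Make sure new_path's mtime is strictly larger than old_mtime.
   Returns 0 on success, -1 on error. */
static int ensure_newer_mtime(const char *new_path, time_t old_mtime)
{
	struct stat st;
	struct utimbuf ut;

	if (stat(new_path, &st) < 0)
		return -1;
	if (st.st_mtime > old_mtime)
		return 0;	/* already newer, nothing to do */

	ut.actime = st.st_atime;	/* leave atime as it was */
	ut.modtime = old_mtime + 1;
	return utime(new_path, &ut);
}
```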

This file is usually read the first time you need to know the
filename for a given UID. After that it's not re-read unless new
mails arrive that we don't know about.

broken clients
--------------

Originally the middle identifier in a Maildir filename was specified
only as <process id>_<delivery counter>. That however created a
problem with randomized PIDs, which made it possible for the same
PID to be reused within one second.

So if within one second a mail was delivered, an MUA moved it to
cur/ and another mail was delivered by a new process using the same
PID as the first one, we likely ended up overwriting the first mail
when the second mail was moved over it.

Nowadays everyone should be using a more specific identifier, for
example by including microseconds in it, which Dovecot does.
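
To illustrate what a more specific identifier might look like, here is a sketch that puts microseconds next to the PID. The exact layout "<secs>.M<usecs>P<pid>.<host>" and the function name are assumptions chosen for the example, not necessarily the format used elsewhere in this code.

```c
#include <stdio.h>

/* Build a base name that includes microseconds, so two deliveries within
   the same second by processes sharing a PID still get different names.
   Returns 0 on success, -1 if buf is too small. */
static int maildir_basename(char *buf, size_t size, long secs, long usecs,
			    long pid, const char *host)
{
	int n = snprintf(buf, size, "%ld.M%06ldP%ld.%s",
			 secs, usecs, pid, host);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

In real use the arguments would come from gettimeofday(), getpid() and gethostname(). They are parameters here only to keep the example deterministic.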

There's a simple way to prevent this from happening in some cases:
Don't move the mail from new/ to cur/ if its mtime is >= time() -
MAILDIR_SYNC_SECS. The second delivery's link() call then fails
because the file is already in new/, and it will then use a
different filename. There are a few problems with this however:

  - it requires an extra stat() call, which is unneeded extra I/O
  - another MUA might still move the mail to cur/
  - if the first file's flags are modified by either Dovecot or
    another MUA, it's moved to cur/ (you _could_ just do the
    dirty-flagging but that'd be ugly)

Because this is useful only for very few people and it requires
extra I/O, I decided not to implement this. It should however be
quite easy to do, since we need to be able to deal with files in
new/ in any case.

It's also possible to never accidentally overwrite a mail by using
link() + unlink() rather than rename(). This however isn't a very
good idea, as it introduces potential race conditions when multiple
clients are accessing the mailbox:

Trying to move the same mail from new/ to cur/ at the same time:

  a) Client 1 uses a slightly different filename than client 2,
     for example one sets the read flag on but the other doesn't.
     You have the same mail duplicated now.

  b) Client 3 sees the mail between Client 1's and 2's link() calls
     and changes its flag. You have the same mail duplicated now.

And it gets worse when they're unlink()ing in the cur/ directory:

  c) Client 1 changes a mail's flag and client 2 changes it back
     between 1's link() and unlink(). The mail is now expunged.

  d) If you try to deal with the duplicates by unlink()ing another
     one of them, you might end up unlinking both of them.

So, what should we do then if we notice a duplicate? First of all,
it might not be a duplicate at all; readdir() might have just
returned it twice because it was just renamed. What we should do is
create a completely new base name for it and rename() it to that.
If the call fails with ENOENT, it only means that it wasn't a
duplicate after all.
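
The duplicate handling described above comes down to a single rename() and a check of errno. A minimal sketch, with a hypothetical function name and the new path assumed to be generated by the caller:

```c
#include <errno.h>
#include <stdio.h>

/* Give a suspected duplicate a completely new name.
   Returns 1 if it was a real duplicate and is now renamed,
   0 if the file no longer exists under that name (not a duplicate),
   -1 on any other error. */
static int maildir_fix_duplicate(const char *dup_path, const char *new_path)
{
	if (rename(dup_path, new_path) == 0)
		return 1;
	return errno == ENOENT ? 0 : -1;
}
```

The caller is expected to pick new_path so that it cannot collide with an existing file, since rename() would silently replace it.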
	struct maildir_uidlist_sync_ctx *uidlist_sync_ctx;
static int maildir_expunge(struct index_mailbox *ibox, const char *path,
static int maildir_sync_flags(struct index_mailbox *ibox, const char *path,
	struct maildir_index_sync_context *ctx = context;
	(void)maildir_filename_get_flags(path, &flags, keywords);
	mail_index_sync_flags_apply(&ctx->sync_rec, &flags8, keywords);
	newpath = maildir_filename_set_flags(path, flags8, keywords);
	if ((flags8 & MAIL_INDEX_MAIL_FLAG_DIRTY) != 0)
	mail_index_update_flags(ctx->trans, ctx->seq, MODIFY_ADD,
static int maildir_sync_record(struct index_mailbox *ibox,
	struct mail_index_sync_rec *sync_rec = &ctx->sync_rec;
	/* make it go through sequences to avoid looping through huge
	   holes in UID range */
	if (mail_index_lookup_uid_range(view, sync_rec->uid1,
	if (mail_index_lookup_uid(view, seq, &uid) < 0)
	if (maildir_file_do(ibox, uid, maildir_expunge,
	if (mail_index_lookup_uid_range(view, sync_rec->uid1,
	for (ctx->seq = seq1; ctx->seq <= seq2; ctx->seq++) {
	if (mail_index_lookup_uid(view, ctx->seq, &uid) < 0)
	/* flag isn't dirty anymore */
int maildir_sync_last_commit(struct index_mailbox *ibox)
	ret = mail_index_sync_begin(ibox->index, &ctx.sync_ctx, &ctx.view,
	ctx.trans = mail_index_transaction_begin(ctx.view, FALSE);
	while ((ret = mail_index_sync_next(ctx.sync_ctx,
	if (mail_index_transaction_commit(ctx.trans, &seq, &offset) < 0)
maildir_sync_context_new(struct index_mailbox *ibox)
	ctx->new_dir = t_strconcat(ibox->path, "/new", NULL);
	ctx->cur_dir = t_strconcat(ibox->path, "/cur", NULL);
static void maildir_sync_deinit(struct maildir_sync_context *ctx)
const char *old_fname)
int ret = 0;
t_push();
t_pop();
return ret;
const char *dir;
t_push();
if (ret == 0) {
if (new_dir)
if (ret < 0)
flags = 0;
if (move_new) {
} else if (new_dir) {
if (ret <= 0) {
if (ret < 0)
t_pop();
cur_mtime : 0;
const char *filename;
int ret;
seq = 0;
seq++;
if ((uflags &
MAILDIR_UIDLIST_REC_FLAG_NONSYNCED) != 0) {
if ((uflags &
MAILDIR_UIDLIST_REC_FLAG_RACING) != 0) {
filename);
seq--;
goto __again;
if ((uflags &
MAILDIR_UIDLIST_REC_FLAG_NONSYNCED) != 0) {
seq--;
seq--;
INDEX_KEYWORDS_BYTE_COUNT) != 0) {
if (!partial) {
if (uid_validity == 0) {
} else if (uid_validity == 0) {
if (ret < 0) {
else if (seq != 0) {
if (ret == 0) {
return ret;
if (!forced) {
return ret;
if (cur_changed) {
return ret;
int ret;
return ret;
struct mailbox_sync_context *
int ret = 0;