mail-cache-decisions.c revision 8fa86f7ef06aa6cf0239c7ca2eb98889691d40d4
/* Copyright (c) 2004-2011 Dovecot authors, see the included COPYING file */

/*
   Users can be divided into three groups:

   1. Most users use only a single IMAP client, which caches everything
      locally. For these users it's quite pointless to do any kind of caching,
      as it only wastes disk space. It might also mean more disk I/O.

   2. Some users use multiple IMAP clients, which cache everything locally.
      These could benefit from caching until all clients have fetched the
      data. After that it's useless.

   3. Some clients don't do permanent local caching at all, for example
      Pine and webmails. These clients would benefit from caching everything.
      Some locally caching clients might also access some data from the
      server again, such as when searching messages. They could benefit from
      caching only these fields.

   After thinking about these for a while, I figured out that the people who
   care about performance most will be using a Dovecot-optimized LDA anyway,
   which updates the indexes/cache immediately. In that case even the first
   user group would benefit from caching the same way as the second group.
   The LDA reads the mail anyway, so it might as well extract some
   information about it and store that in the cache.

   So groups 1 and 2 could be optimally implemented by keeping things
   cached only for a while. I thought a week would be good. When the cache
   file is compressed, everything older than a week will be dropped.

   But how do we figure out if a user is in group 3? One quite easy rule
   would be to see if the client is accessing messages older than a week.
   But with only that rule we might have already dropped useful cached data.
   It's not very nice if we have to read and cache it twice.

   Most locally caching clients always fetch new messages (all but the body)
   when they see them. They fetch them in ascending order. Noncaching
   clients might fetch messages in pretty much any order, as they usually
   don't fetch everything they can, only what's visible on screen. Some
   will use server-side sorting/threading, which also causes messages to be
   fetched in random order. The second rule would then be that if a session
   doesn't fetch messages in ascending order, the fetched field type will
   be permanently cached.

   So, we have three caching decisions:

   1. Don't cache: Clients have never wanted the field
   2. Cache temporarily: Clients want this only once
   3. Cache permanently: Clients want this more than once

   Different mailboxes have different decisions. Different fields have
   different decisions.

   There are some problems. For example, if a client accesses a message older
   than a week, we can't know whether the user just started using a new
   client that is filling its local cache for the first time. Or it might be
   a client the user just hasn't used for over a week. In these cases we
   shouldn't have marked the field to be permanently cached. The user might
   also switch clients from non-caching to caching.

   So we should re-evaluate our caching decisions from time to time. This
   is done by checking the above rules constantly and recording the last
   time the decision was right. If the decision hasn't matched for two
   months, it's changed. I picked two months because people go on vacations
   of at least one month where they might still be reading mail, but with
   different clients.
*/
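
/*
   A minimal sketch of the re-evaluation rule described above. It is not
   part of this file and not necessarily how the cache code implements it.
   All names here (struct reeval_field, last_confirmed, reeval_confirm(),
   reeval_check(), REEVAL_SECS) are hypothetical, and "two months" is
   assumed to mean 60 days. The idea: remember when the rules last agreed
   with a permanent decision, and fall back to temporary caching once that
   timestamp is too old.
*/

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for the three caching decisions. */
enum reeval_decision { REEVAL_NO, REEVAL_TEMP, REEVAL_YES };

/* Hypothetical per-field state; not the real cache field structure. */
struct reeval_field {
	enum reeval_decision decision;
	time_t last_confirmed; /* last time the rules matched the decision */
};

/* Assumption: "two months" is approximated as 60 days. */
#define REEVAL_SECS ((time_t)60 * 24 * 3600)

/* Record that the access rules agreed with the current decision at "now". */
static void reeval_confirm(struct reeval_field *f, time_t now)
{
	f->last_confirmed = now;
}

/* Drop a permanent decision back to temporary if it hasn't been confirmed
   for longer than REEVAL_SECS. Returns true if the decision was changed. */
static bool reeval_check(struct reeval_field *f, time_t now)
{
	if (f->decision != REEVAL_YES)
		return false;
	if (now - f->last_confirmed <= REEVAL_SECS)
		return false;
	f->decision = REEVAL_TEMP;
	return true;
}
```

/*
   In this sketch reeval_confirm() would be called whenever an access matches
   the "permanent" rules, and reeval_check() at some periodic point such as
   cache compression. Where the real check happens is outside this file.
*/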

#include "lib.h"
#include "ioloop.h"
#include "mail-cache-private.h"

void mail_cache_decision_state_update(struct mail_cache_view *view,
				      uint32_t seq, unsigned int field)
{
	struct mail_cache *cache = view->cache;
	const struct mail_index_header *hdr;
	uint32_t uid;

	i_assert(field < cache->fields_count);

	if (view->no_decision_updates)
		return;

	mail_index_lookup_uid(view->view, seq, &uid);
	hdr = mail_index_get_header(view->view);

	if (ioloop_time - cache->fields[field].last_used > 3600*24) {
		/* update last_used about once a day */
		cache->fields[field].last_used = (uint32_t)ioloop_time;
		if (cache->field_file_map[field] != (uint32_t)-1)
			cache->field_header_write_pending = TRUE;
	}

	if (cache->fields[field].field.decision != MAIL_CACHE_DECISION_TEMP) {
		/* a) forced decision
		   b) not cached, mail_cache_decision_add() will handle this
		   c) permanently cached already, okay. */
		return;
	}

	/* see if we want to change decision from TEMP to YES */
	if (uid < cache->fields[field].uid_highwater ||
	    uid < hdr->day_first_uid[7]) {
		/* a) nonordered access within this session. if the client
		      doesn't request messages in growing order, we assume it
		      doesn't have a permanent local cache.
		   b) accessing a message older than one week. assume it's a
		      client with no local cache. if it was just a new client
		      generating the local cache for the first time, we'll
		      drop back to TEMP within a few months. */
		cache->fields[field].field.decision = MAIL_CACHE_DECISION_YES;
		cache->fields[field].decision_dirty = TRUE;

		if (cache->field_file_map[field] != (uint32_t)-1)
			cache->field_header_write_pending = TRUE;
	} else {
		cache->fields[field].uid_highwater = uid;
	}
}

void mail_cache_decision_add(struct mail_cache_view *view, uint32_t seq,
			     unsigned int field)
{
	struct mail_cache *cache = view->cache;
	uint32_t uid;

	i_assert(field < cache->fields_count);

	if (MAIL_CACHE_IS_UNUSABLE(cache) || view->no_decision_updates)
		return;

	if (cache->fields[field].field.decision != MAIL_CACHE_DECISION_NO) {
		/* a) forced decision
		   b) we're already caching it, so it just wasn't in cache */
		return;
	}

	/* field used the first time */
	cache->fields[field].field.decision = MAIL_CACHE_DECISION_TEMP;
	cache->fields[field].decision_dirty = TRUE;
	cache->field_header_write_pending = TRUE;

	mail_index_lookup_uid(view->view, seq, &uid);
	cache->fields[field].uid_highwater = uid;
}
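
/*
   A standalone sketch of the NO -> TEMP -> YES transitions made by the two
   functions above, with the index and cache structures stripped away. The
   names (struct sketch_field, sketch_add(), sketch_update()) are
   hypothetical. The week_first_uid parameter stands in for
   hdr->day_first_uid[7], on the assumption that UIDs below it belong to
   messages older than one week. Forced decisions, last_used tracking and
   header writes are left out.
*/

```c
#include <stdint.h>

/* Hypothetical stand-ins for MAIL_CACHE_DECISION_NO/TEMP/YES. */
enum sketch_decision { SKETCH_NO, SKETCH_TEMP, SKETCH_YES };

/* Hypothetical per-field state, reduced to what the rules need. */
struct sketch_field {
	enum sketch_decision decision;
	uint32_t uid_highwater; /* highest UID accessed so far */
};

/* First time an uncached field is wanted: start caching it temporarily
   and remember the UID as the starting high-water mark. */
static void sketch_add(struct sketch_field *f, uint32_t uid)
{
	if (f->decision != SKETCH_NO)
		return;
	f->decision = SKETCH_TEMP;
	f->uid_highwater = uid;
}

/* Later accesses: promote TEMP to YES if the access goes backwards in UID
   order or touches a message older than a week. Otherwise just advance
   the high-water mark. */
static void sketch_update(struct sketch_field *f, uint32_t uid,
			  uint32_t week_first_uid)
{
	if (f->decision != SKETCH_TEMP)
		return;
	if (uid < f->uid_highwater || uid < week_first_uid)
		f->decision = SKETCH_YES;
	else
		f->uid_highwater = uid;
}
```

/*
   With week_first_uid = 3, a session that calls sketch_add() for UID 10 and
   then sketch_update() for UIDs 12 and 11 ends in SKETCH_YES, because 11 is
   below the high-water mark of 12. A session that only ever moves forward
   stays in SKETCH_TEMP.
*/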