
With imap_tools (or imaplib), how do I sync imap changes instead of polling by repeatedly fetching the entire imap database?

Since there are several similar sounding questions around I want to be very precise.

Edit: Let's focus specifically on reacting dynamically to any email message being moved from one folder to another.

A typical imap client app fetches only changes in the imap database since last sync. If your email client had to fetch every email each time you run it, that would take a long time.

Unfortunately my imap_tools app has to fetch (headers only) the entire imap database every time I run it. In order to detect changes dynamically, I would have to poll the entire set of messages repeatedly. Obviously, this is not a reasonable design.

Does imap_tools (or the underlying imaplib) provide a mechanism for syncing?

Using the "seen" flag is not the answer. That flag indicates whether a human has read the message, and it is shared by every client rather than being specific to mine.

Relying on the UID is not quite enough either, because I want to detect whether the user has deleted a message or moved it from one folder to another.

You can:

  • Use search arguments to limit the data set: date_gte, date_lt, new...
  • Rely on the Message-ID from the headers, if you store something locally
  • Use mailbox.move to reliably "mark" a message instead of relying on flags
  • Calculate a hash of the message

It all depends on your task.
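
As a rough illustration of the Message-ID and hash suggestions above, here is a minimal sketch using only the standard library. The function name `message_fingerprint` and the choice of fallback headers are assumptions made for this example; they are not part of imap_tools or imaplib.

```python
import hashlib
from email import message_from_bytes

# Headers hashed when no Message-ID is present (an illustrative choice).
FALLBACK_HEADERS = ("Date", "From", "To", "Subject")


def message_fingerprint(raw_headers: bytes) -> str:
    """Return a stable hex identifier computed from a raw header block."""
    msg = message_from_bytes(raw_headers)
    message_id = msg.get("Message-ID")
    if message_id:
        # Prefer Message-ID: it normally stays the same when a message moves.
        basis = "id:" + str(message_id).strip()
    else:
        # Otherwise hash a handful of headers that rarely change.
        parts = [str(msg.get(name, "")).strip() for name in FALLBACK_HEADERS]
        basis = "hdr:" + "\n".join(parts)
    return hashlib.sha256(basis.encode("utf-8", "replace")).hexdigest()
```

Storing this value next to each cached message gives you something to compare across folders and across runs, with the caveat that Message-ID is supplied by the sender and is not guaranteed to be unique.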

As far as I know, there is no "sync" in IMAP. There is IDLE, but imap_tools cannot do it.

IMAP, at its core, is an old and not terribly efficient protocol, because its design was not focused on syncing. Kundrát calls it a Cache Filing Protocol: the server is the single source of truth, and it is the client's job to display this to the user, and usually to cache as much of it as possible.

In Baseline IMAP, this generally means connecting to the server, and interrogating and caching as much information as the client cares to show. Number of messages, headers, flags, possibly bodies, maybe attachments.

It also assumes the client has a mostly stable network connection while it is in use, which was true of most desktop-mode clients. Once you have all your data synced, the server can send you unsolicited responses: EXISTS when a new message comes in, FETCH (carrying the new FLAGS) when flags are updated, and EXPUNGE when a message is deleted. A server will not normally send these except in response to a permitted user command. Older clients often used NOOP, or perhaps CHECK, for this.
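
To make these unsolicited responses concrete, here is a small sketch that classifies raw untagged response lines. In the base protocol, flag updates arrive as untagged FETCH responses carrying a FLAGS item. The names `MailboxEvent` and `parse_untagged` are invented for this example, and the parser deliberately handles only these three response types.

```python
import re
from typing import NamedTuple, Optional, Tuple

_UNTAGGED = re.compile(rb"^\* (\d+) (EXISTS|EXPUNGE|FETCH)\b(.*)$", re.IGNORECASE)
_FLAGS = re.compile(rb"FLAGS \(([^)]*)\)", re.IGNORECASE)


class MailboxEvent(NamedTuple):
    kind: str                          # "EXISTS", "EXPUNGE" or "FETCH"
    number: int                        # message count (EXISTS) or sequence number
    flags: Optional[Tuple[str, ...]]   # only set for FETCH responses with FLAGS


def parse_untagged(line: bytes) -> Optional[MailboxEvent]:
    """Classify one untagged response line; return None for anything else."""
    match = _UNTAGGED.match(line.strip())
    if match is None:
        return None
    number = int(match.group(1))
    kind = match.group(2).decode("ascii").upper()
    flags = None
    if kind == "FETCH":
        flag_match = _FLAGS.search(match.group(3))
        if flag_match:
            flags = tuple(flag_match.group(1).decode("ascii", "replace").split())
    return MailboxEvent(kind, number, flags)
```

Note that EXPUNGE and FETCH report message sequence numbers, not UIDs, so a client has to map them back to its cached UIDs itself.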

If you lose your connection, clients will reconnect and refresh their cache. Since the only mutable things about messages are their existence and their flags, this is usually fairly quick: the client will usually request the flags for all messages. From there it can quickly update its cache: apply the flags, fetch headers for any new UIDs it discovered, and remove the cached versions of UIDs it didn't receive.
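
The refresh step described above can be sketched as a plain dictionary comparison. This assumes you have already turned the server's flag listing into a `{uid: flags}` mapping and that the folder's UIDVALIDITY has not changed; `CacheDiff` and `diff_folder` are names made up for this illustration.

```python
from typing import Dict, FrozenSet, NamedTuple, Set

Flags = FrozenSet[str]


class CacheDiff(NamedTuple):
    new_uids: Set[int]               # on the server, not yet cached
    removed_uids: Set[int]           # cached, but gone from the server
    changed_flags: Dict[int, Flags]  # present on both sides, flags differ


def diff_folder(cached: Dict[int, Flags], server: Dict[int, Flags]) -> CacheDiff:
    """Compare a cached {uid: flags} map against the server's current one."""
    new_uids = set(server.keys() - cached.keys())
    removed_uids = set(cached.keys() - server.keys())
    changed_flags = {
        uid: server[uid]
        for uid in server.keys() & cached.keys()
        if server[uid] != cached[uid]
    }
    return CacheDiff(new_uids, removed_uids, changed_flags)
```

Headers then only need to be fetched for `new_uids`, which is what keeps a resync cheap compared with refetching the whole folder.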

This does start to break down when a folder has many tens of thousands of messages. At that point you will find that clients start to have very slow startup and syncing speeds on some servers, and start to use rather a lot of data.

IMAP as a protocol cannot track messages across folders. The state of each folder is completely separate. If a message is moved, that is equivalent to a removal from one folder and an add to another. Desktop clients often maintain a pool of connections to watch more than one folder at a time. You could apply heuristics to your cached messages to try to detect folder moves (e.g., comparing a selection of headers and metadata), but it can't be perfect.
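
One possible shape for such a heuristic is sketched below. It assumes each cached message carries some fingerprint string (for example its Message-ID) and pairs a fingerprint that vanished from exactly one folder with the same fingerprint appearing in exactly one other folder. `detect_moves` and the `FolderState` layout are hypothetical.

```python
from typing import Dict, List, Tuple

# folder name -> {uid: fingerprint}; the fingerprint is any stable string.
FolderState = Dict[str, Dict[int, str]]


def detect_moves(before: FolderState, after: FolderState) -> List[Tuple[str, str, str]]:
    """Guess moves as (fingerprint, source_folder, destination_folder)."""
    removed: Dict[str, List[str]] = {}  # fingerprint -> folders it vanished from
    added: Dict[str, List[str]] = {}    # fingerprint -> folders it appeared in
    for folder in set(before) | set(after):
        old = set(before.get(folder, {}).values())
        new = set(after.get(folder, {}).values())
        for fingerprint in old - new:
            removed.setdefault(fingerprint, []).append(folder)
        for fingerprint in new - old:
            added.setdefault(fingerprint, []).append(folder)
    moves = []
    for fingerprint, sources in removed.items():
        targets = added.get(fingerprint, [])
        # Only report unambiguous pairs; anything else stays a delete or an add.
        if len(sources) == 1 and len(targets) == 1:
            moves.append((fingerprint, sources[0], targets[0]))
    return sorted(moves)
```

Duplicate fingerprints, copies, and messages that are deleted while an identical one arrives elsewhere will all confuse this, which is why it can only ever be a guess.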


As you can see, a lot of this is terribly inefficient once your mailbox grows past a few hundred messages, so there are a lot of extensions that make caching more efficient.

UIDPLUS (RFC 4315) is almost everywhere. It requires the server to support UIDs in more commands, and it is almost required for any cache-mode client, as message sequence numbers are unreliable when deletions are involved.

IDLE (RFC 2177) is fairly common, but not everywhere. The client can issue an IDLE command, and this tells the server it's ready for those unsolicited updates at any time. This means the client doesn't have to poll every few minutes with the NOOP command.

CONDSTORE (RFC 4551) is on most unix-type servers, and some commercial servers. It, among other things, associates a serial number with flag changes. This allows the flag resync step to fetch only the changes made since the most recent serial number the client knows about. It does not, however, help with detecting deleted messages, and a UID SEARCH ALL would still be necessary to find those after a disconnection.
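
A sketch of what the flag-only resync looks like on the wire, assuming the server advertises CONDSTORE. The first helper pulls the serial number (HIGHESTMODSEQ) out of the responses to SELECT, and the second builds the arguments for a UID FETCH limited to newer changes. Both function names are invented for this example.

```python
import re
from typing import Iterable, Optional, Tuple

_HIGHESTMODSEQ = re.compile(rb"\[HIGHESTMODSEQ (\d+)\]", re.IGNORECASE)


def highest_modseq(select_responses: Iterable[bytes]) -> Optional[int]:
    """Find the HIGHESTMODSEQ response code among the SELECT responses."""
    for line in select_responses:
        match = _HIGHESTMODSEQ.search(line)
        if match:
            return int(match.group(1))
    return None  # e.g. the server answered with [NOMODSEQ]


def changed_since_args(last_modseq: int) -> Tuple[str, str, str]:
    """Arguments for: UID FETCH 1:* (FLAGS) (CHANGEDSINCE <last_modseq>)."""
    return ("FETCH", "1:*", f"(FLAGS) (CHANGEDSINCE {last_modseq})")
```

With imaplib these arguments should be usable as `conn.uid(*changed_since_args(n))`, where `conn` is an already selected `imaplib.IMAP4_SSL` connection, though I have not verified that against every server.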

QRESYNC (RFC 5162) provides resynchronization data for deleted messages. Unfortunately this is quite a rare extension, and it is almost nonexistent on large commercial servers.

NOTIFY (RFC 5465) is almost nowhere. It's supposed to be like a super-IDLE that can monitor multiple mailboxes at the same time.

The Gmail extensions are of course Gmail specific. Among other things, they associate a permanent identifier with each message (X-GM-MSGID), which DOES allow a message to be reliably tracked across folders. They also provide the "All Mail" folder and labels, which means you could sync the whole account by just syncing the All Mail folder. As with other servers, this does start to get bandwidth inefficient when you hit tens of thousands of messages.
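
As a sketch of how X-GM-MSGID could be collected, the helper below reads raw response lines from a `UID FETCH ... (X-GM-MSGID)` and maps each UID to its Gmail message id. The function name `gmail_uid_to_msgid` is made up here, and the parser assumes each response line contains both a UID item and an X-GM-MSGID item, in either order.

```python
import re
from typing import Dict, Iterable

_UID = re.compile(rb"\bUID (\d+)", re.IGNORECASE)
_GM_MSGID = re.compile(rb"\bX-GM-MSGID (\d+)", re.IGNORECASE)


def gmail_uid_to_msgid(fetch_lines: Iterable[bytes]) -> Dict[int, int]:
    """Map folder-local UIDs to account-wide X-GM-MSGID values."""
    mapping: Dict[int, int] = {}
    for line in fetch_lines:
        uid = _UID.search(line)
        msgid = _GM_MSGID.search(line)
        if uid and msgid:  # skip lines that are not the FETCH data we want
            mapping[int(uid.group(1))] = int(msgid.group(1))
    return mapping
```

Because the id stays the same in every folder (label), finding the same X-GM-MSGID in two folders' mappings identifies the same message without any header heuristics.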


From my experience of participating in the development of several mobile email clients which emphasized bandwidth efficiency and responsiveness, a client can appear very responsive even while dealing with all the problems of IMAP. IDLE can be used to try to keep the INBOX in sync. If you can't do that, you can hide a lot of jank by only keeping the most recent week's messages in total sync, and sync the rest less frequently (UID SEARCH SINCE is helpful here). The user is usually only looking at the end of their inbox, and generally only cares about new messages coming in.
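
For the "most recent week" idea, the search criterion can be built like this. The helper names are assumptions for this sketch; the month names are spelled out by hand because IMAP dates use English abbreviations regardless of the local locale.

```python
import datetime

_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(day: datetime.date) -> str:
    """Format a date the way IMAP SEARCH expects, e.g. 05-Mar-2024."""
    return f"{day.day:02d}-{_MONTHS[day.month - 1]}-{day.year:04d}"


def recent_search_criteria(today: datetime.date, days: int = 7) -> str:
    """Build a SEARCH criterion covering roughly the last `days` days."""
    since = today - datetime.timedelta(days=days)
    return f"(SINCE {imap_date(since)})"
```

With imaplib, something like `conn.uid("SEARCH", recent_search_criteria(datetime.date.today()))` should return the UIDs to keep in full sync. SINCE compares against the server's internal date and ignores the time of day, so the window is approximate.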

And in general, when a client mirrors the move of a message, the move was actually just detected as a delete and an add. Internet connections and servers are simply so fast that something taking a couple hundred milliseconds can look instant to a user. If any optimization is occurring, it's heuristic. I think Thunderbird has a protocol log you can turn on. If you're really curious what it's doing, turn it on, move a message, and see what it does.

