Dovecot

Welcome Back Courier-IMAP

When Matt was maintaining this server, starting back in 2001, he installed Courier-IMAP for our mail service (both IMAP and POP). It worked extremely well for many years. At one point, IMAP folders started taking a long time to open. Once they were open, performance was excellent. I think the cause was a limit on simultaneous connections from the same IP address that was set too low.

I’ve been maintaining this machine for a while, but I didn’t bother to do anything with Courier-IMAP even though it had started annoying me. Over 18 months ago, I switched to a new server. I decided to build it from scratch, correcting some legacy problems along the way. One of them was the above-mentioned IMAP hangs (not always, but annoying nonetheless).

After some research, I chose Dovecot as the new POP and IMAP server. I was impressed with how easy it was to install and configure. While there are tons of options to choose from, all of them are controlled in one dovecot.conf file, with extremely clear descriptions of what each choice entails. Once installed, it worked perfectly, and I was very pleased with my decision.

This joy lasted for more than a year! Then, at some point (possibly after my server was physically moved from one data center to another), a few times a day (typically 1-3 times), Dovecot would think that the system clock had moved backwards in time (it never does, and no other program ever complains about that). Dovecot sees this as a very bad event (understandably) and exits automatically.

After noticing this a few times, I installed a monitoring program called Monit. I am running a 5.0 beta version, but Monit has been flawless in every respect, even in beta. Since I installed Monit, every time Dovecot quit, Monit restarted it within 30 seconds and emailed me that it had done so. That’s how I know it’s a daily event, sometimes multiple times a day.

I’ve lived with this nonsense for way too long. Each time, I assumed that the next version of Dovecot would magically fix my problem, even though it started on a version that had worked perfectly for months before the problem started! As much as I like everything else about Dovecot, I finally gave up.

This weekend, I installed Courier-IMAP (a much newer version than the one we used to run). I made sure to allow more concurrent connections from the same IP address (both Lois and I always come across as coming from the same IP address). I had a few hassles with the configuration (even though there are way fewer things to choose than with Dovecot). After about an hour of messing around (probably all my own fault, over-thinking some of the choices), I got it working.

There was only one side-effect to the change. Under Dovecot, my IMAP folders were shown in the mail client as follows:

INBOX:

---> XXX

---> YYY

---> ZZZ

After switching to Courier-IMAP, the structure in the mail clients was:

INBOX:

---> INBOX:

-------> XXX

-------> YYY

-------> ZZZ

We can all live with it, and no mail was lost. It’s a minor nuisance.

I tested on a separate port, and when POP and IMAP were working, I turned Dovecot off and restarted Courier-IMAP with the correct default ports.

I then wrote an email to all of my users (all four of them). 😉

However, when I went to send the email, the send failed with a SASL authentication error. Ugh. I have saslauthd on the system (it wasn’t running, because Dovecot was performing that service as well). I started it up, but even though I played around a bit, I couldn’t get Postfix to authenticate correctly through it.

After looking at the top of the dovecot.conf file, I saw that by changing one line (which protocols Dovecot should handle) to “none”, it would run in SASL authentication mode only. That worked. So, now I still run Dovecot for authentication (since I didn’t have to change anything), and Courier-IMAP for mail fetching.

So far, the system has been running for a little over 24 hours, with no exits on the part of Courier-IMAP or the Dovecot auth daemon. I also haven’t had any hangs on opening an IMAP folder. It’s still very early, but the Dovecot IMAP server would have died at least once by now (guaranteed), so it’s already a win.

Here’s hoping that this will be a permanent change…

Update: First, so far so good. No exits in 4+ days! More important, I just stumbled across a post that gave me the answer to my nested mailbox problem above. Apparently, Dovecot repeats the .INBOX in front of each sub-folder in the Maildir folder. In other words, .INBOX.SPAM is the SPAM folder, directly under the main INBOX in Dovecot. Courier-IMAP expects it to just be .SPAM in the top-level Maildir folder, in order to be considered a direct sub-folder of the main INBOX.

I moved the folder names, unsubscribed the old one and subscribed to the new ones, and my folder hierarchy is now sane again. Whew. 😉
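For anyone facing the same cleanup, here is a minimal Python sketch of the rename step, assuming the layout described above (folders stored as .INBOX.NAME that need to become .NAME). The function names and the dry-run flag are my own inventions for illustration, not part of Dovecot or Courier-IMAP, and the sketch does not touch subscriptions, which still have to be fixed from the mail client.

```python
from pathlib import Path


def plan_renames(folder_names, prefix=".INBOX."):
    """Map names like .INBOX.SPAM to top-level names like .SPAM.

    Names without the prefix (cur, new, tmp, .Sent, ...) are left alone.
    Raises ValueError if a target name is already taken.
    """
    existing = set(folder_names)
    plan = {}
    for name in sorted(folder_names):
        if not name.startswith(prefix) or name == prefix:
            continue
        target = "." + name[len(prefix):]
        if target in existing or target in plan.values():
            raise ValueError(f"{target} already exists; resolve by hand")
        plan[name] = target
    return plan


def apply_renames(maildir, dry_run=True):
    """Print the planned renames under maildir; perform them if dry_run is False."""
    root = Path(maildir)
    names = [p.name for p in root.iterdir() if p.is_dir()]
    plan = plan_renames(names)
    for old, new in plan.items():
        print(f"{old} -> {new}")
        if not dry_run:
            (root / old).rename(root / new)
    return plan
```

Running apply_renames on a copy of the Maildir with dry_run left at True only prints the plan, which seems like the safe first step. It is probably wise to stop the mail server before doing the real renames.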

Update: It turns out that the fault lies not with Dovecot, but with some bad code in the Linux kernel for certain hardware configurations (obviously, mine included) that causes the system clock to jump a few times a day. There is a long thread about it on the Dovecot mailing list, which points to a very long thread on the Linux kernel maintainers list. So, while the problem definitely affected me, it’s not the fault of Dovecot, which correctly notices the jump and decides that it’s safer to exit than to guess. Perhaps a future kernel update (I just applied one today) will solve the problem. I don’t feel like hand-patching my kernel, or Dovecot…
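To make the idea of "noticing the jump" concrete, here is a rough Python sketch of how a program can spot the wall clock moving backwards: it compares each clock reading with the previous one. This is only an illustration of the general technique, not Dovecot's actual code, and the function names and the tolerance parameter are assumptions of mine.

```python
import time


def find_backward_jumps(timestamps, tolerance=0.0):
    """Return (index, seconds_back) for each reading earlier than the one before it.

    Backward steps no larger than `tolerance` seconds are ignored.
    """
    jumps = []
    for i in range(1, len(timestamps)):
        delta = timestamps[i] - timestamps[i - 1]
        if delta < -tolerance:
            jumps.append((i, -delta))
    return jumps


def watch_clock(samples=5, interval=1.0):
    """Sample the wall clock a few times and report any backward steps."""
    readings = []
    for _ in range(samples):
        readings.append(time.time())
        time.sleep(interval)
    return find_backward_jumps(readings)
```

On a healthy machine, watch_clock() should return an empty list. On a box with the kernel problem described above, a long enough run would presumably catch one of the jumps.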

New Machine

On April 23rd I announced the christening of my new server. At the time, I put the percentage of services that had been ported over at 95. It’s been at least 5 days since I’ve been at 100%, so the new machine is definitely “official”. Everything has been updated to point to the new machine, and all but one thing are running as expected.

The only problem I have is with one VoIP provider. I can’t get any audio to work between us, and the problem is definitely on my end, which is the main reason for not naming the provider. I can still connect reliably to them from my old server, from a different server that I control, and from a softphone as well, so something is broken on my new server in the config for them. That said, all other providers work, including identically configured ones, so it’s not a firewall problem, nor generically a broken Asterisk install. I’m not happy with this, because I can’t think of anything more to test. I’ve written twice to the Asterisk mailing lists, with no useful suggestions left to try. 🙁

I could probably write for hours on the experience of building the new machine. Very few people would maintain interest in that, I’m sure. I also don’t need it for a cathartic release, because I took very copious notes on the whole thing in a Google Notebook.

So, I’ll try to boil the essence down here, with the hope of not losing your interest too quickly. 🙂

The purpose of the change was to upgrade the OS from Red Hat 9 to CentOS 5.0. That worked well. I actually installed CentOS 5.0 Beta first, and then did an upgrade through yum, which worked fine!

My first real disappointment was attempting to build OpenPKG on the new box. The concept sounded really cool to me. The biggest reason for moving from RH9 to CentOS5 was that newer RPMs were harder and harder to find for RH9. OpenPKG held out the promise that one wouldn’t have to worry about this in the future, with the added benefit that you would never accidentally step on the operating system’s packages.

Unfortunately, I ended up wasting a ton of time on it, and it eventually failed to install itself, claiming that gcc couldn’t create executables on the system. Of course it could, as I built quite a number of packages from source… So, great concept, just not right for me at this time…

Had a minor glitch with SELinux (first time I’ve been on a system that was running it). Had to temporarily disable some of the checks to get a package installed and running, but was able to turn it back on afterwards, and haven’t had a problem since.

I have been a very happy user of Courier-IMAP for years, and felt guilty about even considering an alternative (just a loyalty thing). But, I’d read a number of nice things about Dovecot, and it had just gone official with 1.0 a few days earlier, so I decided to try it. I’m really happy with it. It worked correctly the first time, and configuration was as straightforward as I was led to believe. On the other hand, it wasn’t a quick config, because there are so many things that you can (and sometimes should) set. The single config file (which I like!) is huge because it’s so well documented, and that documentation makes the choices relatively simple. You just have to read all those darn docs… 😉

Also installed the latest Postfix 2.4.0. I’ve been really happy with Postfix for years, and had little intention of switching that.

One minor nit about Linux in general. It’s a little annoying that dependencies can get out of whack quite easily. Some system thing depends on openssl-0.9.7 (for example), and you know that 0.9.8e fixes some bugs, and perhaps some new software you’re installing wants that. So, now it needs to go in its own directory (’cause you can’t mess with the system one), and then every package has to be told where to find the new one, etc. It all works, but it’s still a PITA.

Installed the latest WordPress (which of course meant MySQL and PHP, etc.). This time, the email config problem that I had on the old machine just disappeared (hooray!). I didn’t config it any differently, so who knows what was wrong before…

Installed the latest Zope (2.10.3, not Zope 3), and had remarkably few problems slurping up my old Data.fs file from a Zope 2.6.x installation. Very cool.

Switched from one webmail client to another, even though I had been happy with the former for years. The latter does more, though I’m sure I won’t use the additional functionality anyway. It works, so that’s all I care about. I rarely use webmail, but when it’s necessary, it’s also ultra convenient (and, as stated, necessary). 😉

One of the bigger odysseys was the installation of a Jabber server. This should probably be its own post, but if it was, I would never condense it, so I’ll do my best not to go on too much here. On the old machine, I was running jabberd-1.4.3 for years. Jabberd2 was just out at the time that I first installed 1.4.3 (they are not the same project). I was able to get jabberd2 to work at the time, but I could not get the AIM and ICQ transports to work, so I reverted to 1.4.3.

The jabberd14 project is still alive and kicking, and I could have saved a lot of headaches if I had stuck with it. But, for a while, I had wanted to try ejabberd. It has been the official server of jabber.org since February 2007, which seemed impressive to me. 😉

Ejabberd is written in Erlang, and is supposed to scale like crazy (not that I have the slightest need for scale). The concept intrigued me. I’ll spare you all of the insane problems I had getting it to work right. Suffice it to say that it was not my fault, which is rare in these situations. 😉

When I finally got it to work stably, I installed the Python-based AIM and ICQ transports (PyICQ-t and PyAIM-t). The AIM transport worked correctly, and the ICQ one was flaky (solution later on).

Then Rob Page asked me to take a look at Openfire (previously called Wildfire). It sounded cool, and since I was having a problem with the ICQ transport, I figured I’d give it a shot. Man, it installed so easily from RPM, didn’t touch a single file on the system, could be uninstalled trivially, etc. In summary, I liked it instantly. I wasn’t crazy about running a JVM on the system full time, but the load would be negligible, so I decided to switch to it. Of course, while it worked well, and the administration was wonderful, the ICQ plugin was experimental (the AIM one is production), and it behaved like an experimental plugin, which left me no better off than the PyICQ-t transport had. There were a few other small annoyances in Openfire as well.

That made me decide to go back and beat my head on the ejabberd server and transports. Long story short, after investigating my setup on the old machine (prompted by Z_God in the Python Transports conference room), I noticed that I didn’t understand how transports speak to the main server. I had them both speaking on the same port (which the sample config file showed!), but on the working server, each transport spoke to the server on its own port! I switched ICQ and AIM to speak to ejabberd on separate ports, and voila, it has been rock solid ever since. I have retired Openfire, and am a very happy ejabberd and python-transports customer! 🙂
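The mistake is easy to check for mechanically. Here is a tiny Python sketch that flags transports configured to share a port. The port numbers in the usage note are made up for illustration, and the helper is my own, not anything shipped with ejabberd or the Python transports.

```python
from collections import defaultdict


def shared_ports(transport_ports):
    """Return {port: [transport names]} for every port used by more than one transport."""
    by_port = defaultdict(list)
    for name, port in transport_ports.items():
        by_port[port].append(name)
    return {port: sorted(names) for port, names in by_port.items() if len(names) > 1}
```

With hypothetical ports, shared_ports({"pyicq-t": 5347, "pyaim-t": 5347}) reports both transports under 5347, while giving each its own port yields an empty result.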

That’s pretty much it (at least at a high level). I’m happy with the machine. As usual, more twists and turns than one hopes for, but also more learning experiences than I expected, and interesting ones at that, mostly ending in success. Now if I can only figure out that one SIP provider audio problem, I could get back to some serious poker playing. 😉