Computers

mp3tunes is a joke

mp3tunes.com sounded like a good idea when I first heard of it. I found out about it a while ago, in a roundabout manner. I discovered a music streaming application that could be widgetized on a blog, and they were promoting mp3tunes as a place to store your MP3s so that you could stream them from there.

So, I signed up for the free account. It came with one GB of storage for free. If you wanted more, you paid. Fair enough. I was only interested in free. I wasn’t sure I would even use the service. A few days later, I received an email from them that my account had been automatically upgraded to an unlimited storage account, still completely free.

Cool! There were still plenty of hooks to upsell you, mostly in terms of currently restricted features that would be unlocked if you paid, but I had no interest.

They have an installable application called LockerSync. It was in beta when I first signed up. It was awful in terms of performance, but if you kept restarting it after it died, and had tons of patience, eventually, it worked. The application keeps your local music files in sync with your music on their site, which was the primary use case I was interested in: an emergency backup for my MP3 files.

The software was so horrible that even though I was on a blisteringly fast connection (including 5Mbps upstream), it took nearly a week to sync 18GB of files. Like I said above, eventually, it synced.

A few months later, the LockerSync software came out of beta, and it got more stable. It still wasn’t really fast, but I presume that they were throttling it on their side, to avoid large bandwidth bills. Again, fair enough.

I never accessed my files on their site, and never ended up streaming anything (other than testing on rare occasion that indeed the files were uploaded correctly).

After a few more months, I started getting errors on certain files. The software alerted me that the files were too large and that I needed to upgrade to a paid account to complete the sync. The amazing thing is that every one of these files had been synced correctly months earlier. So, somehow, files were disappearing on the site.

An additional annoyance is that the LockerSync software autostarts with my login, and it takes forever to initialize, making the boot process pound the disk and leaving everything else to load sluggishly. I took the program out of the startup sequence, because I don’t update my library daily anyway. Now I was just launching the program whenever I added music. Still, I was getting these error messages on the large files.

Today, I launched the program to sync some new music. I received an error message claiming that my locker was full, and that I needed to upgrade to a paid account in order to continue. The supposedly full locker had all of 2GB in it. So, somewhere along the way, they tossed the 18-20GB of music that I had previously synced. Wow, what an opportunity: pay the people who operate a service like that, hope that I could resync everything, and hope that they wouldn’t change the terms yet again.

Like I would trust them not to triple the price in a month, after tossing my files, just because that’s what suited them that day.

Folks, don’t start services, tell people that certain parts are free, and then change your mind. OK, you can do that, but wouldn’t it have been nice for them to send me an email telling me that I had 30 days, or whatever, before they were going to toss my files?

I’m glad they never saw a dime of my money, and for sure, any service that they ever offer in the future won’t have me as a customer either.

Vista SP1

If you stop by here regularly, then you know I’ve railed about Windows Vista at least once. Aside from hearing tons of complaints from friends and strangers, I had the displeasure of using my previously mentioned computer ministry to set up a new Dell computer for my cul-de-sac neighbor’s mom last year.

At that time, Dell was only offering Vista on some models of their laptops, something they later backed off of. The machine, a Dell 1501, was value-priced, so I endorsed the purchase (Vista included) in advance. I only say this to make the point that I wasn’t against Vista before I experienced it directly!

Anyway, I had never seen a more unstable machine from the minute it was turned on. Since then, I haven’t been asked to work on it (thankfully), mostly because the mom really doesn’t use the machine much, and she’s not really a complainer by nature either.

Last week I wrote about working on one of their machines which ended up in a full reinstall of Windows XP. As a result, my friend asked me if I could take a look at her mom’s machine to make sure all was well. I agreed.

Yesterday afternoon her mom dropped off the laptop. When I booted it, it was every bit as unstable, and downright unpleasant to work on, as I remembered. Every few minutes the desktop would restart itself (that’s explorer.exe, the shell, crashing and reloading). Aside from that, everything took forever or hung awaiting some hidden dialog box requiring a click.

I struggled to apply Windows updates until the only one left was Vista SP1. It claimed to be a 66MB download, but even though I’m on a blisteringly fast Verizon FiOS connection (wired, no less), at least five separate attempts to install it failed, with the package never fully downloading.

I turned to my XP laptop and downloaded the full 484MB standalone Vista SP1 file directly to a flash drive. It came down so quickly I couldn’t believe it. I put the flash drive in the Vista laptop and ran the patch. It took a while (as they predicted), and rebooted a number of times (as they predicted), but when it was done, I had my first experience with a stable Vista machine!

It wasn’t half bad, and I’m man enough to admit it. I was then able to clean off some other bloatware, install some useful utilities, all without a hitch. The machine is running quite nicely, and I will be returning it to her later today.

I had heard similar things from other people. One friend of mine told me that Vista was crashing on her laptop a minimum of once a day. Since she installed SP1, it hasn’t crashed even once, and that’s been over three months now.

Anyway, I’ll bash when it’s deserved, and give credit when that’s deserved as well. It now seems reasonable to consider Vista if you are going with a Windows-based machine anyway.

If/when I get closer to buying a new laptop (could happen in a day, or in six months), I’ll share my thoughts on that, as I am heavily leaning towards running Vista 64 on any new machine I buy.

Reinstalling Windows XP

I spend a fair amount of my time helping many people with their personal computers (mostly laptops, but desktops as well). A friend of ours noted that I had my own Computer Ministry a while ago. Lois liked the term and printed T-Shirts labeled Pastor Pedhazur, with the names of many in my flock. 🙂

Among the many people I help are our cul-de-sac neighbors (I support the mom, dad and teenage daughter) with whom we’ve been good friends for 18+ years.

The Dad’s current laptop is a Dell E1405. It’s a nice small-to-midsize laptop running XP. When he bought it, it came with 1GB of RAM (two 512MB sticks) and an 80GB disk (with only 55GB of usable space, we’ll get to why in a bit).

It ran well for a while, but started giving him a bunch of trouble when his teenage daughter used it a few times (possibly before she got her own Dell for Christmas, or possibly because it was simply more convenient one day for whatever reason). Being a teenager, she visited sites she shouldn’t have, installed stuff (wittingly or otherwise), and the machine was infected with a number of viruses and spyware programs.

I used to fix their machines in their house, and chat with them while I was working away. There were many days (mostly on weekends) where I’d be there for upwards of five hours. I didn’t mind, because (like I said above), we’re good friends. But, I tend to play poker mostly on weekends now, so when they have a problem, I tend to pick up their machines (or they drop them off) and I work on them in my house.

So, a month ago I picked up his machine and spent the day cleaning off all of the viruses. That went reasonably smoothly. I also deleted the daughter’s account (with their permission), now that she definitely has her own full-time laptop (yes, I previously had to clean hers as well, but she’s kept it pretty clean since then).

Even though the cleansing went well, I noticed the dreaded BSOD (Blue Screen of Death) a number of times. In fact, I could provoke it at will (if waiting a full hour each time counts as at will). It felt like a bad memory chip, since it failed in the same way every time, but I couldn’t be sure.

I have a dozen or so utility CDs I can boot from, with various tools on them to help me with my ministry. I pulled out one that had a variety of memory tests, booted from it, and let it run for a few hours. No joy: I couldn’t provoke any failures. Then I pulled one of the RAM chips out and reran the one-hour program that had been provoking the BSOD, and indeed, it still failed.

I then swapped the memory chips, and reran the program, and it ran to completion. I also moved that chip to the secondary slot, and it continued to work, so it appeared that neither slot was bad either! Aha, had to be a bad RAM chip, right? Perhaps, we’ll find out in a minute or two… 😉

As an aside, Dell puts the second slot under the keyboard, and the machinations you have to go through to get to that slot are nerve-wracking and unwieldy. Of course, now that I’ve swapped chips for him numerous times, I’m a daredevil at doing it and can swap the chips in my sleep… 🙂

He could have run on 512MB, but that’s not a lot of headroom for Windows XP. Instead, the next day, we ordered two 1GB modules. The following weekend I took the machine again, and put in the memory modules. I reran the long program, and it failed at the one hour mark, again… 🙁

I pulled one chip, reran the program, and it worked, so it appeared that perhaps one of the slots was bad (or flaky), but at least he was back to 1GB of memory, not a total loss. I returned the machine to him.

A week later, I called to ask how it was running. Sheepishly, he admitted that it was crashing a ton, but that it came back each time (in fact, it auto-rebooted), and he was limping along, able to check his email, etc., but not really able to use the machine for extended periods. Clearly, he felt bad about imposing, but I felt bad that he hadn’t!

We were leaving for VA that day, so I told him I’d check it out when we returned, which was a few days ago.

My plan was to reinstall Windows XP from scratch to be sure that it wasn’t a deep software corruption on his system. If a fresh install still showed regular crashes, then he had to have some kind of hardware problem.

I picked up the machine on Friday morning. I have a number of ways that I back machines up (mine and others) depending on the situation. For ultra-safety and complete backups, I tend to use a commercial program called EZ Gig II (it’s a custom Linux distro with some sort of packaging of a dd-like program, though not exactly that). The program came with an external disk drive kit that I bought a while ago, and I’ve been very satisfied with it. There are many free programs that accomplish the same thing.
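If you ever want the same kind of whole-disk blob without the commercial packaging, plain dd from almost any Linux boot CD accomplishes it. A minimal sketch (the device name and mount point are placeholders, not what EZ Gig actually runs):

dd if=/dev/sda of=/mnt/backup/laptop-full.img bs=1M conv=noerror,sync

The resulting image can only be restored as a whole, which leads to the very limitation I ran into next.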

However, I didn’t want to use that program for this task, because it creates a blob from which you can’t pull individual files. You can restore the entire disk to another partition, but you can’t copy over the one file you just realized you’d need, etc. Still, I wanted a full backup, and I didn’t want to just copy all of the files.

I pulled out another of my trusty utility disks. This one is the BartPE CD, an excellent collection of Windows utilities on a bootable CD. The advantage of this over a Linux bootable CD is that you are really running Windows, so all disk activity is virtually guaranteed to work with NTFS filesystems, including writing. Don’t flame me, Linux has gotten really good at this too (with NTFS-3g in particular), and I do use it, reliably, but BartPE comes with a specific utility which I really wanted to use for this specific task!

That utility is DriveImage XML. This is very cool software that works exactly as advertised. There are two reasons why I use EZ Gig II more frequently than this: EZ Gig is way faster, and it will allow me to store multiple images of the same drive on the same target (backup) drive. DriveImage XML is slower, and doesn’t let me name the backups, so it’s less flexible.

That said, it has a feature that makes it way better than EZ Gig for the task I needed, which is that it stores a map of the data it backed up in a separate XML file, and every file (or directory, etc.) can be restored (or rather extracted) separately.

So, I booted the Bart PE CD and ran DriveImage XML and backed up the drive to an external hard drive. I was now ready to reinstall Windows XP. For years, no one has been better than these folks at keeping every bit of stuff that came with any PC they purchased. It took them 30 seconds to find the original CDs and manual.

But, I mentioned above that the disk had only 55GB of usable space. That’s because it also had a hidden restore partition on it. I found the key combination to boot off of it in the manual (actually, the online manual at Dell’s site, since that was easier to search). When I booted the machine, I pressed Ctrl-F11 and was presented with the Recovery Menu.

I chose a full reinstall. The machine chugged along, and a little bit later, it was like new from the factory (bloatware too), including Microsoft Office, which they had paid for separately. Cool.

I immediately deleted all of the bloatware and proceeded to bring the system back up-to-date with Windows Update. Here, I made a very critical error! Early on, I allowed Windows XP SP3 to be installed by Windows Update. While it completed correctly, after it was installed, other critical updates could not proceed (including updates to IE 7). I’m guessing that the system was simply in an inconsistent state.

I was tempted to start all over, and reinstall XP from the recovery partition again. Then I decided to try something I’ve been aware of, but had never attempted. I opened System Restore (via the Help and Support menu in Windows) and picked the restore point that was automatically set before SP3 was installed. In effect, I was rolling back the SP3 install.

That worked, and I was left with only one previously deleted program to delete again. I then forced a new restore point, so that I could roll back again (without needing to delete this extra program), just in case.

I then performed all of the updates in XP, each time skipping SP3. Only when SP3 was the only update left did I allow it to install. Perfect. I then ran the program that always provoked the memory error (this time with 2GB of RAM back in the system!). It ran flawlessly. I did a bunch of other things, and never got the BSOD! Joy!

So, it turns out that it was indeed a deep software problem, and reinstalling Windows fixed it. He’s now got a very nice system, with a nice memory upgrade, clean of anyone else’s stuff (family and malware alike).

After getting it running, I put in the Bart PE CD again (this time not booting from it, just inserting it while logged in), and ran DriveImage XML again. I was able to pull over all of his files (Documents, Settings, etc.) and check what programs were installed as well.

He’s up and running now and I returned the machine to him yesterday afternoon. I had the machine in my possession for roughly 26 hours, but it was side-by-side with mine, and I worked on Friday, played poker on Saturday, etc., without losing a beat on my own stuff. Pretty good result all around! 🙂

Recovering From a Server Disk Crash

The machine that is delivering this blog to you is a standalone Dell server, running CentOS 5.2. It resides in a data center managed by Zope Corporation (ZC) Systems Administrators (SAs). I perform the majority of the software administration and maintenance myself, and they maintain all of the hardware along with some of the software.

The single most important software maintenance that they are responsible for is backing up the system. We’ll get to that in a minute.

This past Sunday, we were in our usual hotel room. I was logged on for roughly eight straight hours, and all was well on this server. I shut my laptop off at around 10:15pm. When we woke up in the morning, Lois checked her email on her Treo. It kept hanging. We thought it was a Treo/Sprint problem, but even rebooting the Treo didn’t correct the hang.

When we got into the office (at around 7am), our laptops couldn’t retrieve mail from the server either. Pinging the server worked, but I couldn’t SSH into it. In fact, most services delivered by this server were gone (including this blog, which was unreachable). The only service (aside from ping/ICMP) that was obviously up was Zope! If you just wanted to visit our corporate site, and learn about our VC business (all of which is handled by Zope), that was running fine!

The head of the SA group at ZC was also in that early (thankfully!). After poking around a bit, I encouraged him to power off the machine remotely, and then power it back up remotely. He did. Unfortunately, the machine didn’t come back up (or at least it wasn’t responding to pings any longer, so something was wrong).

Another SA was called and directed to go to the data center rather than the office. When he got there, and hooked up a console to the server, he saw that the disk was failing. Attempts to get it started (running fsck, etc.) proved fruitless. It decided to die completely that morning.

Yet another SA was dispatched (from the office) to the data center with a fresh disk (he had other work to perform at the data center, or I would have delivered the disk myself, happily!). We knew in advance that this disk might be slightly problematic, as it had CentOS 4.6 installed on it.

When the disk was inserted into the machine, it booted up immediately. I was able to successfully SSH to the machine, but of course, nothing of mine was on there. That’s when the original SA (who also happens to be our expert in backups) started restoring the machine from our backups.

For a long time, we used to run the Amanda backup program at ZC. I don’t know why, but our experience with Amanda was never good. I am not suggesting that others don’t find it to be a perfectly acceptable solution, but for us, for whatever reasons, it wasn’t good enough.

After searching and evaluating a number of alternatives, the ZC SAs selected Bacula. We’ve been using that for a reasonable period of time now. The Bacula restore of my machine didn’t take all that long, and every file that I wanted/needed was successfully and correctly restored. In fact, the nightly incremental backup had run successfully before the disk decided to die (how polite), so even some non-critical files that I touched on Sunday were successfully restored! Whew!

That said, it was hardly a trip to the candy store. All of the applications that I compiled myself (more than you’d think, I’m a geek at heart!), didn’t work because of the OS (operating system) mismatch. My programs required newer versions of various libraries (possibly even the kernel itself in some cases), so starting from a 4.6 machine and restoring files that required 5.2 wasn’t as clever as we’d hoped.

Still, with some pain, one can theoretically upgrade a 4.6 machine to 5.2 over the network. That was my plan… Well, the best laid plans, as they say…

I released the SA from the task of baby-sitting my machine, because I knew that a network upgrade would take a while, at best. After doing some network magic, the machine was in a little bit of a funky state, but there was a chance that a reboot would bring the machine up in a 5.x state, making the rest of the upgrade fairly straightforward.

Unfortunately, when I rebooted, the machine hung (it wouldn’t actually go down: still pingable, but otherwise non-responsive). Again I asked the head of the SA group to remotely power it down and back up. Again, it powered down properly, but didn’t come back up.

In fact, it likely did come up, but because of the funky state that I left the machine in, it couldn’t be seen publicly due to network configuration issues. This time, we decided to take a more conservative approach, because opticality.com was down for at least 8 hours already (not a happy situation for either Lois or me).

The original SA went back down to the data center. This time, he burned a CD with a network install ISO of CentOS 5.2. After installing the correct OS onto the machine, he again restored the disk with Bacula. This time, everything matched. Still, there were problems…

The biggest issue (by far!) was foolishness on my part in mapping out what I wanted backed up on the machine to begin with. Sparing you the gory details, I ended up restoring the Yum database from my backup over the actual Yum database from the installation, so the system didn’t really know what was installed and what wasn’t. Not a good thing.

I only really cared about email to begin with. I built a few things (pretty quickly) by hand, and got email running. Then I got the web stuff up pretty quickly too. Finally, IM. Those are my big three hot buttons, everything else could be dealt with later on.

I didn’t want the SA to leave until we could prove that the machine could be booted correctly remotely. That took some time as well, as a number of the services that are started automatically weren’t installed on the machine (though RPM/YUM thought they were!). We (or rather he, at the console) disabled them one by one, until the machine came up.
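On CentOS, that kind of triage is a chkconfig exercise. A sketch of what the console session boils down to (the service name is hypothetical):

# see which services are set to start at boot (runlevel 3, in our case)
chkconfig --list | grep '3:on'
# turn off one that the RPM database claims exists but really doesn't
chkconfig exampled off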

After I restored the more critical ones, we tested a reboot again, and it came up fine. Whew. I released him again, this time for the last time.

I cleaned up a few more things and went to bed reasonably happy (it was now close to 10pm on Monday night). Over the next two days, I spent a few hours cleaning up more things. Yesterday, I completed the cleanup.

I wrote a series of shell scripts and filters, doing things like the following:

yum list | grep installed | cut -d ' ' -f 1 > /tmp/installed

Then I ran each of the resulting packages (the ones the system thought were installed!) through:

rpm -V package-name

I filtered that output for lines starting with the word “missing”, then removed those packages (they weren’t there anyway, so it was purely a database cleanup activity) and installed them again via Yum. It wasn’t actually as painful as it sounds, but it wasn’t pain-free either.
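Strung together, the whole cleanup looked roughly like this sketch. It’s a reconstruction, not a transcript; in particular, the --justdb flag (which removes a package from the RPM database without touching the file system) is one way to do the removal step, not necessarily the exact command I typed:

#!/bin/sh
# list every package the RPM database believes is installed
yum list | grep installed | cut -d ' ' -f 1 > /tmp/installed
# flag the phantoms: packages whose files have gone missing
for pkg in $(cat /tmp/installed); do
    if rpm -V "$pkg" 2>&1 | grep -q '^missing'; then
        echo "$pkg" >> /tmp/phantoms
    fi
done
# then, for each phantom:
#   rpm -e --justdb <pkg>    # database-only removal
#   yum -y install <pkg>     # reinstall it for real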

The biggest headache occurred when removing a non-existent package also moved a config file (that was being used by a package I built separately, so Yum was unaware of it). At one point yesterday, without realizing it, I killed our SMTP (email) server. Oops. We were down for about 10 minutes, before I realized it, and got it back up and running.

At this point, all is back to normal, and Yum correctly knows what’s on the system.

Here are the lessons:

  1. Back Up Regularly
  2. Have a great group of dedicated SAs behind you
  3. Have some spare parts close by
  4. Think hard before taking some stupid actions (all my fault!)

I’m truly amazed, pleased, impressed, etc. that we lost nothing. Of course, we were down for 12 hours, but Internet email is a truly resilient thing. All mail that wanted to be sent to us was deferred when our SMTP server didn’t answer the call. When we came back up, mail started to be delivered nearly instantaneously, and by morning, all mail had been correctly delivered.

Here’s hoping I don’t go through this experience again, and here’s hoping that if you do, this post might help a bit… 🙂

NginX Upgraded In Place

NginX 0.7.2 is now available. I was running an ancient version, 0.7.1 (just kidding). 😉

So, I downloaded and built 0.7.2. Then I followed the instructions at the bottom of this page, which explain how to upgrade on the fly. It’s even cooler than just not losing any ongoing connections (which would be cool enough!). It can also be rolled back just as easily, if the new binary proves to be faulty.
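For posterity (in case that page ever moves), the heart of the procedure is a few signals sent to the nginx master process. A sketch, assuming the default pid file location from a source build:

# first, copy the new binary over the old one, then:
kill -USR2 `cat /usr/local/nginx/logs/nginx.pid`         # old master launches a new master running the new binary
kill -WINCH `cat /usr/local/nginx/logs/nginx.pid.oldbin` # old workers finish their requests and exit
kill -QUIT `cat /usr/local/nginx/logs/nginx.pid.oldbin`  # retire the old master
# to roll back instead: send HUP to the old master (it respawns its workers),
# then QUIT to the new master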

This is just too cool! Anyway, welcome NginX 0.7.2, RIP NginX 0.7.1.

WordPress Ate My Posting Date

This post is really a test, but I’ll share the full problem and solution (assuming the test works), so it might help someone else out there as well. 🙂

The other day, for the first time ever, I noticed that a post of mine hit my RSS reader (Thunderbird) with a date of 11/29/1999.  I was busy with other things and decided to come back to it. Of course, I didn’t…

The next time I posted, the date was correct again. I (foolishly) assumed that the problem was transient (and possibly corrected). A few more good posts, then yesterday, back-to-back bad posts. I realized exactly what was different between the good posts and the bad ones. At least procrastinating paid off this time. 🙂

All of my good posts were made with Windows Live Writer locally on my laptop and then saved to the server as a draft for continued editing. All of my bad posts were started and finished directly in the web interface of WordPress 2.5.1.

This certainly wasn’t a native 2.5.1 problem, because I’ve been running 2.5.1 a lot longer than my first bad RSS pull, and, until reasonably recently, I posted 100% of the time through the web interface. I selected the last few rows from the wp_posts table and spotted the problem instantly.
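If you want to check your own installation, a query along these lines shows it (assuming the default wp_ table prefix):

SELECT ID, post_date, post_date_gmt FROM wp_posts ORDER BY ID DESC LIMIT 5;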

The post_date_gmt column had the following value in my three bad posts: 0000-00-00. A Google search revealed this forum thread discussing the problem. Some people seemed to think it was related to the installation of XCache (others thought that you also needed a separate plugin installed to provoke the problem). Since I recently installed XCache (as reported in my NginX post), I was more than willing to believe that this was my problem. Of course, I didn’t see how it worked when Windows Live Writer was posting, but it was certainly possible that it took a different code path…

There was a pointer to a patch in that thread. The person who posted the link also commented that he thought this was not the “right” fix, though he said it worked. I’m not sure why he felt that way. Aside from being innocuous (meaning, there was no way that this could hurt anything), it seemed like a reasonable fix and was already in the trunk for WordPress 2.5.2, so it also seemed safe to apply, given that it would soon be production WordPress anyway.

So, I downloaded and applied the patch. I won’t know until I click on Publish whether it will work, as the post_date_gmt column correctly remains at 0000-00-00 while I’m in Draft mode. If it works, I won’t update this post. If it fails, I’ll come back and add another paragraph to let you know it didn’t work…
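For anyone following along at home, applying a patch like this is quick from the blog’s root directory. A sketch, with a hypothetical filename for the downloaded diff:

cd /path/to/wordpress              # your WordPress root (placeholder path)
patch -p0 --dry-run < wp-fix.diff  # confirm it applies cleanly first
patch -p0 < wp-fix.diff            # use -p1 instead if the diff paths carry a leading directory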

Update: I decided to add this bit quickly. It worked (so that’s not why I’m back here). I thought of one more possibility for the cause, which has nothing to do with XCache. Recently, I noticed that my posting times were in EST, not EDT (meaning, the one hour time-zone change didn’t happen automatically). I went into the WP Admin interface and adjusted the setting to be GMT-4 instead of GMT-5. It’s possible that this change caused the problem as well, but I really have no idea…

Firefox 3.0 is Awesome

Yesterday, I took a (cheap) shot at the Firefox launch fiasco. I didn’t need to do it, but I don’t really regret it either. You can’t have it both ways. You can’t want to hype the hell out of the launch, and then be unprepared for it.

Anyway, even though the official launch was delayed by 76 minutes, the US suffered an outage that lasted at least 30 minutes longer.

It doesn’t matter. They are blowing the numbers off the screen and there are still more than four hours left to download Firefox.

As I predicted yesterday, I accounted for four legitimate downloads, two yesterday (for Lois and me) and two today for the ancient guest machines.

It took me a while to be sure that Firefox 3 was faster than Firefox 2, but I’m convinced it is now. There is one site that I visit typically once a day, that takes a very long time to paint/load. It loads substantially faster in Firefox 3.

That’s not why I titled this Firefox 3.0 is Awesome. If you know Lois, then you know that she’s effectively legally blind. We run her laptop at a bizarre resolution, and then still have to pump the DPI (dots per inch) up, so that everything is hideously large on her screen. I can’t even begin to tell you what that does to any reasonably complex web page.

Aside from the fact that very little of the page typically fits on Lois’ 17″ monitor, most complex web layouts overlap the text, including hyperlinks (for her), making it near impossible to click on some links (you simply can’t get the mouse over the link!), and many of the pages are nearly unreadable as well. This has been true (for years!) on both Firefox (all versions) and IE (all versions).

Enter Firefox 3.0! Lois’ default home page is Google News. The minute we launched Firefox 3.0, we saw the page laid out perfectly. Unfortunately, it was gigantic, so the scroll bars were there for horizontal and vertical scrolling. It was too large, so we went into the preferences, and saw that her font size was set to 28! (I told you, we have had to tinker lots just to make her be able to see!)

We dropped the font size to 18 and reloaded. It got a drop better, but not much. Another look at the preference panel and we realized that we needed to go into the Advanced settings. In there, we saw that Lois had a minimum font size set to 24. We changed that down to 18 as well, and the page now displayed perfectly for her.

Then we went to our bank site, where Lois always has trouble hitting certain links. It painted perfectly (albeit with scroll bars). We played with Ctrl-Minus (Ctrl -) and the page shrunk perfectly. Previously, that key combination might have shrunk the text, but the rest of the layout often got worse. Now, Ctrl-Minus and Ctrl-Plus (Ctrl +) work flawlessly. Yippee!

Finally, I didn’t care about most of the addons that weren’t Firefox 3.0-ready. One I really cared about was TinyURL Creator, and one I like (but could live without) is Linkification. Rather than disabling version checking (dangerous at best), I decided to hand-tweak these two, since neither is sophisticated, and therefore unlikely to fail or cause harm.

In both cases, I found which directory they were installed in under the extensions directory in my Firefox Profile (this is on Windows). I then edited the install.rdf file, and found the max version number that they would run in, changing it from 2 to 3, and voila (after restarting Firefox), both work beautifully again.
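For the curious, the relevant piece of install.rdf looks like this. The version numbers are illustrative; the GUID is Firefox’s actual application ID, which is how the extension targets Firefox specifically:

<em:targetApplication>
  <Description>
    <em:id>{ec8030f7-c20a-464f-9b0e-13a3a9e97384}</em:id>
    <em:minVersion>1.5</em:minVersion>
    <em:maxVersion>3.0.*</em:maxVersion>
  </Description>
</em:targetApplication>

Bump maxVersion, restart Firefox, and the extension loads again (with no guarantee that it actually works, of course).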

Welcome Firefox 3.0, we’re both very glad to have you on board! 🙂

Firefox 3.0 Download Day

Well, it’s finally here. I was all set to do my part with helping them attain the nonsensical world record download numbers.

Both Lois and I use Firefox as our primary browser. I also have an ancient laptop that I occasionally use in a hotel room, which I was going to upgrade, and one other that I sometimes use as a guest machine. So, I was about to legitimately download it four times (I hate all manner of ballot stuffing, so I wouldn’t download multiple times just to up the count!).

Lo and behold, 10am PDT came, and the Mozilla site is 100% unresponsive. Way to plan for a world record!

Fully 15 minutes later, when you would think they would have been aware of the problem and would have switched to Plan B (they did have a Plan B, no?!?), the site is still pathetically slow (at least it finally loads!), but amazingly, it still only offers the link to Firefox 2.0.0.14!

How embarrassing…

I’ll still download it when I can, because I love the product, and everything I’ve read about version 3.0 makes me want it. I have no desire to punish them, or cut off my nose to spite my face, but this one definitely goes in the pile of monumental screw-ups…

Getting Very Tired of AlertThingy

I’ve been using AlertThingy for a while now. I like the concept a lot. It started out as an Adobe Air front-end to FriendFeed.

Let’s begin (briefly, at least brief by my normal standards) 😉 with FriendFeed itself. At a minimum, FriendFeed is a social media aggregator. You can have it create a feed for you from many (currently 35!) different services (e.g., Twitter, blogs, Google Reader, iLike, Facebook, etc.). Now, if someone wants to truly follow you, they can become your friend in one place, and not have to have accounts and mutual friendships with you on all of the different services that you belong to (or will in the future!).

Great idea, and for my limited use so far, pretty darn good execution as well. I’m definitely a FriendFeed fan.

There are a number of ways to consume FriendFeed, including just by visiting their website only when you want to see what your friends are up to. You can also get updates via email. Because FriendFeed is part of the wonderful trend of services that provide full APIs, you can also use other clients to access FriendFeed.

AlertThingy started out as one such client. While it has a nice UI (at least a reasonable one), one instant quibble is that I can’t find a way to quit the program (perhaps I’m just dense). I have to go to the tray, right-click on it, and select “Exit AlertThingy”. Yuck. Also, on the settings page, I have unchecked the Launch at Startup box, and yet it continues to launch every time I start up Windows. 🙁

So, let’s start with the good. Instead of having to log on to the FriendFeed website, and refresh every once in a while, AlertThingy will instantly alert me to anything that the people I’m following have to say on any of their services. Cool. Perfect? No.

The first problem is that not everyone I follow has a FriendFeed account (or I haven’t bothered to look for them there, etc.). So, in addition to following some people on FriendFeed, I still have to check (in any number of ways) Twitter (for example), for those people whose tweets I’m interested in. Of course, if I launch a Twitter client (Twhirl is fantastic, also written in Adobe Air), I’ll see the tweets I’m interested in. But I’ll also get an alert in AlertThingy for the same tweet, if the person is connected to me on FriendFeed as well.

That’s not the end of the world, but it’s not pretty (or conducive to managing interruptions) either. Now multiply the problem for every service you check (beyond the one Twitter example above), and you can see that it could get out of hand quickly.

Of course, this is not an AlertThingy problem, just a social media proliferation one, where the FriendFeed aggregation is getting in the way, by duplicating things. Of course, in this particular instance, if everyone was on FriendFeed (well, at least everyone that I cared about), that one part of the problem wouldn’t exist.

Now on to my real complaint, and why I’m getting tired of AlertThingy. AlertThingy was not satisfied with solely being a front end to FriendFeed. Since other services have APIs too, they started (a while ago) offering direct Twitter integration (Flickr too, and probably more coming).

Sounds great at first. I don’t have to launch Twhirl any longer (for example), since now I can see tweets from non-FriendFeed people directly in AlertThingy. Of course, when you think about it, it’s not a great idea, because it becomes its own sort of FriendFeed, while still providing FriendFeed… Ah, so there will be duplicates for people who are on FriendFeed, no? No! AlertThingy cleverly (yes, the italics are there for sarcasm) removes duplicates (at your request).

No, they do not do it cleverly. They do a few things wrong, including alerting you multiple times. First, you get an alert when the real tweet comes through. Then, you get another alert when the FriendFeed broadcast of the same tweet comes in, even though AlertThingy only shows it once. But, that’s not the real problem at all. When they de-dupe, they totally screw up the timestamp. One or the other message gets timestamped with Greenwich Mean Time rather than local time.

For me, that means that these alerts get sorted to the top. Then, a new alert comes in on FriendFeed only (say a Google Reader share), which never gets duped, so it has the correct timestamp. That will be sorted below the de-duped stuff. Possibly, even pages below, since I’m five hours behind GMT!

This one super-annoyance would be maddeningly easy to fix, and I sent feedback to the author on his site (and never heard back). All one needs to do is never accept a timestamp in the future (dupe or otherwise). If a message is about to be timestamped (and sorted), the value should be the lesser of now and whatever timestamp the message claims to carry. Simple.

So, what happens to me is that AlertThingy makes a noise and flashes, and then I have to look through a very long list of alerts to see which one is new, and it might be 20 down from the top. I can’t stand it any longer…

What makes it really bad for me is that I follow one particular person who is a prolific social media person (it’s part of his job, so I’m not blaming him). Today alone, he has created 43 separate alerts that are still visible in AlertThingy (others have already scrolled off the bottom and have been archived). And those aren’t even 43 genuinely new things. When he blogs, I get a blog alert, a tweet pointing to the post, a Tumblr alert, etc. That’s a FriendFeed problem, as he needs to ensure that people see his stuff no matter where he posts it, and he can’t be sure all of those people know about his FriendFeed.

He’s not uninteresting, and he’s one of the nicest people I know. That said, I’m dying to unsubscribe from him, but I don’t want to hurt his feelings. Since he’ll likely read this (and know who he is), perhaps I can get away with unsubscribing now. Especially today, since he has started Digging stories beyond belief, and his FriendFeed automatically picks those up. His accelerating rate of alerts is killing me.

So, I’ll probably both unsubscribe from him and stop using AlertThingy, and start checking FriendFeed on the web (less frequently) or via a once-a-day email. I’ll hear about things later than I would have (other than tweets, which I’ll likely still follow via Twhirl), but my sanity will return…

NginX Reporting for Duty

Last week, my friend Jamie Thingelstad tweeted from the Rails Conference that he was considering switching to NginX (I’ll probably drop the caps starting now) after sitting in on a session about it.

Prior to that mention, I had only heard of it once, when an internal tech email at Zope Corporation mentioned that one of our partners was happy with it as their web server. I didn’t pay too much attention to it at the time…

I was curious, so I downloaded and built nginx-0.7.1. Since there is no equivalent to mod_php for nginx, I also had to decide on a strategy for managing PHP via FastCGI for this blog (I use Zope to serve up all other publicly available content on the site). It seemed that the most prevalent solution out there was to use spawn-fcgi, part of the lighttpd web server package (another very popular alternative to Apache, which is what I was using).

After searching further, I noticed that a number of people were recommending a package called php-fpm (PHP FastCGI Process Manager). I decided that I’d give this one a try first.

Finally, while researching all of this, in particular noting results by WordPress users (after all, that was my one use case for doing this), I also decided to install an op-code cacher for the first time ever, settling on XCache (also brought to you by the fine folks who bring you Lighttpd!).

Within minutes I had nginx installed and running, serving content on an alternate port (so Apache was still serving all production content on port 80). Shortly thereafter I had php-fpm installed and running as well. XCache installed easily too, but it did take me a few minutes longer to correctly change one particular entry in the conf file.

I had some serious frustrations along the way, all of which were due to a fundamental lack of understanding on my part of how things are ultimately supposed to work (meaning, nothing was wrong with the software, just with my configuration of it!), but to underscore the bottom line, nginx/php-fpm/xcache are all in production now. If you’re reading this in your RSS reader, or in a browser, it was delivered to you by that combination of software (unless this is far in the future, and I’ve changed again). 😉

If you aren’t interested in the twists and turns of getting this into production, then read no further, as the above tells the story. If you care to know what pitfalls I encountered, and what I had to do to overcome them, read on…

Getting everything set up to generically serve content on the alternate port was trivial. Like I said earlier, I was serving test pages in minutes (even before downloading php-fpm and xcache). In that regard, you have to love the simplicity of nginx!

With extremely little effort, I was able to write a single line proxy_pass rule in nginx to serve up all of my Zope content on the alternate port. Done! I figured that this was going to be really easy, since the WordPress/PHP stuff has so many more real world examples documented. Oops, I was wrong.
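Give or take the port number (a placeholder here, not necessarily what my Zope instance listens on), that rule was nothing more than:

location / {
    proxy_pass http://127.0.0.1:8080;   # hand everything to Zope
}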

To begin with, the main reason that I was wrong is due specifically to WordPress (hereinafter, WP), and not nginx or PHP whatsoever. Basically, I ran into the same problem the first time I fired up XAMPP on my laptop to run a local copy of my production WP installation.

WP stores the canonical name of the blog in the database, and returns it on every request. This means that your browser automatically gets redirects, which at a minimum (in the nginx case) dropped the alternate port, and in the XAMPP case, replaced localhost with www.opticality.com. The fix for XAMPP was easy (yet still annoying!). I had to go into the MySQL database, and change all references to www.opticality.com to localhost.
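The actual XAMPP fix was a single update to the two options that drive those redirects (again assuming the default wp_ table prefix):

UPDATE wp_options SET option_value = 'http://localhost'
WHERE option_name IN ('siteurl', 'home');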

In the case of nginx, I couldn’t do that, since I was testing on my live production server. In retrospect, this was stupid (and cost me hours of extra time), since I have a test server I could have installed it on, run it on port 80, change the blog name, etc., and been sure that I had everything right before copying over the config. Of course, I learned a ton more by keeping my feet to the fire, especially when my public site was hosed (which it was a number of times this weekend!), perhaps shortening the overall learning experience due to necessity! 🙁

That made testing WP extremely painful. Instead, I tested some standalone php scripts, and some other php-based systems that I have installed on the server, but don’t typically activate. All worked perfectly (including my ability to watch xcache return hits and misses from the op-code cache), so I was growing in my confidence.

Sometime late Wednesday night I was mentally prepared to cut over (since switching back to Apache would only take a few seconds anyway), but I decided to wait until the weekend, when traffic to this blog is nearly non-existent (come on folks, you can do something about that!). 😉

Saturday morning, very early, I stopped Apache, reloaded nginx with a configuration that pointed to port 80, and held my breath. I was amazed that everything seemed to be working perfectly, on the first shot! I moved on to other things.

Later in the day, I visited the admin interface for WP (Firefox is my default browser, though I obviously have IE installed as well, and for yucks, I also have Safari, which I very rarely launch). Uh oh, I had zero styling. Clearly, something was causing the CSS not to be applied! I quickly ran back to the retail interface and thankfully, that was still being served up correctly. That meant that I could take my time to fix this problem, as it only affected me!

I turned on debugging in nginx and saw that the CSS was indeed being sent to the browser. That’s not good either, as it seemed that this wasn’t going to be a simple one-line fix in nginx.

Yesterday morning, after nginx had been in production for 24 hours, I decided to check out IE for the admin pages. They showed correctly! Firefox was still showing zero styling. I compared the source downloads, and Firefox was larger, but had everything that IE had as well (including the CSS). At first I incorrectly assumed that perhaps it was the extra ie.css that IE was applying, and Firefox was ignoring, but that really couldn’t have been it, and I didn’t waste any time on that.

Now that I had a data point, I started Googling. It took me quite a while to find a relevant hit (which still surprises me!), and I had to get pretty creative in my search terms. Finally, I found one solitary forum hit, for a PHP-based product called Etomite CMS (which I had never heard of before), describing exactly what was happening to me. Here’s the link to the forum post.

OK, their take was that the CSS was coming down with a MIME type of text/html, rather than the correct text/css. I went back to Firefox, and did what I should have done at the beginning (my only excuse is that I’ve never had this type of use case to debug before!). I enabled Firebug on the admin page (I already had Firebug installed, but I’ve rarely ever needed it).

Immediately, I could see that the CSS had been downloaded (I knew it was sent from the nginx logs, but now I knew it was received and processed). I could also see instantly that the CSS was not recognized as CSS. Going to the Net tab in Firebug revealed the headers for each individual request, and indeed, all of the CSS came back with text/html as the type. So, IE ignores that, and applies the CSS anyway, because it sees the extension (I’m guessing), but Firefox doesn’t. Fair enough.

But, why is it coming back text/html? I checked the nginx MIME types file, and CSS was coded correctly (I didn’t touch that file anyway). Looking at the headers again, I spotted the problem. All of the CSS files were being returned by PHP, not statically by nginx. In other words, my nginx config was wrong (even though it was otherwise working), as it was passing requests for CSS files to the FastCGI PHP process, and getting back an incorrect MIME type for those files!

A very cursory search didn’t reveal how I would tell PHP to return a correct MIME type. I’ll save that investigation for another day, as it would have been an immediate, but suboptimal solution anyway, since these files can be served faster by nginx to begin with. That meant figuring out what was wrong with my nginx configuration.

The docs for nginx are pretty good (with a few exceptions, but I’m not quibbling). There are also a ton of cookbook examples, some good and detailed, some should be deleted until they are completed. I realized pretty quickly that even though my retail interface was working, I had less of an understanding of how the different stanzas in my configuration file were interacting (specifically, the location directives).

I really didn’t want to revert to Apache, so I decided to experiment on the live server. It was Sunday evening, and I figured traffic would be relatively light. Sparing you (only a drop, given how long this is already), I hosed the server multiple times. I could see that search bot requests were getting errors, and my tests were returning a variety of errors as well. At one point, I had nearly everything working, and then promptly screwed it up worse than before. 🙁

Eventually, under a lot of pressure (put on myself only by me), I simplified my configuration file, and it all started working, retail and admin alike. Whew. I’m now serving all static files directly from the file system, and all php files through the FastCGI process. The world is safe for blog readers again. 🙂
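For what it’s worth, the simplified configuration has roughly this shape. This is a sketch, not my actual file: the root path and FastCGI port are placeholders, the Zope proxying is omitted, and so are the WordPress permalink rewrites:

server {
    listen 80;
    root /var/www/blog;   # placeholder document root

    # PHP requests go to the php-fpm FastCGI pool
    location ~ \.php$ {
        include        fastcgi_params;
        fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
        fastcgi_pass   127.0.0.1:9000;
    }

    # everything else (CSS, images, JS) comes straight off the disk,
    # with MIME types taken from nginx's own mime.types file
    location / {
        index index.php;
    }
}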

Like I said above, the problem was never with nginx, though I can’t say the same thing for PHP returning text/html for a CSS file. Clearly, PHP shouldn’t have to process those files, but if it does, it should do the right thing in setting the MIME type. Of course, if it had, I would likely have never noticed that my static files weren’t being served as I intended them!

One real complaint, aimed at nginx. Since I use the proxy_pass directive to serve up all Zope content (and like I said, that worked from the first shot, and has worked ever since!), I was quite frustrated by the limitations on where it can be used, and how. Specifically, it can’t be used within a regular expression based if test, even if you aren’t using variable substitution in the proxy_pass itself. That simply doesn’t make sense to me. And, now that I mentioned it, there are other rules about when/how you can use variable substitution as well.

Clearly, I got it working, so the above isn’t a giant complaint, it just eludes my current understanding. As for getting the rest to work, I have a better understanding than I did, but I’m sure that if I were to muck around a little more, which I’ll inevitably do, I’ll break it again, showing that my understanding is still at the beginner level.

In any case, for the moment, it’s doing what I intended, and is in production. 🙂

Aside from Jamie’s pointer, in doing the initial research, what got me excited was reading that WordPress.com had switched to nginx for their load balancing (and might eventually switch for their web serving as well), and that Fastmail is using nginx for their IMAP/POP mail proxying. Clearly, nginx is ready for prime time (my puny blog notwithstanding). 🙂