The EBN machine is up again. Will Stephenson pointed out how the API documentation can be generated locally, which takes away one reason for the machine's existence (it serves up api.kde.org). The value of the online version comes from having multiple versions, indexing, searchability (although the PHP code that drives the search is rather poor, so that has been on the stuff-to-fix-sometime list for a long time), and saving you the time of generating the whole shebang yourself. The scripts and styling get occasional tweaks, too, to improve the API documentation. As always, comments are welcome and patches even more so.
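For the curious, generating locally is not much more than pointing doxygen at a module checkout; a bare-bones sketch (the helper scripts in the KDE sources wrap this up with far more polish, cross-linking and the api.kde.org styling):

    cd kdelibs                  # or whichever module you care about
    doxygen -g Doxyfile.local   # write out a default doxygen configuration
    # set INPUT, RECURSIVE = YES and OUTPUT_DIRECTORY in Doxyfile.local, then:
    doxygen Doxyfile.local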
The machine does some other things, such as serving up my personal website (not updated in a gazillion years since I blog on FSFE’s Fellowship blogging platform) and that of Sebastian Kügler.
The main CPU load on the machine comes from running Krazy, the KDE code quality checker. It still produces lots of warnings and small items to fix in the KDE codebase, and sometimes I see people committing whole batches of Krazy fixes. I recently saw it referred to as “KDE’s level grind”, which is a pretty good description: Krazy was originally intended to find, and explain, the kind of low-hanging fruit you can fix on a Friday afternoon.
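To give a flavour of the level in question, here is the sort of thing Krazy's doublequote_chars check complains about (a made-up snippet with an invented helper, not taken from any real KDE file):

    // A made-up example of a typical Krazy nitpick (the doublequote_chars
    // check): a single-character string literal where a plain QChar would do.
    #include <QString>

    QString joinPath(const QString &dir, const QString &file)
    {
        QString path = dir;
        path.append("/");    // flagged: a whole string literal for one character
        // path.append('/'); // the Friday-afternoon fix: append a QChar instead
        return path + file;
    }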
One last thing the machine does is serve the OpenSolaris packages for KDE4. For this I ported pkg.depotd(8) to FreeBSD some time ago, but that port is starting to show its age and I think I’ll have to start running an actual OpenSolaris on the machine at some point. That means fiddling around with the available virtualization options and possibly updating and rebooting the machine repeatedly. If VirtualBox delivers any measure of performance under FreeBSD, it will be my choice, since that simplifies updating from home, where I use VirtualBox as well.
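Driving a headless guest from the command line is not a big deal either way; something along these lines (a sketch only: the VM name, memory size and NIC are invented, disk and install media setup is omitted, and the flags are worth double-checking against the VBoxManage documentation for whatever VirtualBox version ends up installed):

    # Create and register a VM for the OpenSolaris guest (names and sizes invented)
    VBoxManage createvm --name osol --ostype OpenSolaris_64 --register
    # Give it some memory and bridge it onto the server's NIC (em0 here)
    VBoxManage modifyvm osol --memory 1024 --nic1 bridged --bridgeadapter1 em0
    # Disk and installation media still need to be attached; then boot it headless
    VBoxManage startvm osol --type headless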
So, you may ask: what caused this extended downtime, and how can we prevent it from happening again? Well, one way to help would be donating a 1U 4×3.5″ SATA disk server chassis with redundant power. Right now (thanks to being hosted in the “random crap” rack) there’s a little mini-tower doing the job, and it turned out to have a dodgy power cord. Some shifting and pushing-around of cables in the rack caused a momentary disconnect, which panicked the server and left it sitting at an fsck(8) prompt for quite a while. This suggests, perhaps, that I should configure the system with “noauto” on the affected file systems, so that it comes up with reduced functionality even if the disk array is toast. Two other really important points: configure serial console support so that the ILOM can get at the machine, and remember the ILOM password. Cue jumpering the server to reset the password and all the Fun that entails.
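For the record, the first two fixes amount to a handful of configuration lines, roughly like this (device names invented, and note that older FreeBSD releases spell the serial terminal ttyd0 rather than ttyu0):

    # /etc/fstab: mark the data array "noauto" with fsck pass 0, so a dead
    # array no longer strands the whole machine at a prompt during boot
    /dev/da1s1d    /data    ufs    rw,noauto    0    0

    # /boot/loader.conf: put the kernel console on the first serial port,
    # which is what the ILOM's serial redirection can reach
    console="comconsole"

    # /etc/ttys: give that serial port a login prompt as well
    ttyu0    "/usr/libexec/getty std.9600"    vt100    on  secure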
Anyway, it comes down to a few decisions made three years ago that caused this downtime to be longer than expected. The machine is up again, those decisions have been amended, and we’re good to go for another three years, I hope.