Archive for the 'Computers' Category
RSS-Enabling Your Buddy’s Non-RSS Enabled Blog
So as it turns out, I’m fully enamored with Google Reader, which allows me to read all sorts of RSS feeds. I waste time much more efficiently now. They’ve even got a version that’s optimized for your mobile device, so you can piss off your wife by reading blogs on your Blackberry.
The problem with feed readers, though, is that once you start using them, you stop reading blogs that don’t provide feeds. After falling behind on Heck’s Kitchen (which is hand coded, and therefore does not offer a feed), I decided that the best solution available was to write a screen scraper that would convert the static HTML page to RSS. Thankfully, Jenny writes good HTML and uses CSS classes and ids, so scraping the page was easy.
The fruits of my labor are here: Heck’s Kitchen RSS Feed.
Full script after the jump.
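For the curious, here is a rough Python sketch of the general idea, not the actual script. It assumes each post lives in a div with class "entry" and has its headline in an h2 with class "title"; both names are hypothetical stand-ins for whatever classes the real page uses, and the feed title and link passed to to_rss are up to you.

```python
from html.parser import HTMLParser
from xml.etree import ElementTree as ET


class EntryParser(HTMLParser):
    """Collect posts from <div class="entry"> blocks.

    The class names "entry" and "title" are hypothetical; substitute
    whatever the hand-coded page actually uses.
    """

    def __init__(self):
        super().__init__()
        self.entries = []     # finished posts, as {"title": ..., "body": ...}
        self._current = None  # post being built, or None when outside one
        self._field = "body"  # which field incoming text is appended to
        self._depth = 0       # div nesting depth inside the current post

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._current is None:
            if tag == "div" and "entry" in classes:
                self._current = {"title": "", "body": ""}
                self._field = "body"
                self._depth = 1
        elif tag == "div":
            self._depth += 1
        elif tag == "h2" and "title" in classes:
            self._field = "title"

    def handle_endtag(self, tag):
        if self._current is None:
            return
        if tag == "h2":
            self._field = "body"
        elif tag == "div":
            self._depth -= 1
            if self._depth == 0:
                # The entry's outer div just closed, so the post is complete.
                post = {k: v.strip() for k, v in self._current.items()}
                self.entries.append(post)
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._current[self._field] += data


def to_rss(entries, feed_title, feed_link):
    """Serialize the scraped posts as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_title
    for entry in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "description").text = entry["body"]
    return ET.tostring(rss, encoding="unicode")
```

Feed the page's HTML to an EntryParser, hand its entries to to_rss, and write the result somewhere your web server can serve it. A real feed would also want per-item links and dates, which this sketch leaves out.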
New York Times Crossword Puzzle Downloader
If you’re like me, you’re a Will Shortz junkie. You’ve several of his crossword books, and even remember fondly his time at Games magazine. You’ve also got a subscription to the New York Times crossword puzzle, but you don’t remember to download the puzzle on a daily basis. So you’re like, “Hey, I wish the puzzle would download to my computer every day, or, better yet, I wish it would show up in my inbox.”
I wrote a small Perl script that does just that: it downloads the current puzzle, saves it where you want it, and emails it to addresses you specify. It’ll work best for OS X and Linux users, as it’s written in Perl and requires a few external but fairly standard libraries that can be installed via CPAN. Download it here: NY Times Crossword Puzzle Downloader.
Short Directions:
Rename the file from .txt to .pl, make it executable, customize the variables, and add it as a cronjob.
Longer Directions:
…When I get a few minutes.
License: GPL, of course.
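If you'd rather see the shape of the thing before downloading it, the save-and-email flow looks roughly like the Python sketch below. This is an illustration, not a port of the Perl script. The file naming scheme, the .puz extension, and the fetch callable (which stands in for the authenticated download, whose URL and login details are not covered here) are all assumptions.

```python
import datetime
import smtplib
from email.message import EmailMessage
from pathlib import Path


def puzzle_filename(day):
    """Name the saved puzzle after its date (naming scheme is an assumption)."""
    return f"nyt-{day.isoformat()}.puz"


def save_puzzle(data, directory, day):
    """Write the downloaded bytes to the chosen directory and return the path."""
    path = Path(directory) / puzzle_filename(day)
    path.write_bytes(data)
    return path


def build_email(path, sender, recipients):
    """Build a message with the saved puzzle attached."""
    msg = EmailMessage()
    msg["Subject"] = f"Crossword: {path.name}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content("Today's puzzle is attached.")
    msg.add_attachment(
        path.read_bytes(),
        maintype="application",
        subtype="octet-stream",
        filename=path.name,
    )
    return msg


def run(fetch, directory, sender, recipients, smtp_host="localhost"):
    """Download, save, and mail today's puzzle.

    `fetch` is a hypothetical callable you supply: it takes a date and
    returns the puzzle as bytes.
    """
    day = datetime.date.today()
    path = save_puzzle(fetch(day), directory, day)
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(build_email(path, sender, recipients))
```

Calling run once a day from cron gives you the same effect as the short directions above: a dated file on disk and a copy in your inbox.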
Can’t Flock, Won’t Flock
I’ve spent part of this morning tooling around with Flock, a Firefox variant with improved Web 2.0 support. (Yeah, I said “Web 2.0”. This post is going to be like that. Deal with it.)
Flock is basically a browser built around providing a better, integrated wrapper to blog authorship, link management via del.icio.us, photo management via Flickr, and presumably anything else that provides a public web service/XML-RPC API.
Installing DSPAM on Dreamhost (Nasty Kludge)
My favorite spam filter is, by far, DSPAM. It’s purely Bayesian, so rather than relying on a bunch of blocked IPs and dirty words, it makes decisions based only on what you’ve previously tagged as spam. Crucially, this means that you still get the newsletters that you’ve opted in to.
I’ve deployed it as a SpamAssassin replacement with tremendous results. The problem with SpamAssassin’s Bayesian filtering (maybe they’ve fixed this, I don’t know) is that it would, after a while, tag one or two spam messages as ham and then autolearn them as ham. That sort of behavior snowballs if uncorrected, and there wasn’t an easy way to correct it in a sitewide deployment with POP users. So snowball it did, and since ~70% of our correctly addressed incoming mail (and well over 99% of all mail) was spam, pretty soon people were seeing a spam:ham ratio of about 1:1 in their inboxes. At that point, email is unusable.
The thing that made DSPAM a strong replacement was that users can decide what is and what isn’t spam, so your Bayes db doesn’t get poisoned. Using DSPAM, you can tag things as spam in one of two ways: either by forwarding emails on to a convenience address, or by dropping them into a folder set up for that purpose.
After getting the kinks out of the DSPAM deployment (you’d do well to have a separate MySQL server), it hums along with well over 99% accuracy, with basically no intervention on the part of admins and little intervention on the part of users.
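To make the "decisions based only on what you've tagged" idea concrete, here is a toy Python sketch of a filter that learns only from explicit user tags. It is a plain naive Bayes classifier with add-one smoothing, written for illustration. It is not DSPAM's actual algorithm or storage format, and the class and method names are made up.

```python
import math
import re
from collections import Counter


class UserTaggedFilter:
    """Toy naive Bayes spam filter trained only on explicit user tags."""

    def __init__(self):
        # Per label: how many tagged messages contained each token.
        self.counts = {"spam": Counter(), "ham": Counter()}
        # Per label: how many messages the user has tagged.
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        """Split a message into a set of lowercase word tokens."""
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def train(self, text, label):
        """Record a message the user tagged as 'spam' or 'ham'."""
        self.counts[label].update(self.tokens(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | tokens) from the tagged messages so far."""
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        # Start from the smoothed prior odds, then add each token's evidence.
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for tok in self.tokens(text):
            p_spam = (self.counts["spam"][tok] + 1) / (n_spam + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so exp() cannot overflow on very long messages.
        log_odds = max(-700.0, min(700.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))
```

Nothing here learns from the filter's own guesses: the counts change only when train is called with a label a person chose, which is the property that keeps one misclassified message from snowballing.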
Subversion GUIs in Gentoo Linux
Subversion, the almost-drop-in-replacement for CVS, has really taken off as of late, which is great. It’s got all sorts of nice features (atomic commits, file moves, etc.), integrates with Apache, and is generally well loved by us developers. At least those developers who spend a significant amount of time on the command line. Because, unfortunately, the GUIs have been a little stinker-doodle.