Friday, February 1, 2008

KISS - Keep It Simple Stupid

This is perhaps not a prime example of K.I.S.S., but I made my own Atom parser for the Blogspot feeds in Perl. My website lives on a shared web host, so I do not have the rights to install Perl modules into the system-wide module directory, and I often find myself fighting with a CPAN module installation, despite having a "private" module directory of my own. Most things install there pretty well once CPAN is configured properly, but the Atom feed reader XML::Atom just would not build on my system. It pulls in twenty-something dependencies, and somewhere along the way one of them just fails.

So I dug a little into what kind of feed the Blogspot Atom feed really is. It turns out to be a pretty simple one, if you just want to regex through it fast. So here is how it works:

1) Grab the feed with LWP::Simple
2) Run two regexes globally:

$feed =~ m%<published>(\d{4}-\d{2}-\d{2}).+?</published><updated>.+?</updated>%g;


$feed =~ m%<title type='text'>(.+?)</title><summary type='text'>(.+?)</summary><link rel='alternate' type='text/html' href='(.+?)' title='.+?'/><link rel='replies' type='text/html' href='.+?postID=([0-9]+)' title='(\d+) Comments'/>%g;

3) ???
4) Profit!!
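
For anyone who wants the missing step 3 spelled out, here is a minimal sketch of how the pieces could fit together. The helper names (fetch_feed, parse_entries), the hash keys and the sample feed are all made up for illustration, and matching dates to entries by position is my assumption, not something the feed guarantees.

```perl
use strict;
use warnings;

# Hypothetical helper: download the feed. LWP::Simple is loaded lazily so
# the parsing part can be tried offline.
sub fetch_feed {
    my ($url) = @_;
    require LWP::Simple;
    my $feed = LWP::Simple::get($url);
    die "Could not fetch $url\n" unless defined $feed;
    return $feed;
}

# Hypothetical helper: run the two regexes over the whole feed.
sub parse_entries {
    my ($feed) = @_;

    # List context with /g returns every captured publish date at once.
    my @dates = $feed =~ m%<published>(\d{4}-\d{2}-\d{2}).+?</published><updated>.+?</updated>%g;

    my @entries;
    # Scalar context with /g: each pass of the loop picks up the next entry.
    while ($feed =~ m%<title type='text'>(.+?)</title><summary type='text'>(.+?)</summary><link rel='alternate' type='text/html' href='(.+?)' title='.+?'/><link rel='replies' type='text/html' href='.+?postID=([0-9]+)' title='(\d+) Comments'/>%g) {
        push @entries, {
            # Assumption: dates and entries appear in the same order.
            published => $dates[ scalar @entries ],
            title     => $1,
            summary   => $2,
            url       => $3,
            post_id   => $4,
            comments  => $5,
        };
    }
    return \@entries;
}

# Made-up sample in the shape the regexes expect (not real Blogspot output).
my $sample = join '',
    "<feed>",
    "<entry><published>2008-02-01T10:00:00.000+02:00</published>",
    "<updated>2008-02-01T10:05:00.000+02:00</updated>",
    "<title type='text'>KISS</title><summary type='text'>Keep it simple</summary>",
    "<link rel='alternate' type='text/html' href='http://example.blogspot.com/2008/02/kiss.html' title='KISS'/>",
    "<link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1&amp;postID=42' title='3 Comments'/>",
    "</entry>",
    "</feed>";

for my $entry (@{ parse_entries($sample) }) {
    print "$entry->{published} $entry->{title} ($entry->{comments} comments) $entry->{url}\n";
}
```

With a real feed you would pass the result of fetch_feed (given your own feed URL) to parse_entries instead of the sample. One thing to watch: since .+? happily runs across tags, a feed-level title with type='text' sitting on the same line before the first entry could get swallowed into the first captured title, so a stricter [^<]+ might be safer there.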

Not sure if this is any help to anyone, but it works great for me! KISSes to everyone!