I’ve taken the leap into learning some of the more obscure programming languages out there: OCaml, Haskell, Scheme, and Lisp. The point is not to use these for work or in production systems, but simply to make my head hurt. Using different languages gives you different ways to say things. Sometimes the new ways are better, other times not. But mostly it’s just fun to dig in and figure things out.
I was trying to come up with a simple standalone application that I could use to try out each of the different languages, and decided on a network-based counting server. It would have a very simple API to add counts to a channel and then to read back those counts by channel. My first pass last night was a little frustrating with Haskell. I read through a couple of books on the language and didn’t see any mention of networking support, so I’m going to have to dig a bit deeper.
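To make the idea concrete before porting it to each language, here is a rough Python sketch of the counting logic with the networking left out. The `ADD` and `GET` command names, the one-command-per-line text format, and the `CounterStore` class are all assumptions made for illustration; the actual protocol hasn’t been settled.

```python
from collections import defaultdict


class CounterStore:
    """In-memory counts keyed by channel name (hypothetical design)."""

    def __init__(self):
        self.counts = defaultdict(int)

    def handle(self, line):
        """Process one text command and return the reply string.

        Assumed commands:
          ADD <channel> <amount>  -> "OK"
          GET <channel>           -> current count as a string
        """
        parts = line.split()
        if len(parts) == 3 and parts[0] == "ADD":
            try:
                amount = int(parts[2])
            except ValueError:
                return "ERR bad amount"
            self.counts[parts[1]] += amount
            return "OK"
        if len(parts) == 2 and parts[0] == "GET":
            # Use .get so that reading an unknown channel does not create it.
            return str(self.counts.get(parts[1], 0))
        return "ERR unknown command"
```

A real server would wrap `handle` in a socket loop that reads one line per request and writes the reply back. Keeping the command handling separate from the socket code should make it easier to compare how each language expresses the same logic.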
There was some news announced on Monday that will change the way that I work. My current preferred setup is to work on two laptops. My primary is a T42 ThinkPad with Linux installed. In addition to all of the normal development software, I run VMware with half a dozen Windows virtual machines: different setups of Win2k and WinXP for testing software. My secondary laptop is a 12″ Mac iBook which I use for all communication-type things. I’ve got a tricked-out Mail along with an IM client, IRC, Skype, and Calendar.
The big announcement was that VMware is going to support OS X as both a host and guest VM on Intel machines. This will allow me to drop from two laptops to one: a single MacBook with a whole lot of memory to run all of my Windows and Linux VMs on one box.
With my personal search engine Cloudgrove, I’ve come across an interesting problem with FeedBurner. There are quite a few major sites that use FeedBurner to manage their RSS feeds, but every so often they will change the URL for the posts in the feed, usually by adding a “.” to the end. The problem is that I’m currently keying off of the post URL as a unique identifier, so when FeedBurner changes the URL I get duplicates.
This is an easy enough problem to solve in the short run, but it concerns me a bit. What makes a post unique? Many sites will append params to the URL to record how a user came to the site, such as ?source=rss. However, almost all sites also use params in some form or another to identify the content itself. So, removing all params is immediately out for determining unique posts. I’ve seen in the Google documentation for some of their products that they remove session id params and other such transient pieces. I’m curious how they go about this. Are they just removing things with common names for sessions and cache busting?
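As a rough sketch of the “remove things with common names” approach, something like the following Python could produce a key for de-duplication. The `TRANSIENT_PARAMS` list is my own guess at common names, not anything taken from Google or FeedBurner, and `normalize_url` is a hypothetical helper. Stripping a trailing “.” is also a blunt assumption that could be wrong for URLs that legitimately end in one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of transient params; a guess, not an authoritative list.
TRANSIENT_PARAMS = {"source", "sid", "sessionid", "phpsessid", "jsessionid"}


def normalize_url(url):
    """Return a canonical form of url to use as a de-duplication key."""
    # Drop surrounding whitespace and any stray trailing "." characters.
    url = url.strip().rstrip(".")
    parts = urlsplit(url)
    # Keep only the params whose names are not on the transient list.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in TRANSIENT_PARAMS
    ]
    # Lowercase the scheme and host, and discard the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(kept), "")
    )
```

With this, `http://example.com/post/1.` and `http://example.com/post/1` map to the same key, and `?id=5&source=rss` collapses to `?id=5`. Params that identify content are kept, which is the whole point of not stripping everything.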
I realize that Google is going much more in depth in comparing the similarity of pages. They have been fighting a long and hard battle against link farms, and rooting out identical pages is a key part of this. My intention with Cloudgrove, though, is to focus more on quality content sources and bypass the spam issues.
After doing some digging and going back through the logs, it appears that the issue with the FeedBurner feeds was limited to Tue, 11 Jul 2006. However, it appears that since then they’ve changed the format for the URLs in their feeds. They made an attempt to make a clean cut between the old format and the new format, but there was some leakage that also resulted in duplicates.
While working on some image uploading code for the YouService webtop today I came across an interesting tidbit. It turns out that you can’t load a local image file from a webpage. We’d like to be able to show a preview of an image before it’s uploaded by doing:
But it turns out that Firefox throws a security exception when you try to do this. It works just fine with IE, so this is sending us back to the drawing board for a bit.