My Own Search Engine

I guess every software engineer should be writing their own competitors to Google and Facebook in their garage (in a future post I’ll include pics of our garage data center). Because of issues I saw with Facebook, I created ReadPath, a social network with more of a focus on privacy and news sharing. It’s currently about 70% done: the UI still needs lots of tweaks and some features still need to be completed.

One nice bonus of running ReadPath is that it is constantly spidering content from RSS feeds for the news reader. The other day I realized that I’ve now stored a full billion content items going back several years. Of course, having that much content, I had to create a search engine to mine it. So I created MiniSearch to play with the different concepts involved in running a general search engine. A lot of things turned out to be much harder than expected.

The index is still being built and currently includes only 20% of the available content. There is also still a lot of work to be done on ranking. I’ll post again when I think it’s in a more usable state.
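
To give a rough idea of the kind of concepts involved, here is a minimal sketch of an inverted index with a simple TF-IDF-style score. This is not MiniSearch's actual code: the function names (tokenize, build_index, search), the sample documents, and the scoring formula are all assumptions picked for illustration, and a real engine working on a billion items would need far more than this.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}.

    `docs` is a hypothetical mapping of doc_id -> text.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def search(index, num_docs, query):
    """Return doc ids ranked by a simple TF-IDF-style score (illustrative only)."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        # Terms that appear in fewer documents get a higher weight.
        idf = math.log(1 + num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties are broken by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    docs = {
        1: "garage data center",
        2: "kindle battery life",
        3: "data privacy and data sharing",
    }
    index = build_index(docs)
    print(search(index, len(docs), "data"))
```

Even in a toy like this, the two pieces mentioned above are visible: building the index is a pass over all of the content, and ranking is a separate scoring step that can be tuned without rebuilding the index.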

Playing with ESXi

I had to test out a desktop virtualization product (Pano Logic) this week, and as part of the installation I needed a VMware ESX base system. I’m a huge user of their Workstation product, but I had never used the ESX line since it used to be so expensive and required certified hardware. Things have changed though, and it’s now possible to download a copy of ESXi for free and to run it without a dedicated SAN.

One of the difficulties with VMware is that their acronyms can be very difficult to wade through. ESXi is what they refer to as a hypervisor. This is essentially a very cut-down operating system that is designed only to run other virtual machines. ESXi does have some hardware requirements: I had to go through 3-4 servers before I found one for which the installer had all of the drivers. I finally got it to run on a server I had picked up from Penguin Computing (2x dual-core Opteron with 4 GB of memory and a 250 GB hard drive).

Once I found a server that worked, the system installed quickly. The next problem was that to administer the server you need to download the vSphere client, which is Windows-only (there are command line clients for other operating systems, but I wasn’t ready for that yet). I didn’t have a Windows box lying around (all Linux and Mac), so I had to launch a WinXP VM in Workstation on my Linux desktop to administer my ESXi server. Amazingly, everything worked great.

The next issue that I ran into was that I already had a large number of VMs that I was using in Workstation, but I couldn’t see how to get them onto the ESXi server. In the vSphere client there are clear instructions on how to create a new VM or download an appliance, but not on how to import an existing VM. It turns out that VMware has a very simple way of doing this using the VMware Converter. This product works as a switchboard, allowing you to convert or move VMs from one place to another. It’s a really handy tool.

Overall, ESXi is a great tool for running a whole bunch of server VMs. VMware offers a huge number of management products in the vSphere product line for managing load and moving VMs in a datacenter. But if you just need to run a few VMs on a single server, I would definitely recommend looking at ESXi.

Just ordered a new Kindle

I placed a pre-order for Amazon’s newest Kindle the other day. This one is to replace Jaimie’s Kindle, since she was still using the very first generation model and there are quite a few updates in this newest version. It supposedly has much better battery life, better contrast, and faster page turns, all great things for a power reader.

We also decided to go with the Wi-Fi-only model, which is cheaper, since most of our book reading is at home, where there is total coverage. If we’re out and about, we read on the iPhone and then sync back to the Kindle to continue reading at home.

Hopefully this latest version will arrive before the next round of books that we’ve been waiting for. I’m waiting for The Evolutionary Void, and we’re very happy that Mockingjay will also be released in a Kindle version.

I was happy to hear that Bezos was focusing on creating the best book reader instead of chasing yet another tablet.

Trusting Facebook

There’s been a lot of discussion on the web over the last several weeks about how much trust we can put in Facebook when it comes to handling private data. They’re making a play to be the primary repository of identity on the web: the hub that other web sites link off of to determine personal connections and demographics for a user. Facebook already has a huge lead in this area, with 400+ million users who are using actual names instead of screen names.

It would be nice to have a place where we can set up who our friends and coworkers are as well as what we’d like to share with them. Having to recreate this network each time we want to use a new site is a complete pain. But who can we trust to store this valuable info?

Leo Laporte has mentioned in his podcasts that for some reason he just doesn’t trust Facebook as much as he would trust a company like Google to fill this role. I agree wholeheartedly that it’s risky to trust Facebook. For me, the core of this mistrust is that Facebook hasn’t yet found its truly profitable niche the way Google has. Google makes so much money in search ads that it can afford to not make money in other areas and to take the high ground when it comes to privacy and openness. Facebook doesn’t have this profit center yet to support the other areas of its business. The scary part is that the data that Facebook collects could be quite valuable. It’s really going to come down to where they decide to draw the line on how to use our data. And because this is still unknown and they’ve made several missteps in the past, it’s difficult to really trust Facebook.