Google Email Updates Update

Looks like there’s been an update to the Google Email Update application. I’m now getting about 20x the volume for a series of terms that I track, and the system seems much more responsive to actual updates on the web. In the past I would actually question whether the system was working. I’m curious whether this change has anything to do with updates to, or integration with, Google’s Blog Search.

Google and YouTube, a marriage of necessity

It would appear that Google paid a huge price for a small company that doesn’t make any money. It makes you wonder what’s going on here.
There were some hints that YouTube had technology in the works for detecting copyrighted material, but I’m sure that’s not worth $1.6B.
YouTube has a huge user base, but that’s because it’s the easiest service to use right now, which is no guarantee of future performance.
Google purchasing YouTube doesn’t make any sense unless you compare it to the situation where a competitor purchases it instead. Google already has their own video product, but it’s nowhere near as easy to use as YouTube’s. Google is not paying $1.6B to get the extra video capacity, but to keep it out of the hands of Yahoo, Microsoft, and AOL. It probably would have been cheaper for Google if YouTube had gone bust and just disappeared, but they couldn’t take that chance as it became more apparent that someone was going to pony up some money for the users.

Webapp or Thick Client?

This question seems to be one of the leading questions in software development right now: whether to go with the classic software style of a thick client that is downloaded onto the customer’s computer, or to go with a webapp that is hosted centrally.
With the launch of Google’s Docs and Spreadsheets the competing methodologies are laid out clearly. Microsoft has Office, the epitome of fat clients, while Google has their webapps. These are not only different application styles, but different philosophies entirely. Microsoft offers every feature a document-writing user could ever want. You could spend years digging into every little nook and cranny in Office and still not use everything that it has to offer. Meanwhile, Google is taking a tack that has been pushed by a new breed of designers like 37signals. This philosophy is to offer just what most people need to get most of their work done and focus on making that subset as strong as possible. By keeping things simple you actually create a more usable product.
The reality is that there is room for both philosophies, as users have different needs. What will be interesting to see is whether Google can accomplish their goal with the webapp. Web-based applications have a lot of advantages over fat clients: maintenance, support, and upgrades are all much simpler when working with a centralized application. The problem has always been how much you can make a browser do and whether the browser will behave. IE, Firefox, and Safari all behave very differently in certain situations, and writing an application that works on all of these platforms can be difficult.
One thing that Google has going for it is that there is a whole lot of work and excitement happening in browser-based apps right now, while there just isn’t anything new happening with fat clients. The network is a more important part of the computer every day, and available bandwidth has become the limiting factor now that the cheapest entry-level computer purchased at Walmart can handle most any user’s tasks.
Another advantage is that as the browser becomes more of a standard way to run applications, bits of functionality can be developed to be shared across different applications. The whole mashup idea is still a bit ahead of its time, as early implementations were based on APIs that weren’t really meant to be used together. Companies were exposing bits of functionality without really thinking things through. Others were building applications on these APIs without concern for them being revoked at some point in the future. The relationships were often to only one party’s advantage; everyone was just experimenting to try it out. Now that the idea has settled down a bit and business models can be built around the APIs, they’re maturing to the point where they can be safely used.
The product that I’m currently working on, YOUnite, actually splits the difference between the two philosophies. We’ve created what we call a Webtop application: a downloaded client that runs an embedded application server on the local desktop. The user then interacts with the application through their browser. There were several technical reasons to go this route, but it remains to be seen whether we’re ahead of the game with this type of application or creating more trouble than it’s worth.
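To make the Webtop idea concrete, here is a minimal sketch of the pattern using Python’s standard library. It is purely illustrative and not YOUnite’s actual implementation: a tiny HTTP server embedded in the desktop client serves the UI to the user’s browser on localhost.

```python
# Minimal sketch of the "Webtop" pattern: an embedded local server that
# the user drives through their browser. Purely illustrative -- not
# YOUnite's actual stack. The port is arbitrary.
import webbrowser
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebtopHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the application UI; a real client would dispatch to
        # application logic and local storage here.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(b"<html><body><h1>Local Webtop app</h1></body></html>")

if __name__ == "__main__":
    server = HTTPServer(("127.0.0.1", 8080), WebtopHandler)  # bind to localhost only
    webbrowser.open("http://127.0.0.1:8080/")                # open the UI in the default browser
    server.serve_forever()
```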

Personalization

Netflix has an interesting little contest going on. The goal is to improve their Cinematch software for making movie recommendations by 10%. The prize for winning the contest is $1M.
I’ve been working on personalization software for a while now with my CloudGrove personal search engine. Now I’m looking forward to taking a look at the Netflix setup and playing around with some more data.
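The contest scores submissions on prediction error (RMSE), so even a crude baseline gives you something to improve on. Here is a minimal sketch of a bias-based predictor over made-up ratings; the data layout is invented for illustration and is not the actual Netflix Prize format.

```python
# A minimal rating-prediction baseline: global mean plus average user and
# movie offsets. The ratings below are invented for illustration only.
from collections import defaultdict

ratings = [  # (user, movie, rating)
    ("alice", "heat", 5), ("alice", "clue", 3),
    ("bob", "heat", 4), ("bob", "clue", 2), ("bob", "antz", 1),
]

global_mean = sum(r for _, _, r in ratings) / len(ratings)

user_dev, movie_dev = defaultdict(list), defaultdict(list)
for user, movie, r in ratings:
    user_dev[user].append(r - global_mean)
    movie_dev[movie].append(r - global_mean)

def predict(user, movie):
    """Predict a rating as global mean + average user offset + average movie offset."""
    u = sum(user_dev[user]) / len(user_dev[user]) if user_dev[user] else 0.0
    m = sum(movie_dev[movie]) / len(movie_dev[movie]) if movie_dev[movie] else 0.0
    return global_mean + u + m

print(predict("alice", "antz"))  # -> 2.0
```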

How movie downloads should work

Over the last year, I’ve definitely bought into the Apple product offerings hook, line, and sinker. The iPod dragged me in. Then I purchased an iBook, next an iMac for the home, and now I’ve just upgraded to a MacBook. I hate DRM, but for right now the Apple iTunes DRM is permissive enough that it doesn’t stop me from doing anything that I would want to do. With the release of movies and TV shows on iTunes there’s a whole host of new opportunities and problems.
The iMac that I use at home to manage all of our digital content has 100GB of disk. My wife and I sync our iPods against this with a collection of about 30GB of music. There is already some concern in the back of my mind about what would happen if the disk in the iMac died a sudden horrible death, as disk drives have a tendency to do. To counter this threat, I have another server in the garage that I regularly back up to. Of course this solution doesn’t cover a house fire or other catastrophe, and it’s already more involved than the average computer user is going to get.
The problem is magnified even more with movies. Looking at iTunes, it appears that the average movie size is 1.5-2GB. This would mean that I could buy ~40 movies. Beyond this point I’ve got to start looking into better ways to store the files, i.e. a home terabyte file server. The main issue is that since I’ve purchased the file I can’t really delete it. Deleting or losing the file is the same as throwing $12.99 away. It appears that iTunes allows you to burn the files to disc to back them up. So, you could pay $12.99 for the movie, plus the cost of a blank DVD, plus the time it takes to burn each movie. Or, you could just buy the DVD off of Amazon and stick it in the DVD player. iTunes isn’t really winning the argument here, especially since it takes 2+ hours to download in the first place.
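Spelling out the back-of-the-envelope math behind that ~40 figure, using the numbers from this post:

```python
# Rough storage math for iTunes movie purchases (figures from this post).
disk_gb = 100          # iMac disk
music_gb = 30          # existing music library
movie_gb = 1.75        # midpoint of the 1.5-2GB average movie size

free_gb = disk_gb - music_gb
movies_that_fit = free_gb // movie_gb
print(movies_that_fit)  # -> 40.0
```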
The better option is to rent the movies. You don’t watch movies over and over again the way you listen to music anyway. My proposal is a partnership between TiVo and Netflix; I had heard rumors that they were talking. TiVo already handles DRM, making it difficult for you to take the files and upload them to the internet. Netflix has the library and the low monthly charge. What you would do is manage your Netflix queue, and then instead of mailing DVDs, your TiVo would download 4 movies in the middle of the night. When you’re finished watching a movie you just delete it off the TiVo. Once you delete the movie the TiVo starts to download the next one. This would provide all of the benefits with little of the downside. Netflix’s operations would be vastly simplified. Consumers don’t have to care about storing and backing up massive digital files. TiVo cements its role as the gateway to your TV. Everyone but Microsoft and Apple is happy.
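To sketch how that hand-off might work (entirely hypothetical; neither company works this way today): keep a fixed number of downloaded movies on the box and refill from the subscriber’s queue overnight whenever one is deleted.

```python
# Hypothetical sketch of the proposed Netflix/TiVo hand-off: keep up to
# four downloaded movies on the box, replacing each deleted one with the
# next title from the subscriber's queue. Names and flow are invented.
from collections import deque

MAX_ON_BOX = 4

queue = deque(["Heat", "Clue", "Antz", "Brazil", "Alien", "Tron"])
on_box = []

def fill_overnight():
    """Run during the nightly download window to top the box back up."""
    while len(on_box) < MAX_ON_BOX and queue:
        title = queue.popleft()
        on_box.append(title)          # stand-in for the actual download
        print(f"downloaded {title}")

def delete_after_watching(title):
    """Viewer deletes a finished movie; the slot is refilled that night."""
    on_box.remove(title)

fill_overnight()
delete_after_watching("Heat")
fill_overnight()
print(on_box)  # -> ['Clue', 'Antz', 'Brazil', 'Alien']
```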

Produce something

I was watching a show on NPR this weekend where they had a little spotlight on Dean Kamen, the creator of the Segway. The segment showed him at work in his own little workshop, tinkering with different projects. Of course the workshop that he has in his house is a little different than what anyone else might have. But the view of his workshop, as well as a bit about his work with kids, developing programs to get them interested in science and engineering, got me thinking. I really liked his take on the world: that our society has taken the ingenuity out of the average person. The average person is too happy to just sit back and have things fed to them.
All too often we focus too much on consuming: taking in what corporations produce, paying for goods and media. With the rise of service industries the average person doesn’t really produce anything for themselves anymore. Not that I’m advocating we should go back to making everything for ourselves from scratch, but tinkering in an area that you find interesting and trying to make something better has a lot of benefits.
Far too many people don’t have the confidence in themselves that they could make something that someone else would want. There’s a whole lot of satisfaction to be had in having other people enjoy something that you’ve made. People are also afraid to start a project because they might not be able to complete it. I think this attitude in particular holds a lot of people back from doing really interesting work. It’s not often that a product is created exactly as it was envisioned. There are usually a whole lot of tweaks and changes that need to occur before you get to a successful finish. The biggest obstacle to this whole problem is just starting.
Dean Kamen’s programs aim to combat these attitudes by getting kids doing things for themselves and finding that sometimes they can do it better.

Problems with Feedburner

With my personal search engine CloudGrove, I’ve come across an interesting problem with FeedBurner. There are quite a few major sites that use FeedBurner to manage their RSS feeds, but every so often they will change the URL for the posts in the feed, usually by adding a “.” to the end. The problem with this is that I’m currently keying off of the URL for the posts as a unique identifier, so when FeedBurner changes the URL I get duplicates.
This is an easy enough problem to solve in the short run, but it concerns me a bit. What makes a post unique? Many sites will append params to the URL to indicate how a user arrived at the site, such as ?source=rss. However, almost all sites use params in the URL in some form or another, so simply removing all params is immediately out for determining unique posts. I’ve seen in the Google documentation for some of their products that they remove session id params and other such transient pieces. I’m curious how they go about this. Are they just removing things with common names for sessions and cache busting?
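One possible approach, sketched below purely as an illustration (not what CloudGrove actually does): normalize the URL before using it as a key, stripping the trailing dot and a blacklist of parameter names commonly used for referral tracking and session state. The parameter names are just guesses at common conventions, not anything from Google’s documentation.

```python
# Sketch of URL normalization for deduplicating feed items: strip a
# trailing dot and drop params commonly used for tracking or sessions.
# The param blacklist is a guess at common conventions.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

STRIP_PARAMS = {"source", "utm_source", "ref", "sessionid", "sid", "phpsessid"}

def normalize_post_url(url):
    """Produce a canonical form of a post URL to use as a unique key."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    path = path.rstrip(".")                       # FeedBurner's trailing-dot variant
    kept = [(k, v) for k, v in parse_qsl(query)
            if k.lower() not in STRIP_PARAMS]     # drop tracking/session params
    return urlunsplit((scheme, netloc.lower(), path, urlencode(kept), ""))

print(normalize_post_url("http://Example.com/post/123.?source=rss"))
# -> http://example.com/post/123
```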
I realize that Google goes much more in depth in comparing the similarity of pages. They have been fighting a long, hard battle against link farms, and rooting out identical pages is a key part of this. My intention with CloudGrove, though, is to focus more on quality content sources and bypass the spam issues.
–Update–
After doing some digging and going back through the logs, it appears that the issue with the FeedBurner feeds was limited to Tue, 11 Jul 2006. Since then, however, they’ve changed the format for the URLs in their feeds. They made an attempt to make a clean cut between the old format and the new, but there was some leakage that also resulted in duplicates.

How much documentation?

Working for a startup, there has been some push and pull over how much documentation is the right amount.
Over-documentation is often a problem at large companies. So much process is required to get the most basic things done that it’s next to impossible to get anything out of the ordinary accomplished. This slows the pace of development and creativity. The company I work for is not a big company, and there just aren’t enough resources to thoroughly document everything even if we wanted to. Even if we had the resources, over-documentation would be the beginning of a slow and painful death for the company.
The other end of the spectrum is complete seat-of-the-pants, no-documentation development. This is more often the case with a startup; there just isn’t time to write everything down. Why bother when you could be doing instead of writing about doing? This can lead to confusion and chaos, though, if there isn’t strong leadership to keep everyone moving in the same direction.
My theory on the matter is what I’ll call the “One Week Meeting Rule”. The requirement is that a group should be able to sit down, discuss plans for work, and have everyone come to agreement. Then, one week later, sit down and test whether everyone still agrees on the same things. This isn’t about whether changes have been made and agreed to in the interim, but whether what everyone thought they had agreed to is still in sync after a set period of time. As people go about their business during the week and things are discussed, is everyone still moving in the same direction at the end of the week, or is it more of a Brownian motion effect?
The makeup of the group will have a very large impact on how much documentation is required. If a group can’t pass this simple little test, it’s almost impossible for people to work together. If the basics can’t be agreed upon, then documentation is required to cement what those agreements were and reinforce the decisions made.

Getting things moving

So, I was finally able to convince the little company that I work for that it would be advantageous to get everyone together in an office. Since moving into the office two weeks ago, the pace of development has increased tenfold. Having everyone within shouting distance makes all of the little questions that could bring development to a halt go away. It was the best move we could make at this time.
There has been a lot of discussion about the ability of information workers to be geographically dispersed. I think that can work with the right setup, but if there isn’t very strong management and separation of responsibilities, it can cause things to grind to a halt.
In the early stages of a startup when there is a whole lot of collaboration going on, you just can’t beat getting everyone in the same room.

Turning the Telcos Around

There’s an excellent article at Techdirt about the latest threats from the Telcos to charge for a tiered internet. You just have to ask: if you had the choice between two DSL or cable companies, and one had Google and the other didn’t, which one would you choose? All it takes is a couple of the top-tier internet companies to call the bluff and this whole thing falls apart.
Nobody wants this to turn into a shooting war, but the Telcos have to realize that what they have is of no value without the rest of the internet.