Had a brainfart moment the other day while working on cloudgrove.com and trying to figure out cache hit rates for objects. When you’re using Hibernate, the second-level cache is, by default, only used when you fetch an object by primary key. For any other query it has to go back to the database to make sure the selection criteria still match. Hibernate can only guarantee an object is current, and return it from cache, when you fetch it by its unique identifier.
Keeping this in mind changes a few things about how I’m laying out the object structure. I had a score object that relates to both a content object and a user object. Because Hibernate seriously discourages compound keys, I initially went with a surrogate primary key that had nothing to do with the other two items. However, the main use case for the score object is selecting the score by (contentId, userId), so I’m going to change the object: it keeps those two fields, but its primary key becomes a concatenation of the two. This way I can fetch the object by id through Hibernate and use the second-level cache, instead of doing a select by criteria, which always hits the database.
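Here is a rough sketch of what I mean by a concatenated key, in plain Java with no Hibernate on the classpath. The `Score` class, its field names, the `key` helper, and the `:` separator are all made up for illustration; the `HashMap` is only a stand-in for a cache keyed by primary key. The separator matters: without it, (1, 23) and (12, 3) would both concatenate to "123".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity: class and field names are assumptions for illustration.
class Score {
    final String id;        // derived primary key, e.g. "42:7"
    final long contentId;
    final long userId;
    final int value;

    Score(long contentId, long userId, int value) {
        this.id = key(contentId, userId);
        this.contentId = contentId;
        this.userId = userId;
        this.value = value;
    }

    // Build the primary key from the two ids. The ':' separator keeps
    // (1, 23) and (12, 3) from collapsing into the same string.
    static String key(long contentId, long userId) {
        return contentId + ":" + userId;
    }
}

class ScoreLookupDemo {
    public static void main(String[] args) {
        // Stand-in for a cache keyed by primary key (not Hibernate itself).
        Map<String, Score> cacheById = new HashMap<>();

        Score saved = new Score(42L, 7L, 5);
        cacheById.put(saved.id, saved);

        // Lookup by (contentId, userId) becomes a lookup by id.
        Score hit = cacheById.get(Score.key(42L, 7L));
        System.out.println(hit.id + " -> " + hit.value); // prints "42:7 -> 5"
    }
}
```

With a real mapping, the equivalent lookup would presumably be something like `session.get(Score.class, Score.key(contentId, userId))`, assuming `id` is mapped as an assigned string identifier and the entity is configured as cacheable.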
Over at Flickr, they had a little downtime and decided to post some numbers about how impressive they are while they’re down.
- 12,000 photos a second at peak times
- 2,070,075 photos in 24 hours!
- 8.5 million registered members
- 10 million unique tags
Some pretty impressive numbers, but nothing too crazy compared to the several other large photo sharing sites. I was discussing what sets photo sharing sites apart with a few people the other day, and being able to communicate with photos is definitely a draw. Flickr stumbled across an easy way to accomplish this with tags very early on. Other sites have experimented with ideas, but it’s a difficult question how to get people to interact with photos without trying to replicate all of the features of a pure social networking site. Going down the social networking path dilutes your strength as a photo site, but tags allow for interaction and expression while keeping the focus on the photo.
It also just goes to show that even large sites have downtime sometimes. No matter how much money you spend on hardware, there are always going to be single points of failure that are not hot-swappable. When you’re running a small startup, focus on the product first, scalability second, fault tolerance third. When you’re building things up from nothing, you just can’t devote that much energy to making things bulletproof. There just isn’t the payoff in the end. If the product’s not there, it doesn’t matter that you have five nines of uptime.