Google steps into overdrive with Caffeine

Earlier this week Google announced the roll-out of its new search infrastructure, named “Caffeine”. Google claims that it provides “50 percent fresher results” than its previous index.

Caffeine has been a long time coming. Google rolled out a public test in August last year against one of its data centers, finishing its testing in November 2009. By all accounts, it is now expected to have been rolled out worldwide to all data centers.

As with any search engine, when you perform a search on Google you are not actually searching the live web but a snapshot of the internet known as the index. The index, “like the list in the back of a book, helps you pinpoint exactly the information you need.”
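To make the "back of a book" comparison concrete, here is a minimal sketch of a toy index in Python. It is only an illustration of the general idea, not how Google builds its index; the page URLs, their text and the function names `build_index` and `search` are all made up for the example.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the pages for the first word, then narrow down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical snapshot of three pages (invented for this example).
pages = {
    "example.com/coffee": "caffeine keeps coffee drinkers awake",
    "example.com/search": "google search index caffeine",
    "example.com/tea": "tea has less caffeine than coffee",
}
index = build_index(pages)
```

Calling `search(index, "caffeine coffee")` looks the words up in the prepared index rather than reading every page again, which is the same shortcut a book's index gives a reader.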

The previous search index was broken into a number of layers, with its main layer taking in the region of two weeks to refresh. Refreshing a layer meant analysing the entire stored snapshot of the web (the index) before those pages were made available to be searched against.

Due to the ever-increasing size of the internet and the growth of content including images, videos and real-time updates (such as micro blogging), Google decided to improve its infrastructure to meet the demand for more current and relevant search results.

As opposed to its previous method of analysing the entire web before refreshing its top layer, Caffeine continuously updates its index by analysing smaller portions of the web, meaning that we can “find fresher information than ever before — no matter when or where it was published”.
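The difference between the two approaches can be sketched with a toy model in Python. This is an assumption-laden illustration, not Google's implementation: the class names `BatchIndex` and `IncrementalIndex` and their methods are hypothetical, and a real index is vastly more complex.

```python
from collections import defaultdict


class BatchIndex:
    """Old style: crawled pages only become searchable after a full refresh."""

    def __init__(self):
        self.pages = {}
        self.index = {}

    def crawl(self, url, text):
        # The page is stored, but the searchable index is not touched yet.
        self.pages[url] = text

    def refresh(self):
        # Re-analyse every stored page and swap in a brand new index.
        index = defaultdict(set)
        for url, text in self.pages.items():
            for word in set(text.lower().split()):
                index[word].add(url)
        self.index = index

    def search(self, word):
        return set(self.index.get(word.lower(), set()))


class IncrementalIndex:
    """Caffeine style (in spirit): each page is indexed as soon as it is crawled."""

    def __init__(self):
        self.pages = {}
        self.index = defaultdict(set)

    def crawl(self, url, text):
        # Drop the entries left over from the previous version of this page.
        for word in set(self.pages.get(url, "").lower().split()):
            self.index[word].discard(url)
        self.pages[url] = text
        # Index only this one page, leaving the rest of the index alone.
        for word in set(text.lower().split()):
            self.index[word].add(url)

    def search(self, word):
        return set(self.index.get(word.lower(), set()))
```

In the batch version a freshly crawled page stays invisible until `refresh()` has reprocessed everything, whereas the incremental version makes it searchable straight away because only that page's entries are touched.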