Internet Search Comes Full Circle

Friday, 18 February 2005

Back in 1996, when I first used the internet, Yahoo! was the web. To get anywhere, I would start at Yahoo!’s front page. Yahoo! started out as a website containing links to other websites, classified by hand under each site’s most important attributes. This was, obviously, labour-intensive and wouldn’t scale well. Other websites tried to copy this model, and thus sprang up many ‘portals’. The search results of the day were no more than simple string matches. Most of the time, the results had nothing to do with the topic we were searching for.
The efficacy of the search engines of the day was limited by two factors:
1. Indexing of web pages by humans could not keep up with the explosion of new websites.
2. Search was crude and used no human ‘intelligence’.
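
To make the second point concrete, here is a minimal sketch in Python of what a "simple string match" search amounts to. The page titles and texts are invented for illustration, and `naive_search` is a hypothetical helper, not any real engine's code.

```python
def naive_search(pages, query):
    """Return titles of pages whose text contains the query as a raw substring."""
    q = query.lower()
    return [title for title, text in pages.items() if q in text.lower()]


# Invented sample pages: only the first is about theatre.
pages = {
    "Theatre in Karnataka": "A history of Kannada plays and playwrights.",
    "Football tactics": "Set plays and counter-attacking plays explained.",
    "Media player help": "The player plays most audio formats.",
}

# Every page contains the string, so every page "matches".
print(naive_search(pages, "plays"))
```

All three pages come back for the query "plays", because the matcher has no notion of which page is actually about the subject.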

Google came onto the scene with the simple intention of making search easy and useful. They solved both of the above problems by:
1. Indexing every possible page out there (they have one of the most awesome hardware setups in the world).
2. Using the ‘human’ intelligence contained in the pages themselves to improve the search results.

So, what is this human intelligence we are talking about?
Every time we create a page, say “Kannada Plays”, we link to various other pages that contain information related to the topic at hand. By doing this we increase the information contained in our web page, and we also affirm our faith in the linked page as another source of useful information. Google uses this idea to “rank” websites: a page that many other pages link to is treated as more popular, and so ranks higher.
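
The "links as votes" idea can be sketched in a few lines of Python. This is a simplified textbook-style link-ranking iteration, not Google's actual system; the damping value of 0.85, the iteration count, and the page names are all assumptions made for illustration.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by repeatedly passing each page's score along its outgoing links."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over everyone.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: page -> pages it links to.
links = {
    "kannada-plays": ["theatre-history", "playwrights"],
    "theatre-history": ["playwrights"],
    "playwrights": ["theatre-history"],
    "unlinked-page": ["kannada-plays"],
}

scores = rank_pages(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(f"{page}: {scores[page]:.3f}")
```

In this toy graph the page nobody links to ends up with the lowest score, while the two pages that receive the most links rise to the top.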

Google’s approach to search, in a way, helped make the internet a better place. At least until the ‘90%’ caught on. Google liked well-structured documents, well-linked documents, clean URLs and pages that tried to be standards-compliant. A lot of Search Engine Optimisation (SEO) companies sprang up to help websites gain visibility on Google, and they made a lot of money doing it. Wherever there is money, greed soon follows. That is manifesting today in the form of blog spam, trackback spam and, of late, 302 page hijack scams. Google is very competitive, and the search game is getting hotter every day with Yahoo! and MSN upping the ante. Search is serious business today, with billions of dollars in revenue for both search companies and advertisers. In the internet age, losing visibility is death. So people will do anything to increase their presence (PageRank?). Where does this leave the ordinary user who wants to find genuine information? Is there a way out of this seemingly endless cat-and-mouse game? Is there a better way to find useful information than Google, Yahoo! or MSN?

Yes, and that is del.icio.us. This site is a masterful execution of one simple idea: people like to bookmark what they like, and if there is a way to know what other people are reading on the internet on the same subject, we increase the ‘intelligence’ of the system as a whole. There are two basic ways you can use del.icio.us.

1. Use it to store useful links by assigning them ‘tags’. This lets you recall any of those pages just by typing http://del.icio.us/tag/subject . It eliminates any need to store the web pages on your hard disk, and there is no worry about having to write down the URL for future reference. This is called ‘tag surfing’.

2. Look up people with similar interests. If you notice that 10 other people have bookmarked the same page on an esoteric subject as you, you might want to take a look at what else they are reading. It’s akin to reading their ‘minds’ without even having to know them! [I agree, that is a little paranoid :)]
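
Both uses can be sketched with a small in-memory store. This is only an illustration of the idea, not how del.icio.us is actually built; the `BookmarkStore` class, the user names and the example.org URLs are all hypothetical.

```python
from collections import defaultdict


class BookmarkStore:
    """Toy bookmark store: look up links by tag, and users by shared links."""

    def __init__(self):
        self.by_tag = defaultdict(set)   # tag  -> URLs carrying that tag
        self.by_url = defaultdict(set)   # URL  -> users who bookmarked it
        self.by_user = defaultdict(set)  # user -> URLs they bookmarked

    def add(self, user, url, tags):
        for tag in tags:
            self.by_tag[tag].add(url)
        self.by_url[url].add(user)
        self.by_user[user].add(url)

    def tagged(self, tag):
        """Use 1: recall every link filed under a tag."""
        return sorted(self.by_tag.get(tag, set()))

    def kindred_users(self, user):
        """Use 2: other users, ordered by how many bookmarks they share with `user`."""
        overlap = defaultdict(int)
        for url in self.by_user.get(user, set()):
            for other in self.by_url[url]:
                if other != user:
                    overlap[other] += 1
        return sorted(overlap.items(), key=lambda item: (-item[1], item[0]))


store = BookmarkStore()
store.add("asha", "http://example.org/yakshagana", ["kannada", "theatre"])
store.add("asha", "http://example.org/rangashankara", ["theatre"])
store.add("ravi", "http://example.org/yakshagana", ["folk-art"])
store.add("ravi", "http://example.org/rangashankara", ["theatre"])
store.add("mira", "http://example.org/yakshagana", ["theatre"])

print(store.tagged("theatre"))
print(store.kindred_users("asha"))
```

Note that nobody has to agree on a classification up front: each person tags for their own recall, and the overlap between people falls out of the data.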

Today, when I want to learn more about a particular topic, I still use Google. But the aha! finds are usually from del.icio.us. Does that mean del.icio.us is the answer to our search woes? It seems so for the foreseeable future. In the long run? Well, as John Maynard Keynes put it, “In the long run we are all dead.” So, no: the popularity and success of del.icio.us will see ‘link spam’ become a problem. But, again, the solution lies with us and not the system.

In the real world, we trust information from people we know. In future, when link spam becomes a serious issue, we may limit our tag surfing to our friends (real and virtual) and their friends. The concept of friends of friends will keep some areas of the ‘semantic web’ hygienic and useful.
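
As a rough sketch of what "friends and their friends" could mean in practice, the Python below walks a friendship graph two hops out and keeps only bookmarks made inside that circle. The function names, the two-hop limit and the sample data are assumptions for illustration; nothing here describes an existing del.icio.us feature.

```python
from collections import deque


def trusted_circle(friends, user, max_hops=2):
    """Everyone reachable from `user` within `max_hops` friendship links."""
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not look beyond friends of friends
        for friend in friends.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    seen.discard(user)
    return seen


def trusted_links(bookmarks, friends, user, tag):
    """Links carrying `tag` that were bookmarked by someone in the user's circle."""
    circle = trusted_circle(friends, user)
    return sorted({url for who, url, tags in bookmarks if who in circle and tag in tags})


# Hypothetical data: "stranger" is three hops away from "me".
friends = {"me": ["asha"], "asha": ["ravi"], "ravi": ["stranger"]}
bookmarks = [
    ("asha", "http://example.org/a", {"theatre"}),
    ("ravi", "http://example.org/b", {"theatre"}),
    ("stranger", "http://example.org/spam", {"theatre"}),
]

print(trusted_links(bookmarks, friends, "me", "theatre"))
```

The spammer can tag as many links as he likes; unless someone in the circle befriends him, his bookmarks never show up.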

Human intelligence has made a comeback in the search game. Interestingly, the technology behind del.icio.us uses the age-old idea of social networking to find things faster. It’s all in the ‘connections’, you know.

Article from circa 2005-06. Exact date lost