Site Monitoring

Search Engine History

This is the first in a set of pages about Internet search engines.

In the early days of the Internet everything worked on the basis of trust. Web sites were not particularly commercialized and were written by technical people rather than the marketing department. In most cases it was possible to trust the web page developer to add appropriate descriptive META tags to the header of each web page. All the early search engines had to do was read these META tags in the header and build an index based on the keywords they found there.
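To make this concrete, here is a minimal sketch (in Python, not the code of any real engine) of how a crawler could pull keywords straight out of a page's META tags; the page content and keywords are invented for illustration.

    # Minimal sketch: extract the comma-separated keywords from
    # <meta name="keywords" ...> in a page's header.
    from html.parser import HTMLParser

    class MetaKeywordParser(HTMLParser):
        def __init__(self):
            super().__init__()
            self.keywords = []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
                content = attrs.get("content") or ""
                self.keywords.extend(
                    k.strip().lower() for k in content.split(",") if k.strip()
                )

    # Invented example page header
    page_header = """
    <html><head>
      <title>Example Widgets Ltd</title>
      <meta name="keywords" content="widgets, blue widgets, widget repair">
      <meta name="description" content="We make and repair widgets.">
    </head><body>...</body></html>
    """

    parser = MetaKeywordParser()
    parser.feed(page_header)
    print(parser.keywords)   # ['widgets', 'blue widgets', 'widget repair']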

The algorithm started off very simply: the search engine robot scanned each linked page on a web site, read the keywords in the header, and added the site to the list of sites for each keyword or phrase mentioned. All the engine had to do was keep a database keyed on the keyphrase, mapping it to the list of pages that declared it. When a user searched for a keyword, it could simply return the list of pages collected for that word.
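A toy sketch of that keyword-to-pages index might look like the following; the URLs and keyphrases are made up, and a real engine would of course persist this database rather than keep it in memory.

    # Toy inverted index: each keyphrase maps to the list of pages that declared it.
    from collections import defaultdict

    index = defaultdict(list)          # keyphrase -> list of page URLs

    def add_page(url, keywords):
        """Record a crawled page under every keyword found in its header."""
        for keyword in keywords:
            if url not in index[keyword]:
                index[keyword].append(url)

    def search(keyphrase):
        """Return the collected pages for an exact keyphrase, in crawl order."""
        return index.get(keyphrase, [])

    # Invented example pages
    add_page("http://example.com/widgets", ["widgets", "blue widgets"])
    add_page("http://example.org/repair",  ["widgets", "widget repair"])

    print(search("widgets"))         # pages in the order the robot found them
    print(search("widget repair"))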

As there were relatively few web sites, the order of results didn't matter that much; the relevant ones were likely to be in the first few anyway. So if you chose the right keyword you were ranked more or less 'randomly', for example according to the order in which the robot happened to visit the pages, or by some other equally arbitrary criterion. This meant there was a real incentive to adopt a keyword early, since getting there first gave a good chance of being near the top of the list. Perhaps equally bad, some early engines simply listed results alphabetically. Difficult to believe, but many download sites still use an alphabetical scheme rather than one based on merit.

The engines also took phrases extremely literally, so if you accidentally included a misspelling you would be listed under the misspelt version instead of the intended word. All this made search engine data collection quick and easy: the engine did not need to understand the structure of a web page; it just read and analyzed the short header and ignored the rest of the page.


The main frustration with early search engines was that sites were listed in no particular order. It was quite possible to outperform the 'big name' companies by careful choice of keywords; in the early days it even proved possible to rank ahead of Microsoft®'s own web site for the keyword 'Microsoft'! Many of the listed sites were also in a very poor state - incomplete, slow, and full of unavailable pages - and it was a frustrating business to find the site of real relevance.

The other development at this time was the creation of commercial 'sex' sites. These misused keywords to get listed for unrelated but common search terms; their keyword lists could contain hundreds of words so that they would appear on as many results pages as possible. So if you entered a keyphrase like 'Windows Update' you would find these sex sites high up in the list, simply because they had subverted the engine's trusting habit of basing inclusion purely on the page's own keywords.

Click the link to learn more about how search engines work now.


For more detailed information on search engine history please visit: History of Search Engines & Web History or A History of Search Engines.

Site Vigil is the product you need to simply and easily keep track of your web site on all the major search engines.