The visible Web is what search engines can potentially index.
In a 2005 study, Gulli and Signorini (Gulli, A./Signorini, A. (2005). The Indexable Web is More than 11.5 billion pages.) estimated how much of the indexable web is covered by general search engines such as Google, Yahoo and Microsoft:
Estimation of the indexable web per search engine
Google seems to index three quarters of the indexable web but misses more than 30% of the web pages indexed by others such as Yahoo and Microsoft.
Gulli and Signorini also gave an interesting estimate of the share of the index that those three search engines have in common: less than 30%.
Here again we see clearly why it matters to use several search engines.
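To make the two measures above concrete, here is a minimal sketch of how coverage and overlap could be computed if each engine's index were available as a set of URLs. The engine names, the page identifiers and the function name coverage_report are made up for illustration; they are not data or methods from the Gulli and Signorini study.

```python
def coverage_report(indexes):
    """Return (coverage per engine, share of pages common to all engines).

    indexes maps an engine name to the set of pages it has indexed.
    Both values are fractions of the union of all indexes, which stands
    in here for the "indexable web" (an assumption of this sketch).
    """
    if not indexes:
        raise ValueError("at least one index is required")
    union = set().union(*indexes.values())
    if not union:
        raise ValueError("the indexes contain no pages")
    common = set.intersection(*indexes.values())
    coverage = {name: len(pages) / len(union) for name, pages in indexes.items()}
    return coverage, len(common) / len(union)


# Hypothetical toy indexes over ten pages, p1 to p10.
toy_indexes = {
    "engine_a": {"p1", "p2", "p3", "p4", "p5", "p6", "p7"},
    "engine_b": {"p3", "p4", "p5", "p6", "p7", "p8", "p9"},
    "engine_c": {"p1", "p2", "p5", "p6", "p9", "p10"},
}

coverage, common_share = coverage_report(toy_indexes)
for name, share in sorted(coverage.items()):
    print(f"{name}: {share:.0%} of the union")
print(f"indexed by all engines: {common_share:.0%}")
```

In this toy case every engine covers well over half of the union, yet only a small part of it is indexed by all three, which is the same pattern the study reports.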
Moreover, this study was carried out on American technologies and therefore does not consider language aspects. One study on this topic, titled "Search Engine Coverage Bias: Evidence and Possible Causes" (Vaughan, L./Thelwall, M. (2003). Search Engine Coverage Bias: Evidence and Possible Causes.), set out to discover whether general search engines such as Google cover the content of American websites in the same way as that of foreign websites.
The results of this study show the dominance of American websites:
Distribution of Public Web Sites By Country in 2002
(Online Computer Library Center. (2002). Trends in the evolution of the Public Web 1998-2002.)
In 2002 a large majority of websites were American. Because most search engines base their algorithms on the number of links pointing to a page, American websites were covered far better than foreign ones. Moreover, over time, old American websites keep their leading position in the distribution of websites.
The study goes further, giving the percentage of websites covered by Google in each country.
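One plausible way to picture this effect is a crawler that discovers pages only by following links from pages it already knows. The sketch below is an illustration under that assumption, not a description of how Google or any other engine actually works; the link graph, the page names and the function crawl are hypothetical.

```python
from collections import deque


def crawl(link_graph, seeds):
    """Return every page reachable from the seeds by following links.

    link_graph maps a page to the list of pages it links to.
    Pages that nothing links to are never discovered unless they are seeds.
    """
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical toy web: the American pages link to each other heavily,
# while two foreign pages only link to one another.
toy_web = {
    "us-portal": ["us-news", "us-shop"],
    "us-news": ["us-shop", "foreign-blog"],
    "us-shop": ["us-portal"],
    "foreign-blog": [],
    "foreign-forum": ["foreign-shop"],
    "foreign-shop": ["foreign-forum"],
}

indexed = crawl(toy_web, ["us-portal"])
missed = set(toy_web) - indexed
print("indexed:", sorted(indexed))
print("missed:", sorted(missed))
```

The foreign pages that receive no link from the already indexed part of the web stay invisible to this crawler, whatever their language or quality.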
Percentage of Web Sites Covered by Google in 2002
Language does not seem to be the problem here, because most websites in Singapore are in English and are still not covered properly by Google. Websites from Singapore may simply not have enough links pointing to their pages, and as a result they are not covered as well as American ones.
This shows the importance, for some countries, of using different (local) search engines that know a specific market better.
Using purely national services seems appropriate here for two reasons:
- they have more experience in indexing the websites of their country;
- they give less importance to American websites.
The risk here is that search engine users assume a single search engine can cover the whole web by itself.
"Believing you can find everything and anything online is unrealistic" (Friedman, B. G. (2004). Web Search Savvy. P.20).
Search engines use different technologies, and each has gained experience in particular fields that others have not. It is therefore important to take this into account when doing research.