Proceedings of the AoIR-ASIST 2004 Workshop on Web Science Research Methods

Can scientific collaboration and excellence be measured by Web presence and Web links? [presentation]

Judit Bar-Ilan, The Hebrew University of Jerusalem and Bar-Ilan University

As soon as the Web appeared as an emerging information and communication medium, scientometricians became aware of the new opportunities it presented. Early works demonstrated the applicability of traditional bibliometric techniques to the Web, where documents were viewed as publications, hyperlinks as citations, and sites or domains as publication sources (e.g. Larson, 1996; Rousseau, 1997; Ingwersen, 1998). Soon afterwards, it was noticed that the results depend heavily on the coverage of the search engine or engines used for these calculations (Smith, 1999; Thelwall, 2000). To overcome this problem, Thelwall (2001a, 2001b) used his own crawler, rather than commercial search engines, to study the interlinkage between UK universities. For larger-scale studies that measure the visibility of a given site or publication based on non-academic and non-local links as well, we still have to rely on the results of commercial search engines.
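
The analogy described above (hyperlinks as citations, sites as publication sources) can be illustrated with a minimal sketch. It is not the method of any of the cited studies: the link data is hypothetical, the function names `site_of` and `count_inlinks` are invented for illustration, and a "site" is assumed to be simply the host name of a URL. The sketch counts inlinks per site and keeps links within a site separate from links arriving from other sites.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url):
    """Return the lowercased host name of a URL, used here as a stand-in for 'site'."""
    return (urlsplit(url).hostname or "").lower()


def count_inlinks(links):
    """Count inlinks per target site.

    `links` is an iterable of (source_url, target_url) pairs.
    Returns two Counters: links from other sites, and links within the same site.
    """
    external = Counter()
    self_links = Counter()
    for source_url, target_url in links:
        source, target = site_of(source_url), site_of(target_url)
        if not source or not target:
            continue  # skip pairs where a host name cannot be determined
        if source == target:
            self_links[target] += 1
        else:
            external[target] += 1
    return external, self_links


# Hypothetical link data, for illustration only.
links = [
    ("http://www.a.example/page1.html", "http://www.b.example/"),
    ("http://www.a.example/page2.html", "http://www.b.example/research.html"),
    ("http://www.b.example/index.html", "http://www.b.example/staff.html"),
    ("http://www.c.example/links.html", "http://www.a.example/"),
]
external, self_links = count_inlinks(links)
```

Counts of this kind depend entirely on which pages were collected, which is the coverage issue raised in the paragraph above.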

There are several problems related to the use of webometric measurements for the assessment of scientific collaboration and excellence:

  • hypertext links are not completely analogous with traditional citations (Egghe, 2000) and “self-sitations” are not similar to self-citations
  • the stability of the currently existing commercial search tools is not adequate (Mettrop & Nieuwenhuysen, 2001; Bar-Ilan, 2000, 2002)
  • the Web is an extremely dynamic medium (Koehler, 1999, 2002; Bar-Ilan & Peritz, 1999, to appear; Fetterly et al., 2003), thus measurements are heavily influenced by the exact date on which they are carried out
  • the influence of the commercialization of search and crawling (pay-for-placement, pay-for-inclusion) on the results is unclear
  • it is not clear whether inlinks to academic institutions measure scientific excellence or reputation (Thelwall, 2001c; Bar-Ilan, 2004)

Still, the Web cannot be ignored as a scientific information source. We must therefore continue to use and develop webometric indicators while taking into account the limitations of the medium and of the tools we use.


Bar-Ilan, J. (2000). Evaluating the stability of the search tools Hotbot and Snap: A case study. Online Information Review, 24(6), 430-450.

Bar-Ilan, J. (2002). Methods for measuring search engine performance over time. Journal of the American Society for Information Science and Technology, 54(3), 308-319.

Bar-Ilan, J. (2004). A microscopic link analysis of academic institutions within a country - the case of Israel. Scientometrics, 59(3), 391-403.

Bar-Ilan, J. and Peritz, B. C. (1999). The life span of a specific topic on the Web; the case of 'Informetrics': A quantitative analysis. Scientometrics, 46(3), 371-382.

Bar-Ilan, J. and Peritz, B. C. (to appear). Evolution, continuity and disappearance of documents on a specific topic on the Web - A longitudinal study of 'informetrics'. Journal of the American Society for Information Science and Technology.

Egghe, L. (2000). New informetric aspects of the Internet: Some reflections - many problems. Journal of Information Science, 26(5), 329-335.

Fetterly, D., Manasse, M., Najork, M., & Wiener, J. (2003). A large-scale study of the evolution of Web pages. In Proceedings of the 12th International World Wide Web Conference, May 2003, 669-678. Retrieved June 23, 2003, from http://www2003.org/cdrom/papers/refereed/p097/P97%20sources/p97-fetterly.html

Ingwersen, P. (1998). The calculation of Web Impact Factors. Journal of Documentation, 54(2), 236-243.

Koehler, W. (1999). An analysis of Web page and Web site constancy and permanence. Journal of the American Society for Information Science, 50(2), 162-180.

Koehler, W. (2002). Web page change and persistence - A four-year longitudinal study. Journal of the American Society for Information Science and Technology, 53(2), 162-171.

Larson, R. R. (1996). Bibliometrics of the World Wide Web: An exploratory analysis of the intellectual structure of Cyberspace. ASIS’96. Retrieved May 7, 2004, from http://sherlock.berkeley.edu/asis96/asis96.html

Mettrop, W., & Nieuwenhuysen, P. (2001). Internet search engines - fluctuations in document accessibility. Journal of Documentation, 57(5), 623-651.

Rousseau, R. (1997). Sitations: an exploratory study. Cybermetrics, 1(1). Retrieved May 7, 2004, from http://www.cindoc.csic.es/cybermetrics/articles/v1i1p1.html

Smith, A. C. (1999). A tale of two Web spaces: comparing sites using Web impact factors. Journal of Documentation, 55(5), 577-592.

Thelwall, M. (2000). Web impact factors and search engine coverage. Journal of Documentation, 56(2), 185-189.

Thelwall, M. (2001a). Results from a Web impact factor crawler. Journal of Documentation, 57(2), 177-191.

Thelwall, M. (2001b). A web crawler design for data mining. Journal of Information Science, 27(5), 319-325.

Thelwall, M. (2001c). Extracting macroscopic information from Web links. Journal of the American Society for Information Science and Technology, 52(13), 1157-1168.