Indicators

The CWTS Leiden Ranking 2019 offers a sophisticated set of bibliometric indicators that provide statistics at the level of universities on scientific impact, collaboration, open access publishing, and gender diversity. The indicators available in the Leiden Ranking are discussed in detail below.

Publications

The Leiden Ranking is based on publications in the Web of Science database produced by Clarivate Analytics. The most up-to-date statistics made available in the Leiden Ranking are based on publications in the period 2014–2017, but statistics are also provided for earlier periods. Web of Science includes a number of citation indices. The Leiden Ranking uses the Science Citation Index Expanded, the Social Sciences Citation Index, and the Arts & Humanities Citation Index. Only publications of the Web of Science document types article and review are taken into account. The Leiden Ranking does not consider book publications, publications in conference proceedings, and publications in journals not indexed in the above-mentioned citation indices of Web of Science.

The Leiden Ranking takes into account only a subset of the publications in the Science Citation Index Expanded, the Social Sciences Citation Index, and the Arts & Humanities Citation Index. We refer to the publications in this subset as core publications. Core publications are publications in international scientific journals in fields that are suitable for citation analysis. In order to be classified as a core publication, a publication must satisfy the following criteria:

  • The publication has been written in English.
  • The publication has one or more authors. (Anonymous publications are not allowed.)
  • The publication has not been retracted.
  • The publication has appeared in a core journal.

The last criterion is a very important one. In the Leiden Ranking, a journal is considered a core journal if it meets the following conditions:

  • The journal has an international scope, as reflected by the countries in which researchers publishing in the journal and citing the journal are located.
  • The journal has a sufficiently large number of references to other core journals, indicating that the journal is situated in a field that is suitable for citation analysis. Many journals in the arts and humanities do not meet this condition. The same applies to trade journals and popular magazines.

In the calculation of the Leiden Ranking indicators, only core publications are taken into account. Excluding non-core publications ensures that the Leiden Ranking is based on a relatively homogeneous set of publications, namely publications in international scientific journals in fields that are suitable for citation analysis. The use of such a relatively homogeneous set of publications enhances the international comparability of universities. It should be emphasized that non-core publications are excluded not because they are considered less important than core publications. Non-core publications may have an important scientific value. About one-sixth of the publications in Web of Science are excluded because they have been classified as non-core publications.
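The four publication-level criteria listed above can be read as a simple filter. The following is a minimal sketch, not the procedure actually used by CWTS: the record fields (`language`, `n_authors`, `retracted`, `journal`) and the journal names are hypothetical, and the set of core journals is assumed to be given.

```python
def is_core_publication(pub, core_journals):
    """Apply the four publication-level criteria to one record (hypothetical fields)."""
    return (
        pub["language"] == "English"        # written in English
        and pub["n_authors"] >= 1           # not anonymous
        and not pub["retracted"]            # not retracted
        and pub["journal"] in core_journals # appeared in a core journal
    )

# Hypothetical journal names and records, for illustration only.
core_journals = {"Journal A", "Journal B"}
pubs = [
    {"language": "English", "n_authors": 3, "retracted": False, "journal": "Journal A"},
    {"language": "German", "n_authors": 2, "retracted": False, "journal": "Journal A"},
    {"language": "English", "n_authors": 0, "retracted": False, "journal": "Journal B"},
    {"language": "English", "n_authors": 1, "retracted": True, "journal": "Journal B"},
    {"language": "English", "n_authors": 4, "retracted": False, "journal": "Magazine C"},
]

core = [p for p in pubs if is_core_publication(p, core_journals)]
print(len(core))  # 1
```

Only the first record passes; each of the others fails exactly one criterion.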

Our concept of core publications should not be confused with the Web of Science Core Collection. The Web of Science Core Collection represents a subset of the citation indices available in Web of Science. As explained above, the core publications on which the Leiden Ranking is based represent a subset of the publications in the Science Citation Index Expanded, the Social Sciences Citation Index, and the Arts & Humanities Citation Index.

A list of core and non-core journals is available in this Excel file.

Size-dependent vs. size-independent indicators

Indicators included in the Leiden Ranking have two variants: a size-dependent and a size-independent variant. In general, size-dependent indicators are obtained by counting the absolute number of publications of a university that have a certain property, while size-independent indicators are obtained by calculating the proportion of the publications of a university with a certain property. For instance, the number of highly cited publications of a university and the number of publications of a university co-authored with other organizations are size-dependent indicators. The proportion of the publications of a university that are highly cited and the proportion of a university’s publications co-authored with other organizations are size-independent indicators. In the case of size-dependent indicators, universities with a larger publication output tend to perform better than universities with a smaller publication output. Size-independent indicators have been corrected for the size of the publication output of a university. Hence, when size-independent indicators are used, both larger and smaller universities may perform well.
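The difference between the two variants can be illustrated with a small sketch. The publication counts below are invented for illustration and do not refer to any real university.

```python
def count_and_proportion(flags):
    """Return (size-dependent count, size-independent proportion) for a list of booleans."""
    total = len(flags)
    count = sum(1 for flag in flags if flag)
    proportion = count / total if total else 0.0
    return count, proportion

# Hypothetical data: True marks a highly cited publication.
large_university = [True] * 300 + [False] * 2700
small_university = [True] * 60 + [False] * 340

print(count_and_proportion(large_university))  # (300, 0.1)
print(count_and_proportion(small_university))  # (60, 0.15)
```

The larger university has more highly cited publications in absolute terms, while the smaller one has the higher proportion.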

Scientific impact indicators

The Leiden Ranking provides the following indicators of scientific impact:

  • P. Total number of publications of a university.
  • P(top 1%) and PP(top 1%). The number and the proportion of a university’s publications that, compared with other publications in the same field and in the same year, belong to the top 1% most frequently cited.
  • P(top 5%) and PP(top 5%). The number and the proportion of a university’s publications that, compared with other publications in the same field and in the same year, belong to the top 5% most frequently cited.
  • P(top 10%) and PP(top 10%). The number and the proportion of a university’s publications that, compared with other publications in the same field and in the same year, belong to the top 10% most frequently cited.
  • P(top 50%) and PP(top 50%). The number and the proportion of a university’s publications that, compared with other publications in the same field and in the same year, belong to the top 50% most frequently cited.
  • TCS and MCS. The total and the average number of citations of the publications of a university.
  • TNCS and MNCS. The total and the average number of citations of the publications of a university, normalized for field and publication year. An MNCS value of two, for instance, means that the publications of a university have been cited twice as often as the average for their field and publication year.

Citations are counted until the end of 2018 in the calculation of the above indicators. Author self-citations are excluded. All indicators except for TCS and MCS are normalized for differences in citation patterns between scientific fields. For the purpose of this field normalization, about 4500 fields are distinguished. These fields are defined at the level of individual publications. Using a computer algorithm, each publication in Web of Science is assigned to a field based on its citation relations with other publications.
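To make the difference between the raw (TCS, MCS) and normalized (TNCS, MNCS) indicators concrete, here is a minimal sketch using full counting. It assumes that normalization divides each publication's citation count by the mean citation count of all publications in the same field and year; the field labels and citation counts are invented, and every field-year combination is assumed to have a non-zero mean.

```python
def field_year_means(reference_set):
    """Mean number of citations per (field, year) over all publications in the reference set."""
    sums = {}
    for pub in reference_set:
        key = (pub["field"], pub["year"])
        total, count = sums.get(key, (0, 0))
        sums[key] = (total + pub["citations"], count + 1)
    return {key: total / count for key, (total, count) in sums.items()}

def normalized_scores(publications, means):
    """Citations of each publication divided by the mean of its field and year."""
    return [p["citations"] / means[(p["field"], p["year"])] for p in publications]

# Hypothetical world: field A is cited far more heavily than field B.
world = [
    {"field": "A", "year": 2015, "citations": 10},
    {"field": "A", "year": 2015, "citations": 30},
    {"field": "B", "year": 2015, "citations": 2},
    {"field": "B", "year": 2015, "citations": 6},
]
university = [world[1], world[2]]  # one publication from each field

scores = normalized_scores(university, field_year_means(world))
tcs = sum(p["citations"] for p in university)
mcs = tcs / len(university)
tncs = sum(scores)
mncs = tncs / len(scores)
print(tcs, mcs, tncs, mncs)  # 32 16.0 2.0 1.0
```

In this toy example the MCS is dominated by the publication in the heavily cited field, whereas the MNCS of 1.0 shows that the two publications together are cited exactly at the average of their fields.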

The TCS, MCS, TNCS, and MNCS indicators are not available on the main ranking page. These indicators can be accessed by clicking on the name of a university. An overview of all bibliometric statistics available for the university will then be presented. This overview also includes the TCS, MCS, TNCS, and MNCS indicators.

Collaboration indicators

The Leiden Ranking provides the following indicators of collaboration:

  • P. Total number of publications of a university.
  • P(collab) and PP(collab). The number and the proportion of a university’s publications that have been co-authored with one or more other organizations.
  • P(int collab) and PP(int collab). The number and the proportion of a university’s publications that have been co-authored by authors from two or more countries.
  • P(industry) and PP(industry). The number and the proportion of a university’s publications that have been co-authored with one or more industrial organizations. All private-sector for-profit business enterprises, covering all manufacturing and services sectors, are regarded as industrial organizations. This includes research institutes and other corporate R&D laboratories that are fully funded or owned by for-profit business enterprises. Organizations in the private education sector and private medical/health sector (including hospitals and clinics) are not classified as industrial organizations.
  • P(<100 km) and PP(<100 km). The number and the proportion of a university’s publications with a geographical collaboration distance of less than 100 km. The geographical collaboration distance of a publication equals the largest geographical distance between two addresses mentioned in the publication’s address list.
  • P(>5000 km) and PP(>5000 km). The number and the proportion of a university’s publications with a geographical collaboration distance of more than 5000 km.
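The geographical collaboration distance defined above can be sketched as the maximum pairwise distance between geocoded addresses. The distance formula is not specified here; the sketch assumes great-circle (haversine) distance on a sphere with radius 6371 km, treats a publication with a single address as having distance zero, and uses approximate coordinates chosen for illustration.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def collaboration_distance_km(coordinates):
    """Largest distance between any two addresses; 0.0 if there are fewer than two."""
    return max((haversine_km(a, b) for a, b in combinations(coordinates, 2)), default=0.0)

# Approximate coordinates, for illustration only.
leiden = (52.16, 4.49)
amsterdam = (52.37, 4.90)
boston = (42.36, -71.06)

local = collaboration_distance_km([leiden, amsterdam])
distant = collaboration_distance_km([leiden, amsterdam, boston])
print(local < 100, distant > 5000)  # True True
```

A publication is then counted in P(<100 km) or P(>5000 km) depending on which threshold its distance falls below or above.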

Some limitations of the above indicators need to be mentioned. In the case of the P(industry) and PP(industry) indicators, we have made an effort to identify industrial organizations as accurately as possible. Inevitably, however, there will be inaccuracies and omissions in the identification of industrial organizations. In the case of the P(<100 km), PP(<100 km), P(>5000 km), and PP(>5000 km) indicators, we rely on geocoding of addresses listed in Web of Science. There may be some inaccuracies in the geocoding that we have performed, and for addresses that are used infrequently no geocodes may be available. In general, we expect these inaccuracies and omissions to have only a small effect on the indicators.

Open access indicators

The Leiden Ranking provides the following indicators of open access publishing:

  • P. Total number of publications of a university.
  • P(OA) and PP(OA). The number and the proportion of open access publications of a university.
  • P(gold OA) and PP(gold OA). The number and the proportion of gold open access publications of a university. Gold open access publications are publications in an open access journal.
  • P(hybrid OA) and PP(hybrid OA). The number and the proportion of hybrid open access publications of a university. Hybrid open access publications are publications in a subscription journal that are open access.
  • P(bronze OA) and PP(bronze OA). The number and the proportion of bronze open access publications of a university. Bronze open access publications are publications in a journal that are open access without a license.
  • P(green OA) and PP(green OA). The number and the proportion of green open access publications of a university. Green open access publications are publications in a journal that are also available in an open access repository.
  • P(OA unknown) and PP(OA unknown). The number and the proportion of a university’s publications for which the open access status is unknown. These publications typically do not have a DOI in the Web of Science database.

The different types of open access are partially overlapping. A publication can be both green open access and gold, hybrid, or bronze open access. In the calculation of the P(OA) and PP(OA) indicators, a publication is considered open access if it is green open access and/or gold, hybrid, or bronze open access.
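Because the types overlap, P(OA) is in general smaller than the sum of the counts per type. A minimal sketch, in which each publication is represented by a hypothetical set of open access type labels and PP(OA) is assumed to be P(OA) divided by P:

```python
OA_TYPES = {"gold", "hybrid", "bronze", "green"}

def oa_indicators(publications):
    """Return P, P(OA), PP(OA), and the number of publications per open access type."""
    p = len(publications)
    per_type = {t: sum(1 for types in publications if t in types) for t in sorted(OA_TYPES)}
    # A publication counts once towards P(OA), however many types apply to it.
    p_oa = sum(1 for types in publications if types & OA_TYPES)
    return p, p_oa, p_oa / p, per_type

# Hypothetical publications; the empty set marks a publication that is not open access.
publications = [{"gold", "green"}, {"green"}, {"bronze"}, set(), {"hybrid", "green"}]

p, p_oa, pp_oa, per_type = oa_indicators(publications)
print(p, p_oa, pp_oa)          # 5 4 0.8
print(sum(per_type.values()))  # 6
```

Four of the five publications are open access, although the per-type counts add up to six.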

The open access status of a publication is determined based on Unpaywall data. More detailed information, including a discussion of the choices made in defining the different types of open access, can be found in this blog post.

Gender indicators

The Leiden Ranking provides the following indicators of gender diversity:

  • A. The total number of authorships of a university. Consider for instance a publication that has five authors, of which three report university X as their affiliation and two report university Y as their affiliation. This publication then yields three authorships for university X and two authorships for university Y.
  • A(MF). The number of male and female authorships of a university, that is, a university’s number of authorships for which the gender is known.
  • A(unknown) and PA(unknown). The number of authorships of a university for which the gender is unknown and the number of authorships for which the gender is unknown as a proportion of a university’s total number of authorships.
  • A(M), PA(M), and PA(M|MF). The number of male authorships of a university, the number of male authorships as a proportion of a university’s total number of authorships, and the number of male authorships as a proportion of a university’s number of male and female authorships.
  • A(F), PA(F), and PA(F|MF). The number of female authorships of a university, the number of female authorships as a proportion of a university’s total number of authorships, and the number of female authorships as a proportion of a university’s number of male and female authorships.
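The relations between these indicators can be sketched as follows. The authorship counts are invented, and the sketch assumes that a university has at least one authorship with a known gender.

```python
def gender_indicators(genders):
    """genders: one entry per authorship, each 'M', 'F', or None (gender unknown)."""
    a = len(genders)
    a_m = genders.count("M")
    a_f = genders.count("F")
    a_mf = a_m + a_f          # authorships with known gender
    a_unknown = a - a_mf
    return {
        "A": a,
        "A(MF)": a_mf,
        "A(unknown)": a_unknown,
        "PA(unknown)": a_unknown / a,
        "A(M)": a_m, "PA(M)": a_m / a, "PA(M|MF)": a_m / a_mf,
        "A(F)": a_f, "PA(F)": a_f / a, "PA(F|MF)": a_f / a_mf,
    }

# Hypothetical university with 100 authorships.
authorships = ["M"] * 45 + ["F"] * 25 + [None] * 30
result = gender_indicators(authorships)
print(result["A(MF)"], result["PA(F)"], round(result["PA(F|MF)"], 3))  # 70 0.25 0.357
```

PA(F) uses all authorships as the denominator, while PA(F|MF) uses only authorships for which the gender is known, so the latter is never smaller than the former.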

For each authorship of a university, the gender is determined using the following four-step procedure:

  1. Author disambiguation. Using an author disambiguation algorithm developed by CWTS, authorships are linked to authors. If there is sufficient evidence to assume that different publications have been authored by the same individual, the algorithm links the corresponding authorships to the same author.
  2. Author-country linking. Each author is linked to one or more countries. If the country of the author’s first publication is the same as the country occurring most often in the author’s publications, the author is linked to this country. Otherwise, the author is linked to all countries occurring in his or her publications.
  3. Retrieval of gender statistics. For each author, gender statistics are collected from three sources: Gender API, Genderize.io, and Gender Guesser. Gender statistics are obtained based on the first name of an author and the countries to which the author is linked.
  4. Gender assignment. For each author, a gender (male or female) is assigned if Gender API is able to determine the gender with a reported accuracy of at least 90%. If Gender API does not recognize the first name of an author, Gender Guesser and Genderize.io are used. If none of these sources is able to determine the gender of an author with sufficient accuracy, the gender is considered unknown.
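Step 2 of the procedure above can be sketched as follows. This is an illustration under simplifying assumptions, not the CWTS implementation: each publication is assumed to contribute a single country for the author, the list is assumed to be in chronological order, and ties for the most frequent country are broken in favour of the country that occurs first.

```python
from collections import Counter

def link_countries(publication_countries):
    """Link an author to countries, given one country per publication in chronological order."""
    first_country = publication_countries[0]
    most_common_country, _ = Counter(publication_countries).most_common(1)[0]
    if first_country == most_common_country:
        return {first_country}
    # Otherwise the author is linked to every country occurring in the publications.
    return set(publication_countries)

print(link_countries(["NL", "NL", "US"]))          # {'NL'}
print(sorted(link_countries(["CN", "US", "US"])))  # ['CN', 'US']
```

In the second case the first publication and the most frequent country disagree, so the author is linked to both countries.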

Using the above procedure, the gender can be determined for about 70% of all authorships of universities included in the Leiden Ranking. For the remaining authorships, the gender is unknown.

Counting method

The scientific impact indicators in the Leiden Ranking can be calculated using either a full counting or a fractional counting method. The full counting method gives a full weight of one to each publication of a university. The fractional counting method gives less weight to collaborative publications than to non-collaborative ones. For instance, if a publication has been co-authored by five researchers and two of these researchers are affiliated with a particular university, the publication has a weight of 2 / 5 = 0.4 in the calculation of the scientific impact indicators for this university. The fractional counting method leads to a more proper field normalization of scientific impact indicators and therefore to fairer comparisons between universities active in different fields. For this reason, fractional counting is the preferred counting method for the scientific impact indicators in the Leiden Ranking. Collaboration, open access, and gender indicators are always calculated using the full counting method.
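The effect of the counting method can be illustrated with a small sketch. It follows the example above, where the weight of a publication is the share of its authors affiliated with the university, and it assumes that P and P(top 10%) are obtained by summing these weights. The publications are invented.

```python
def weight(pub, fractional):
    """Weight of a publication for one university under full or fractional counting."""
    if fractional:
        return pub["university_authors"] / pub["authors"]
    return 1.0

def impact_indicators(pubs, fractional):
    """Return P, P(top 10%), and PP(top 10%) as sums of publication weights."""
    p = sum(weight(pub, fractional) for pub in pubs)
    p_top = sum(weight(pub, fractional) for pub in pubs if pub["top10"])
    return p, p_top, p_top / p

# Hypothetical publications of one university.
pubs = [
    {"authors": 5, "university_authors": 2, "top10": True},   # weight 2 / 5 = 0.4
    {"authors": 1, "university_authors": 1, "top10": False},  # weight 1.0
    {"authors": 4, "university_authors": 1, "top10": True},   # weight 0.25
]

full = impact_indicators(pubs, fractional=False)
frac = impact_indicators(pubs, fractional=True)
print([round(x, 3) for x in full])  # [3.0, 2.0, 0.667]
print([round(x, 3) for x in frac])  # [1.65, 0.65, 0.394]
```

Here the two highly cited publications are collaborative, so they contribute less under fractional counting and PP(top 10%) drops accordingly.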

Trend analysis

To facilitate trend analyses, the Leiden Ranking provides statistics not only based on publications from the period 2014–2017, but also based on publications from earlier periods: 2006–2009, 2007–2010, ..., 2013–2016. The statistics for the different periods are calculated in a fully consistent way. For each period, citations are counted until the end of the first year after the period has ended. For instance, in the case of the period 2006–2009 citations are counted until the end of 2010, while in the case of the period 2014–2017 citations are counted until the end of 2018.
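The periods and their citation windows follow a regular pattern, which the following sketch enumerates. The function name and its parameters are illustrative.

```python
def trend_periods(first_start=2006, last_start=2014, length=4):
    """Return (first year, last year, final citation year) for each overlapping period."""
    periods = []
    for start in range(first_start, last_start + 1):
        end = start + length - 1
        periods.append((start, end, end + 1))  # citations counted through the year after the period
    return periods

periods = trend_periods()
print(len(periods))  # 9
print(periods[0])    # (2006, 2009, 2010)
print(periods[-1])   # (2014, 2017, 2018)
```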

Stability intervals

Stability intervals provide some insight into the uncertainty in bibliometric statistics. A stability interval indicates a range of values of an indicator that are likely to be observed when the underlying set of publications changes. For instance, the PP(top 10%) indicator may be equal to 15.3% for a particular university, with a stability interval ranging from 14.1% to 16.5%. This means that the PP(top 10%) indicator equals 15.3% for this university, but that changes in the set of publications of the university may relatively easily lead to PP(top 10%) values in the range from 14.1% to 16.5%. The Leiden Ranking employs 95% stability intervals constructed using a statistical technique known as bootstrapping.
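The idea behind a bootstrapped stability interval can be sketched as follows. The exact bootstrap procedure is not described here, so this sketch assumes a simple percentile bootstrap with full counting: the publications of a university are resampled with replacement many times, the indicator is recalculated for each resample, and the middle 95% of the resulting values forms the interval. The data, the number of resamples, and the random seed are arbitrary choices.

```python
import random

def stability_interval(flags, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the proportion of True values in flags."""
    rng = random.Random(seed)
    n = len(flags)
    # Recalculate the proportion for each resample drawn with replacement.
    estimates = sorted(sum(rng.choices(flags, k=n)) / n for _ in range(n_resamples))
    lower_index = round((1 - level) / 2 * n_resamples)
    upper_index = round((1 + level) / 2 * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]

# Hypothetical university: 153 of 1000 publications belong to the top 10%.
flags = [True] * 153 + [False] * 847
lower, upper = stability_interval(flags)
print(f"PP(top 10%) = {sum(flags) / len(flags):.1%}, interval {lower:.1%} to {upper:.1%}")
```

A university with fewer publications would typically obtain a wider interval, reflecting that its indicator value is more sensitive to changes in the underlying set of publications.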

More information

More information on the indicators available in the Leiden Ranking can be found in a number of papers published by CWTS researchers. A detailed discussion of the Leiden Ranking is presented by Waltman et al. (2012). This paper relates to the 2011/2012 edition of the Leiden Ranking. Although the paper is no longer up to date, it still provides relevant information on the Leiden Ranking. Field normalization of scientific impact indicators based on algorithmically defined fields is studied by Ruiz-Castillo and Waltman (2015). The methodology adopted in the Leiden Ranking for identifying core publications and core journals is outlined by Waltman and Van Eck (2013a, 2013b). Finally, the importance of using fractional rather than full counting in the calculation of field-normalized scientific impact indicators is explained by Waltman and Van Eck (2015).

  • Waltman, L., Calero-Medina, C., Kosten, J., Noyons, E.C.M., Tijssen, R.J.W., Van Eck, N.J., Van Leeuwen, T.N., Van Raan, A.F.J., Visser, M.S., & Wouters, P. (2012). The Leiden Ranking 2011/2012: Data collection, indicators, and interpretation. Journal of the American Society for Information Science and Technology, 63(12), 2419–2432. (paper, preprint)
  • Waltman, L., & Van Eck, N.J. (2013a). Source normalized indicators of citation impact: An overview of different approaches and an empirical comparison. Scientometrics, 96(3), 699–716. (paper, preprint)
  • Waltman, L., & Van Eck, N.J. (2013b). A systematic empirical comparison of different approaches for normalizing citation impact indicators. Journal of Informetrics, 7(4), 833–849. (paper, preprint)
  • Ruiz-Castillo, J., & Waltman, L. (2015). Field-normalized citation impact indicators using algorithmically constructed classification systems of science. Journal of Informetrics, 9(1), 102–117. (paper)
  • Waltman, L., & Van Eck, N.J. (2015). Field-normalized citation impact indicators and the choice of an appropriate counting method. Journal of Informetrics, 9(4), 872–894. (paper, preprint)