Citation Percentiles and “Highly Cited” labels

Can apply to: Journal articles

Metric definition: The position of a paper or group of papers with respect to other papers in a given discipline, country, and/or time period, based on the number of citations they have received. It is expressed as a percentile, or as a “Highly Cited” label awarded on the basis of percentile rank.

Metric calculation: As one definition puts it, “[a] percentile-based bibliometric indicator is an indicator that values publications based on their position within the citation distribution of their field. The most straightforward percentile-based indicator is the proportion of frequently cited publications, for instance the proportion of publications that belong to the top 10% most frequently cited of their field.”
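The “proportion of frequently cited publications” described above can be sketched in a few lines. This is a minimal illustration, not any provider's implementation: the function name `top_share`, the sample citation counts, and the choice to count ties at the threshold as inside the top slice are all assumptions made for the example.

```python
def top_share(focal_counts, reference_counts, top_percent=10):
    """Share of focal papers that fall in the top `top_percent` of a reference set.

    Assumption: a focal paper counts as "top" when its citation count is at
    least that of the last reference paper inside the top slice (ties count).
    """
    ranked = sorted(reference_counts, reverse=True)
    # Number of reference papers in the top slice (at least one paper).
    k = max(1, len(ranked) * top_percent // 100)
    threshold = ranked[k - 1]
    hits = sum(1 for c in focal_counts if c >= threshold)
    return hits / len(focal_counts), threshold


# Hypothetical citation counts for a field (reference set) and a group of papers.
reference = [0, 0, 1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 10, 12, 15, 20, 40, 90]
focal = [2, 8, 20, 45]

share, threshold = top_share(focal, reference)
print(f"threshold: {threshold} citations, share in top 10%: {share:.0%}")
```

With these made-up numbers, the top 10% of the 20 reference papers are the two most cited, so only the focal paper with 45 citations qualifies.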

Note that, depending on the data source, percentiles can be expressed differently: an “inverted percentile” (e.g. top 1%) and a regular percentile (e.g. 99th percentile) are interpreted to mean the same thing.
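When combining figures from sources that use different conventions, it can help to convert them to one form first. The helpers below are a small sketch of that conversion; the function names are hypothetical and not taken from any provider.

```python
def to_regular_percentile(inverted):
    """Convert an inverted percentile (e.g. top 1%) to a regular one (99th)."""
    return 100 - inverted


def to_inverted_percentile(regular):
    """Convert a regular percentile (e.g. 99th) to an inverted one (top 1%)."""
    return 100 - regular


print(to_regular_percentile(1))    # a "top 1%" paper sits at the 99th percentile
print(to_inverted_percentile(90))  # the 90th percentile corresponds to the top 10%
```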

Clarivate’s InCites platform has been credited with classifying percentiles in a particularly clear and specific manner (i.e. by showing not only the top 1% of papers, but also those in the top 0.1% and 0.01%), giving “a better indication of skew and ranking among the top-cited publications”.

SciVal uses a slightly different reference set from other providers: percentiles for each year are calculated against the entire Scopus database by default, and different comparison axes (called “data universes”), such as publication type, can be used for further comparison. In SciVal, “Outputs in Top Percentiles can only be calculated for the current year from the first data snapshot on or after 1 July. It will be displayed as a null value until this date is reached”.

Data sources: Depending on the metric source, percentiles are calculated using citation data from Web of Science or Scopus.

Appropriate use cases: Percentiles based jointly upon subject area, document type (i.e. research or review articles), and year of publication are described as being the most appropriate means of comparison between journal articles or groups of journal articles. Essentially, they follow the same logic as any other field-normalized citation indicator, with the advantage that percentile-based citation indicators are less prone to being influenced by outliers.

Limitations: Percentile-based indicators are based on citations, so they inherit all of citations’ limitations. A number of factors can influence the citation rates of publications within a subject area and year of publication, including gender, the number of co-authors, the nationalities of authors, and outliers (very highly or very rarely cited publications). As such, these percentiles should be interpreted with care.

Though percentiles are simple to calculate in theory, there are a number of possible variations in the formulae used to calculate them; as such, percentiles are only directly comparable when the same calculation is used for each.
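To illustrate how much the choice of formula can matter, the sketch below applies three common conventions for handling ties to the same paper and reference set. The conventions, the function name `percentile_variants`, and the citation counts are illustrative assumptions; they are not claimed to match the formula of any particular provider.

```python
def percentile_variants(citations, reference_counts):
    """Percentile of one paper under three tie-handling conventions."""
    n = len(reference_counts)
    below = sum(1 for c in reference_counts if c < citations)
    equal = sum(1 for c in reference_counts if c == citations)
    return {
        # Share of reference papers cited strictly less often.
        "strictly_below": 100 * below / n,
        # Share of reference papers cited less often or equally often.
        "below_or_equal": 100 * (below + equal) / n,
        # Tied papers count half (a midpoint convention).
        "midpoint": 100 * (below + 0.5 * equal) / n,
    }


# Hypothetical reference set in which three papers share the focal paper's count.
reference = [0, 0, 1, 1, 2, 2, 2, 3, 5, 9]
variants = percentile_variants(2, reference)
for name, value in variants.items():
    print(f"{name}: {value:.1f}")
```

For this made-up reference set, the same paper with two citations lands at the 40th, 70th, or 55th percentile depending only on how ties are treated, which is why values from different calculations should not be compared directly.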

Field-normalization has its own inherent limitations that impact the use of percentiles. The accuracy of field classification will affect the usefulness of a field’s percentiles, as will cases where a single paper may be assigned to several different fields (each of which can have highly variable median citation counts).
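The effect of field assignment can be seen by scoring one paper against two reference sets with different citation densities. This is a sketch under stated assumptions: the two field distributions are invented, and the “cited less often or equally often” convention used by the hypothetical `percentile_in_field` function is only one of several possible formulae.

```python
def percentile_in_field(citations, field_counts):
    """Share (in %) of a field's papers cited no more often than the given paper."""
    at_or_below = sum(1 for c in field_counts if c <= citations)
    return 100 * at_or_below / len(field_counts)


# Hypothetical reference sets: a low-citation field and a high-citation field.
field_a = [0, 0, 1, 1, 2, 3, 4, 5, 8, 12]
field_b = [2, 5, 8, 10, 15, 20, 25, 30, 50, 80]

paper_citations = 8
pct_a = percentile_in_field(paper_citations, field_a)
pct_b = percentile_in_field(paper_citations, field_b)
print(f"field A: {pct_a:.0f}th percentile, field B: {pct_b:.0f}th percentile")
```

The same eight citations place the paper near the top of one field and in the lower third of the other, so the percentile reported for a multi-assigned paper depends heavily on which field, or which combination of fields, is used.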

Inappropriate use cases: In general, percentiles based upon very small comparison sets are ill-advised, as just one or two additional publications may substantially increase or decrease the share of highly cited publications.
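A short worked example shows why small sets are fragile. The numbers are invented for illustration, and `highly_cited_share` is a hypothetical helper: it compares what one additional highly cited publication does to the share in a set of 5 papers versus a set of 500.

```python
def highly_cited_share(highly_cited, total):
    """Percentage of publications in a set that are highly cited."""
    return 100 * highly_cited / total


# Small set: 1 of 5 papers is highly cited, then one more highly cited paper is added.
small_before = highly_cited_share(1, 5)
small_after = highly_cited_share(2, 6)

# Large set: 100 of 500 papers are highly cited, then one more is added.
large_before = highly_cited_share(100, 500)
large_after = highly_cited_share(101, 501)

print(f"small set: {small_before:.1f}% -> {small_after:.1f}%")
print(f"large set: {large_before:.1f}% -> {large_after:.2f}%")
```

Both sets start at a 20% share, but a single extra paper moves the small set by more than 13 percentage points while the large set barely changes.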

Percentiles should not be used to make judgements about papers using reference sets that compare papers with other papers published in the same journal. Comparisons of groups of papers (e.g. for an individual’s body of work) should only be made given a sufficiently large corpus, as in the case of senior researchers. When evaluating heterogeneous or interdisciplinary groups of papers, it is possible that important subtopics with low citation densities may be left out.

Available metric sources: Essential Science Indicators, InCites, SciVal

Transparency: The specific formulae and source citation data used to calculate percentiles can vary between providers and are not always publicly documented.

Website: n/a

Timeframe: Differs between providers

Further reading:

Bornmann, L., Leydesdorff, L., & Mutz, R. (2013). The Use of Percentiles and Percentile Rank Classes in the Analysis of Bibliometric Data: Opportunities and Limits. Journal of Informetrics, 7(1), 158–165. https://doi.org/10.1016/j.joi.2012.10.001

Bornmann, L., & Marx, W. (2013). How good is research really? EMBO Reports, 14(3), 226–230. https://doi.org/10.1038/embor.2013.9

Category: