POSTPRINT authorea.com/108

# A measure of total research impact independent of time and discipline

This article was published as A measure of total research impact independent of time and discipline. Alberto Pepe, Michael J. Kurtz. PLoS One. 7(11): e46428. doi:10.1371/journal.pone.0046428. 2012. (Open Access Article)

Abstract. Authorship and citation practices evolve with time and differ by academic discipline. As such, indicators of research productivity based on citation records are naturally subject to historical and disciplinary effects. We observe these effects on a corpus of astronomer career data constructed from a database of refereed publications. We employ a simple mechanism to measure research output using author and reference counts available in bibliographic databases to develop a citation-based indicator of research productivity. The total research impact (tori) quantifies, for an individual, the total amount of scholarly work that others have devoted to his/her work, measured in the volume of research papers. A derived measure, the research impact quotient (riq), is an age-independent measure of an individual’s research ability. We demonstrate that these measures are substantially less vulnerable to temporal debasement and cross-disciplinary bias than the most popular current measures. The proposed measures of research impact, tori and riq, have been implemented in the Smithsonian/NASA Astrophysics Data System.

# Introduction

Measuring the research performance of scholars plays a critical role in the allocation of scholarly resources at all levels. A “quantitative” means of measurement has long been the use of citations (Garfield 1955, de Solla Price 1965). Citations are routinely used to evaluate the research productivity of individuals. Using citations to measure research performance involves several confounding factors, which tend to become more important as the degree of aggregation decreases. For the evaluation of individuals, important challenges are:

Discipline

Citation practices vary widely among various fields. Citation rates can vary between disciplines by an order of magnitude (Leydesdorff 2011); among sub-disciplines in the same discipline they can vary by a factor of two.

Co-Authorship

A paper can have an arbitrary number of authors, from one to several thousand. Should the author of a single-authored paper receive the same credit for a citation as someone who has co-authors?

Age

The number of citations accrued by an individual scales with the square of his/her career length (Hirsch 2005, Kurtz 2005); thus, a person with a career length of 10 years will have half the citations of an otherwise equal person with a career length of 14.14 years. This age effect is exacerbated by the fact that the two aforementioned challenges are time dependent. For example, in the field of astrophysics, both the mean number of references and the mean number of authors have approximately doubled in the last 20 years (Henneken 2011, Schulman 1997).
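As a worked illustration of the quadratic scaling stated above, write $$C(T)$$ for the citations accrued after a career of $$T$$ years. The symbol $$C$$ is introduced here for illustration only and is not the article's notation. Assuming constant productivity in a constant environment:

$$C(T) \propto T^{2} \quad\Longrightarrow\quad \frac{C(10)}{C(14.14)} = \left(\frac{10}{14.14}\right)^{2} \approx \frac{1}{2}$$

The factor of two arises because $$14.14 \approx 10\sqrt{2}$$.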

Some of the lesser challenges associated with using citations to measure research productivity of individuals are:

Self-Citation

If an author cites his/her own papers, should those citations count as much as citations from papers by others?

Curation

In addition to having a database of articles and citations, one must clean and curate its data. For example, an analysis of an individual’s productivity requires that one be able to exactly identify the articles written by that individual. Name changes (e.g., due to marriage) and homonyms (name clashes, where different people have the same name) can make this a serious problem.

Shot Noise

Sometimes an individual can, almost entirely by chance, become an author of one or more very highly cited papers, perhaps as a student. The citation distribution is a Zipf-like power law, whereby some articles are cited thousands of times more than the median; clearly, there can be circumstances where a direct count of citations is not a fair representation of impact.

In a highly influential paper, Hirsch (2005) proposed a pair of citation-based measures (h, m) which solve the shot-noise problem, substantially improve the age problem, and help with the curation difficulty discussed above. The Hirsch index, h, is the position in a citation-ranked list where the rank equals the number of citations. Because the total number of citations grows with the square of career length, h, which absent shot noise is proportional to the square root of the total number of citations, grows linearly with career length (Hirsch 2005, Kurtz 2005). The m quotient is h divided by career length, and is constant throughout the career of an individual with constant productivity in a constant environment.
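The definitions of h and m given above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis in this article; the function names and the sample citation counts are hypothetical.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # citation-ranked list, most cited first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the paper at this rank still has at least `rank` citations
        else:
            break      # counts only decrease from here, so h cannot grow further
    return h


def m_quotient(citations, career_years):
    """Return h divided by career length in years."""
    if career_years <= 0:
        raise ValueError("career_years must be positive")
    return h_index(citations) / career_years


# Hypothetical citation counts for one author's seven papers.
example_citations = [25, 8, 5, 3, 3, 1, 0]
example_h = h_index(example_citations)
example_m = m_quotient(example_citations, career_years=10)
```

Note that the single paper with 25 citations contributes no more to h than a paper with 3 citations would, which is how the index suppresses shot noise.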

The h-index is by far the most widely used indicator of personal scientific productivity. As such, it has been extensively reviewed and criticized in the specialized literature, and innumerable alternatives have been proposed (for a review, see Egghe 2010). Some notable substitutes for the h-index include: the mean number of citations per paper (Lehmann 2006); the e-index, which complements the h-index for excess citations (Zhang 2009); the g-index, which is similar to h but accounts for the averaged citation count an author has accrued (Egghe 2006); and the highly cited publications indicator (Waltman 2011). Two normalizations of the h-index that have been proposed in the literature with promising results are by the number of article co-authors (Batista 2006) and by the average number of citations per article per discipline (Radicchi 2008). The measures proposed in this article use both of these normalizations, combined.

While the h-index is a valuable, simple, and effective indicator of scholarly performance, we find that it is inadequate for cross-disciplinary and historical comparisons of individuals. Comparing two scholars from different disciplines or from different time periods, or with differing co-authorship practices, based on their h-index would very likely yield erroneous results, simply because citation and authorship practices have changed (and constantly change) across disciplines and through time.

# Methods

To investigate historical and disciplinary effects on the h-index, we calculate individual researcher performance on a virtually complete astronomy database of 814,505 refereed publications extracted from the Smithsonian/NASA Astrophysics Data System (http://adsabs.harvard.edu/) (Kurtz 2005). We focus on the careers of 11,036 astronomers with unambiguous names, a publication record of over 20 refereed articles, and a career span of over 10 years, who are either currently active or have a career length of at least 30 years which started in or after 1950. We define the beginning of the career as the year of publication of an astronomer’s first refereed article. To begin, we compute the m-quotient on this cohort of astronomers and demonstrate that it is not constant over time and across sub-disciplines of astronomy. Then, we propose a novel measure of research performance, the research impact quotient (riq). We compute riq on the same bibliographic corpus, showing that this derived measure eliminates most historical and disciplinary bias.

# Results

## Temporal debasement and cross-disciplinary bias of current measures

In Figure 1, we illustrate the temporal debasement of the m-quotient, defined as $$h/y$$, where $$y$$ is the number of years since a scholar’s first publication. Astronomers who began their careers in the 1950s have systematically lower m-quotients than those who started later. The red line in Figure 1 is an exponential best-fit regression line with slope $$b=0.0314$$, shown with a $$0.95$$ confidence interval band. Year means are plotted as filled black circles. In 50 years, the average m-quotient has increased from $$0.28$$ to $$1.62$$, a rate of increase of $$3.1$$% per year, ending well above the global mean of $$x = 0.855$$. (We also run an identical regression analysis on a cohort of 697 astronomers for whom we have access to both publication record and Ph.D. dissertation. Using the doctoral graduation year as the starting point of their career, we find similar effects of temporal debasement: the best-fit regression line has slope $$b=0.027$$.)
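One common way to obtain an exponential best-fit line of this kind is ordinary least squares on the logarithm of the values. The sketch below assumes that approach; the fitting procedure actually used for Figure 1 is not specified in the text, and the function names and the synthetic data are hypothetical.

```python
import math


def fit_exponential(years, values):
    """Fit values ~ a * exp(b * (year - years[0])) by least squares on log(values).

    Returns (a, b), where b is the slope of the fitted line in log space.
    """
    x = [yr - years[0] for yr in years]       # years elapsed since the first cohort
    y = [math.log(v) for v in values]         # requires strictly positive values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def annual_growth_rate(b):
    """Convert a log-space slope b into a fractional increase per year."""
    return math.exp(b) - 1.0


# Synthetic (not real) yearly means that grow exactly exponentially.
synthetic_years = list(range(1950, 2001, 10))
synthetic_means = [0.3 * math.exp(0.03 * (yr - 1950)) for yr in synthetic_years]
fitted_a, fitted_b = fit_exponential(synthetic_years, synthetic_means)
```

For a small slope such as $$b=0.0314$$, the slope itself and the converted rate `annual_growth_rate(b)` nearly coincide, both being a little over 3% per year.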

In Figure 2, we show the cross-disciplinary bias of the m-quotient. The m-quotient of astronomers working in different fields of specialization is displayed as a box-and-whisker plot. Astronomers’ fields of specialization are computed by simply selecting the single most recurrent keyword used by authors in their published articles. In order to isolate disciplinary effects, we only analyze a subset of the corpus which includes $$1601$$ astronomers who started their career in the 1990s and who publish in popular sub-disciplines in this time window (fields with 30 authors or fewer are excluded from this analysis). The dashed line in Figure 2 shows the global mean m-quotient for all authors in the corpus ($$x=0.855$$). We find that for only a small portion of sub-disciplines (6 out of 23) does the global mean m-quotient fall within the discipline-specific upper or lower quartiles (“atmosphere” through “planets and satellites”). Astronomers who publish in all the other fields have systematically higher m-quotients than the global average, as evinced by higher median m-quotients for fields “gravitational waves” to “galaxies evolution”.
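The field assignment and the exclusion of small fields described above can be sketched as follows. This is an illustration under stated assumptions: the function names and the input layout (a mapping from author to a flat list of keywords) are hypothetical, and the text does not say how ties between equally frequent keywords are resolved, so the sketch simply keeps the keyword encountered first.

```python
from collections import Counter


def field_of_specialization(keywords):
    """Return the single most recurrent keyword, or None if there are no keywords.

    Ties are broken in favor of the keyword seen first (an assumption).
    """
    if not keywords:
        return None
    return Counter(keywords).most_common(1)[0][0]


def popular_fields(author_keywords, min_authors=31):
    """Map each author to a field, keeping only fields with at least min_authors.

    The default of 31 excludes fields with 30 authors or fewer.
    """
    fields = {
        author: field_of_specialization(keywords)
        for author, keywords in author_keywords.items()
    }
    sizes = Counter(f for f in fields.values() if f is not None)
    return {
        author: f
        for author, f in fields.items()
        if f is not None and sizes[f] >= min_authors
    }


# Hypothetical toy input; a lower threshold is used so the example is non-empty.
toy_keywords = {
    "author_a": ["galaxies", "stars", "galaxies"],
    "author_b": ["galaxies"],
    "author_c": ["planets and satellites"],
}
toy_result = popular_fields(toy_keywords, min_authors=2)
```

Grouping the m-quotients by the resulting field labels would then give the per-field distributions summarized in a box-and-whisker plot.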

Differences so large across time and disciplines ma