A couple of weeks ago we added a few new stats to Publons reviewer profiles. One of the new metrics was the average impact factor of journals reviewed for, which caught people's attention and caused a bit of controversy:
Love using @Publons but dnt think adding IF info is useful. Cld be harmful if ppl judge quality of reviewer by IF of the jrnls they review 4— Tim Shakespeare (@tshakey) May 26, 2015
Feedback is always welcome. Peer review metrics are new - the topic has received very little public discussion, and even less real-world experimentation. It is clear that many dislike the impact factor, which raises a couple of questions we'd love you all to weigh in on:
- Can the average journal impact factor of the journals reviewed for reveal anything interesting about a reviewer's contributions?
- Is it inappropriate to use any proxy for journal quality in peer review metrics, or is the issue specific to the impact factor?
- What do reviewers look for in review metrics? If not this, then what?
This is a discussion that extends beyond Publons. As the idea of reviewer recognition and transparency grows, it is inevitable that metrics will emerge to summarise peer review contributions, in the same way metrics like the h-index have risen to relevance. One suggestion, made in a recent Royal Society Open Science article by Maurício Cantor and Shane Gero, is the "R-index", a single metric designed to quantify the contributions of reviewers that takes the journal impact factor, review word count, and editor evaluations as inputs.
Our approach has been to have a core metric (Publons merit), supported by a bevy of interesting and useful metrics and graphs. Publons merit has served our community well over the last year or so - each review receives a transparently-calculated merit score, and a reviewer's merit is the sum of all their review merit scores. It's a metric that encourages open science: a review that is journal-verified, published, and endorsed by a peer receives twice as much merit as its unpublished and non-endorsed equivalent.
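The full merit formula isn't spelled out here, but the doubling rule above can be sketched roughly. In this sketch the base score and the review tuples are hypothetical; only the "published and endorsed earns twice the merit" rule and the "reviewer merit is the sum of review merits" rule come from the description above:

```python
def review_merit(base: float, published: bool, endorsed: bool) -> float:
    """Merit for one review. A published, peer-endorsed review earns
    twice the merit of its unpublished, non-endorsed equivalent.
    (The base score here is a hypothetical placeholder.)"""
    return base * 2 if (published and endorsed) else base

def reviewer_merit(reviews: list[tuple[float, bool, bool]]) -> float:
    """A reviewer's merit is the sum of their review merit scores."""
    return sum(review_merit(base, pub, end) for base, pub, end in reviews)
```

So a reviewer with one open, endorsed review and one closed review of the same base score would end up with 1.5× the merit of a reviewer with two closed reviews.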
The recently-released supporting metrics are not intended to replace Publons merit, nor to tell a particularly compelling story on their own; they are meant to work together to illustrate the reviewing characteristics of a researcher. We have stats for how often you review, how long your reviews are, how open your reviews are, and would like to provide a metric that speaks to journal quality in an easier-to-digest way than looking at the names of the ~25 journals that a reviewer has reviewed for.
The impact factor obviously has its issues, as many (including us) have written about. It is deceptively inconsistent across fields, is heavily misused, and is a pretty terrible proxy for the quality of an individual article. But, as the best-known proxy for journal quality, can it reveal anything interesting about a reviewer's contributions? Or is it just likely to encourage harmful behaviour (like declining to review for low-IF journals)?
In our office discussions pre-release we answered yes to the former, and no to the latter. But that is what feedback is for. We are certainly not wedded to keeping the average journal impact factor metric on Publons reviewer profiles. If reviewers don't like it, or if it's considered more harmful than it is useful, we will remove it.
One alternative is to use Google's h5-index (like the h-index for a journal, limited to the last five years). Would this be preferable, or is it subject to the same issues?
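For concreteness, the h5-index is computed exactly like the h-index, restricted to articles published in the last five years: the largest number h such that h of those articles have at least h citations each. A minimal sketch (the citation counts are made-up inputs):

```python
def h5_index(citations_last_5y: list[int]) -> int:
    """h5-index: the largest h such that h articles published in the
    last five complete years have at least h citations each."""
    counts = sorted(citations_last_5y, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # at least `rank` articles have >= `rank` citations
        else:
            break
    return h
```

For example, a journal whose recent articles were cited [10, 8, 5, 4, 3] times would have an h5-index of 4: four articles have at least four citations each, but not five with at least five.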
Finally, if not these metrics, what would you like to see in the way of peer review metrics?
Any and all feedback is welcome, as always. Feel free to comment here, add your thoughts to your own blog, or email us at firstname.lastname@example.org.