Researcher assessment criteria are changing globally


By Karina Toledo  |  Agência FAPESP – Although excellence is one of the most highly valued qualities in academia, it is hard to define. Indicators and metrics based on numbers of articles and citations emerged in the mid-twentieth century as a tool to facilitate this task, and gained significance in the 1980s as a means of identifying relevant research that merits funding.

The first international university rankings appeared at the start of this century, in a movement led by Shanghai Jiao Tong University, China, which in 2003 presented the Academic Ranking of World Universities (ARWU). European institutions took steps to catch up, and the following year saw the launch in the UK of the Times Higher Education Supplement World University Rankings, now known as Times Higher Education (THE), soon joined by QS World University Rankings, compiled by global higher education analyst Quacquarelli Symonds. These two pioneers are currently among the most influential rankings.

The indicators and methodology used to evaluate the “world’s best universities” vary according to the purpose and origin of each ranking, explains Jacques Marcovitch, a professor at the University of São Paulo (USP) who coordinates the Metricas.edu Project. With FAPESP’s support, the team led by Marcovitch monitors and analyzes national and international comparisons of universities. Their aims include developing strategies to strengthen the performance of institutions in São Paulo – already the Brazilian leaders – and of other universities that participate in international comparisons.

“We can say that there are three major categories: commercial rankings [e.g. THE, QS and newspaper Folha de São Paulo’s RUF], generally intended for families who want to choose a university for their children; national rankings [e.g. ARWU and SCImago Institutions Rankings], constructed in accordance with the higher education missions and priorities of the country concerned; and academic rankings [e.g. U-Multirank], which are not intended to classify universities but to compare them in a sort of benchmarking exercise,” Marcovitch explains. He has written at length about the subject in a contribution to Repensar a Universidade. Desempenho Acadêmico e Comparações Internacionais, a collective volume available from USP’s Open Book Portal.

In the wake of university rankings, the world saw the appearance of platforms that claim to measure the individual performance of researchers, such as AD Scientific Index, Research.com or Highly Cited Researchers; the last of these is compiled by a UK company called Clarivate Analytics. While they use different methodologies, time divisions and databases, these individual rankings are all strongly based on quantitative indicators, such as the number of published articles (productivity), total citations or citations per article (impact), and the h-index or derivatives (productivity and impact combined).
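For readers unfamiliar with the h-index, the following is a minimal sketch of its standard definition: the largest number h such that a researcher has at least h papers with at least h citations each. The function name and the citation counts are hypothetical and purely illustrative; none of the platforms mentioned above is claimed to compute the indicator in exactly this way.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for two illustrative researchers (not real data).
few_highly_cited = [250, 120, 2]
many_moderately_cited = [12, 11, 10, 9, 9, 8, 7, 6]

for name, papers in [("few highly cited", few_highly_cited),
                     ("many moderately cited", many_moderately_cited)]:
    print(name, "| papers:", len(papers),
          "| total citations:", sum(papers),
          "| h-index:", h_index(papers))
```

In this made-up example, the first researcher has far more total citations (372 against 72) but a much lower h-index (2 against 7), which illustrates how the choice of indicator alone can reverse a comparison between two researchers.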

The accuracy of this kind of approach began to be questioned more sharply in the last decade, and there were moves to make research assessment broader, more inclusive, and better aligned with local and global needs. An example is the Declaration on Research Assessment (DORA), developed during the December 2012 Annual Meeting of the American Society for Cell Biology in San Francisco, published in 2013 and signed since then by 23,993 individuals and organizations in 164 countries.

The San Francisco Declaration, which was ten years old on May 16, 2023, recommends elimination of the use of journal-based metrics such as journal impact factors as a surrogate measure of the quality of individual research articles, or to assess an individual scientist’s contributions, or in hiring, promotion or funding decisions.

In September 2014, researchers attending the 19th International Conference on Science and Technology Indicators in Leiden (Netherlands) published the Leiden Manifesto on Research Metrics, which proposes ten principles for the measurement of research performance. The first states that “Quantitative evaluation should support qualitative, expert assessment”, and “assessors must not be tempted to cede decision-making to the numbers. Indicators must not substitute for informed judgment”. Another principle worth highlighting is the need to “Protect excellence in locally relevant research”, and to give due weight to “high-quality non-English literature”, especially where locally relevant research does not feature among the “hot topics” for high-impact journals. Examples in the Brazilian case include neglected diseases.

In Europe, the Agreement on Reforming Research Assessment, drafted by Science Europe, the European University Association (EUA) and Dr Karen Stroobants, supported by the European Commission, was published in July 2022. Resulting from a co-creation process involving more than 350 public and private organizations in 40 countries, including funding agencies, universities, research centers and scientific associations, it includes principles for assessment criteria and processes such as “basing assessment primarily on qualitative judgment, for which peer review is central, supported by responsible use of quantitative indicators”, and “core commitments” such as the decision to “abandon inappropriate uses in research assessment of journal- and publication-based metrics, in particular inappropriate uses of Journal Impact Factor (JIF) and h-index”, and avoid the use of metrics used by international rankings, which are “inappropriate for assessing researchers”.

In the latest move on this front, in June 2023 the Global Research Council (GRC), which features the heads of the world’s leading science funding agencies, endorsed a Statement of Principles Recognizing and Rewarding Researchers. Noting the “important role” played by funders in “future-proofing research assessment procedures and in facilitating culture change”, the statement calls for “broad and holistic ways of recognizing and rewarding” researchers, “adapted to the relevant contexts in which the assessment takes place, such as the disciplinary field or career stage”. It also advocates “the promotion of equity, diversity, and inclusion” to guide all aspects of “responsible research approaches and practices” (read more at: agencia.fapesp.br/41543).

For Marcio de Castro Silva Filho, FAPESP’s Scientific Director, the purpose of scientific research goes well beyond simply seeking answers to questions and should take social and economic relevance into account. In this context, he explains, FAPESP has created a number of programs oriented toward strategic objectives and recently issued a new call for proposals to establish Science for Development Centers, with the aim of fostering research centers geared to developing solutions to problems relevant to São Paulo state.

“This discussion has moved in the direction of using quantitative indicators of productivity and impact with great caution and consideration,” says Sergio Salles-Filho, a professor at the State University of Campinas (UNICAMP) and principal investigator for a research project on the “science of science”. 

“It’s now quite widely accepted that these metrics are biased in several dimensions, especially gender, language, origin and race. Indeed, there’s a field dedicated to measuring these biases, called the science of science or research on research. And there are researchers who propose the use of correction factors to minimize the distortion caused by these biases in assessment.”

According to Salles-Filho, science of science (SoS) or research on research (RoR) studies have shown that when quantitative indicators are the basic criteria, performance is more likely to be rated highly for white male native English speakers based in the UK or USA, where most high-impact journals are published. “That doesn’t mean the front-ranking scientists aren’t good. On the contrary, they’re outstanding in the sense of well above average. The problem is that there are a great many people who are also very good but will never be highlighted by these indicators,” he says.

Scientists whose research interests are highly relevant for their countries but not valued internationally tend to be “hidden” by this type of assessment, says Justin Axel-Berg, a researcher at USP and a member of the Metricas.edu Project. “If this is the only criterion used to define quality, researchers will be strongly incentivized not to meet local needs. When performance is assessed on the basis of bibliometric indicators, everything becomes a competition and important aspects like collaboration with laboratory colleagues or effectiveness as a teacher or thesis advisor take a back seat,” he says.

New paradigm

In Salles-Filho’s opinion, the criteria used to define excellence are changing globally. The “classic indicators” are being replaced by new metrics that more objectively evidence the relevance of research or an institution to society, although how quickly or intensely this change will come about is unclear at this point.

“USP moved up significantly in this year’s QS World University Rankings, jumping from 115th place to 85th. One reason for this was the inclusion of a new indicator to reflect sustainability. In addition, some indicators, such as employment outcomes and employer reputation, had their weightings increased. All universities want to score as highly as possible on these,” he says.

In the 13th QS Latin America & The Caribbean Ranking, published this month, USP is placed at the top of 430 institutions in 25 countries. In this case, too, a new indicator called “international research network” was included to reflect the degree to which institutions are internationalizing.

Integrative approach

Marcovitch advocates an integrative approach to researcher assessment by universities and other institutions, encompassing teaching, research, extension, culture and outreach, meaning by the latter the extent to which they “take content to society to help build a better future”.

“The main issue we’re discussing now is how to migrate from metrics based on results [outcomes, publications and citations] to impact-based metrics. Having an impact is having a constructive effect on living conditions in a society, such as during the COVID-19 public health crisis, when science provided answers to very concrete questions,” he says.

If research institutions want to be highly valued, they must listen to society and identify what it expects in the event of a radical crisis, he argues, adding that it was relatively easy to find out what society wanted during the pandemic – vaccination and appropriate hospital care – and scientific institutions responded satisfactorily.

“Other crises are very much part of society’s demands but less evident, starting with the demographic transition – the profound changes that are occurring in the demographic profile of Brazil and São Paulo, with significant social impacts on education and health, among other things,” he says. “The digital transition is another example, requiring new competencies to universalize access to digital technology and services, a key part of building the future. There’s also a socioeconomic transition, involving changes in labor relations, a wider gap between income levels, and growing polarization of mindsets across society. This polarization heightens the sense of insecurity. Lastly, the ecological transition requires a reduction in greenhouse gas emissions, an end to deforestation, and conservation of nature with priority for the Amazon and other biomes. The main challenge is identifying interlocutors in society and working with them to construct metrics for use in monitoring responses to their expectations.”
