A noise audit of the peer review of a scientific article: a WPOM journal case study





peer review, evaluation of scientific journals, research evaluation, decision making, decision noise, making judgements


This study aims to be one of the first to analyse the noise level in the peer review process of scientific articles. Noise is defined as the undesired variability in the judgements made by professionals on the same topic or subject. Here we refer to evaluative judgements in which experts are expected to agree, as is the case when judging the quality of a scientific work. To measure noise, all that is needed is several judgements made by different people on the same case, so that their dispersion can be analysed (what Kahneman et al. call a noise audit). This was the procedure followed in this research. We asked a set of reviewers from the journal WPOM (Working Papers on Operations Management) to review the same manuscript, which had previously been accepted for publication in this journal, although the reviewers were unaware of that fact. The results indicated that, if two reviewers were used, the probability of this manuscript not being published would be close to 8%, while the probability of it having an uncertain future would be 40% (one favorable opinion and one unfavorable opinion, or both suggesting substantial changes). With only one reviewer, the audited work would have encountered significant challenges for publication in 25% of cases. The great advantage of measuring noise is that, once it has been measured, it is usually possible to reduce it. This article concludes by outlining some of the measures that scientific journals can put in place to improve their peer review processes.
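The two-reviewer arithmetic described above can be illustrated with a minimal sketch. It assumes every pair of audited reviewers is equally likely to be assigned to the manuscript. The recommendation counts, the category labels and the rule in `classify_pair` are hypothetical and are not the study's data or the journal's editorial policy; only the definition of an "uncertain" outcome is taken from the abstract.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical recommendations from 12 reviewers of the same manuscript.
# These counts are illustrative only; they are NOT the study's data.
JUDGEMENTS = (
    ["favorable"] * 7
    + ["substantial_changes"] * 3
    + ["unfavorable"] * 2
)


def classify_pair(a, b):
    """Map two recommendations to an editorial outcome (assumed rule)."""
    pair = {a, b}
    # From the abstract: one favorable plus one unfavorable opinion,
    # or both reviewers asking for substantial changes, is "uncertain".
    if pair == {"favorable", "unfavorable"} or pair == {"substantial_changes"}:
        return "uncertain"
    # Assumption: any other pair containing an unfavorable opinion
    # leads to the manuscript not being published.
    if "unfavorable" in pair:
        return "not_published"
    return "likely_published"


def pair_outcome_probabilities(judgements):
    """Probability of each outcome over all equally likely reviewer pairs."""
    counts = Counter(classify_pair(a, b) for a, b in combinations(judgements, 2))
    total = sum(counts.values())
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}


def single_reviewer_risk(judgements):
    """Share of reviewers whose opinion alone would not be favorable."""
    not_favorable = sum(1 for j in judgements if j != "favorable")
    return Fraction(not_favorable, len(judgements))


if __name__ == "__main__":
    for outcome, p in sorted(pair_outcome_probabilities(JUDGEMENTS).items()):
        print(f"{outcome}: {float(p):.1%}")
    print(f"single reviewer not favorable: {float(single_reviewer_risk(JUDGEMENTS)):.1%}")
```

Enumerating pairs without replacement reflects the fact that the same reviewer cannot be assigned twice. A journal with a different decision rule would only need to change `classify_pair`.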



Author Biographies

Tomas Bonavia, University of Valencia

Department of Social Psychology

Juan A. Marin-Garcia, Universitat Politècnica de València

ROGLE Departamento de Organización de Empresas

ResearcherID A-4069-2011 Scopus Author ID 14024595300



Álvarez, S.M.; Maheut, J. (2022). Protocol: Systematic literature review of the application of the multicriteria decision analysis methodology in the evaluation of urban freight logistics initiatives. WPOM-Working Papers on Operations Management, 13(2), 86-107. https://doi.org/10.4995/wpom.16780

Ariely, D. (2008). Las trampas del deseo. Cómo controlar los impulsos irracionales que nos llevan al error. Ed. Ariel.

Bedeian, A.G. (2004). Peer review and the social construction of knowledge in the management discipline. Academy of Management Learning & Education, 3(2), 198-216. https://doi.org/10.5465/amle.2004.13500489

Belur, J.; Tompson, L.; Thornton, A.; Simon, M. (2021). Interrater reliability in systematic review methodology: Exploring variation in coder decision-making. Sociological Methods & Research, 50(2), 837-865. https://doi.org/10.1177/0049124118799372

Benda, W.G.G.; Engels, T.C.E. (2011). The predictive validity of peer review: A selective review of the judgmental forecasting qualities of peers, and implications for innovation in science. International Journal of Forecasting, 27(1), 166-182. https://doi.org/10.1016/j.ijforecast.2010.03.003

Bornmann, L. (2011). Scientific peer review. Annual Review of Information Science and Technology, 45(1), 197-245. https://doi.org/10.1002/aris.2011.1440450112

Ernst, E.; Saradeth, T.; Resch, K.L. (1993). Drawbacks of peer review. Nature, 363(6427), 296. https://doi.org/10.1038/363296a0

Fiske, D.W.; Fogg, L. (1990). But the reviewers are making different criticisms of my paper: Diversity and uniqueness in reviewer comments. American Psychologist, 45(5), 591-598. https://doi.org/10.1037/0003-066X.45.5.591

Hirst, A.; Altman, D.G. (2012). Are peer reviewers encouraged to use reporting guidelines? A survey of 116 health research journals. PLoS ONE, 7(4), e35621. https://doi.org/10.1371/journal.pone.0035621

LeBreton, J.M.; Senter, J.L. (2008). Answers to 20 questions about interrater reliability and interrater agreement. Organizational Research Methods, 11(4), 815-852. https://doi.org/10.1177/1094428106296642

Kahneman, D. (2012). Pensar rápido, pensar despacio. Ed. Debate.

Kahneman, D.; Rosenfield, A.M.; Gandhi, L.; Blaser, T. (2016). Noise: How to overcome the high, hidden cost of inconsistent decision making. Harvard Business Review, 94(10), 38-46.

Kahneman, D.; Sibony, O.; Sunstein, C.R. (2021). Ruido. Un fallo en el juicio humano. Ed. Debate.

Krippendorff, K. (2011). Computing Krippendorff's alpha-reliability. Retrieved from https://repository.upenn.edu/asc_papers/43

Marin-Garcia, J.A.; Santandreu-Mascarell, C. (2015). What do we know about rubrics used in higher education? Intangible Capital, 11(1), 118-145. https://doi.org/10.3926/ic.538

Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; PRISMA Group (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Annals of Internal Medicine, 151(4), 264-269. https://doi.org/10.7326/0003-4819-151-4-200908180-00135

Rezaei, A.R.; Lovorn, M. (2010). Reliability and validity of rubrics for assessment through writing. Assessing Writing, 15(1), 18-39. https://doi.org/10.1016/j.asw.2010.01.003

Voskuijl, O.F.; Van Sliedregt, T. (2002). Determinants of interrater reliability of job analysis: A meta-analysis. European Journal of Psychological Assessment, 18(1), 52-62. https://doi.org/10.1027//1015-5759.18.1.52

Weller, A.C. (2001). Editorial peer review: its strengths and weaknesses. Ed. American Society for Information Science and Technology.




How to Cite

Bonavia, T., & Marin-Garcia, J. A. (2023). A noise audit of the peer review of a scientific article: a WPOM journal case study. WPOM-Working Papers on Operations Management, 14(2), 137–166. https://doi.org/10.4995/wpom.19631



Case Report Papers
