Written Evidence Submitted by Eastern Arc
Eastern Arc is the regional research consortium comprising the universities of East Anglia, Essex and Kent. This response is collective, and follows discussions with academics and professional services staff at the three universities.
It is submitted by Phil Ward, Director of Eastern Arc, on behalf of that group.
Question 1: What is the breadth of the reproducibility crisis and what research areas is it most prevalent in?
The scale of the crisis depends on how reproducibility is defined, and there is ‘no generally accepted, scientific standard for determining whether previous research is reproducible/replicable’ (Duvendack et al 2017, p47). However, it is useful to distinguish between research that is reproducible, replicable and generalizable.
● Reproducible research is that for which results can be duplicated using the same materials and procedures as were used by the original investigator;
● Replicable research is that for which results can be duplicated using the same procedures but new data;
● Generalizable research is that for which results can be applied to other populations, contexts, and time frames (Bollen 2015).
These terms are sometimes used interchangeably, and the literature lacks consensus on agreed definitions. In addition, attempts to reproduce research can be ‘narrow’ or ‘wide’ (Pesaran 2003), or pure, statistical or scientific (Hamermesh 2007).
Nevertheless, the crisis is significant and there is a need to address it. A Nature survey found that 70% of respondents had ‘tried and failed to reproduce another scientist's experiments,’ although ‘73% said that they think that at least half of the papers in their field can be trusted’ (Baker 2016).
These issues of definition, and the breadth of any survey, make it ‘difficult to determine replication rates within a discipline’ (Duvendack et al 2017, p47). However, psychology has been cited as particularly problematic (Stanley et al 2018), since ‘studies of a given psychological phenomenon can never be direct or exact replications of one another’ (McShane et al 2017). Economics has also been under the spotlight: Duvendack, Palmer-Jones and Reed (2015) and Chang and Li (2015) measured replication rates in economics and reported low rates of successful replication, while Camerer et al (2016) were slightly more successful in attempting to replicate 18 studies in experimental economics.
Question 2: What are the issues in academia that have led to the reproducibility crisis?
There are a number of reasons for the publication of findings that cannot be reproduced. Broadly, they fall into two categories: those that are instrumental and those that are accidental.
● Instrumental: The number of global research outputs is doubling every nine years (Bornmann and Mutz 2015). At the same time, academic careers are increasingly precarious (OECD 2021). As a result there is a strong drive to get research noticed, funded and published, and thereby raise an academic’s profile. An important way of doing so is to produce research with innovative or interesting results that has the potential to disrupt accepted paradigms. Such work is more likely to be accepted for publication, chosen for funding, and picked up by mainstream media.
This ‘publication bias’ has led to statistically significant, and often positive, results taking precedence. At the same time, research that contradicts such findings tends to be less well regarded (Dewald, Thursby and Anderson 1986). Moreover, few journals publish replication studies (see Duvendack, Palmer-Jones and Reed 2015 & 2017 for a discussion of which economics journals publish replications), and researchers who undertake such work may be seen as distrustful and/or malevolent (Duvendack et al 2017); they have even been characterized as ‘research parasites’ (Longo and Drazen 2016).
That is not to say that researchers who emphasize statistically significant findings are necessarily conscious of falling prey to the ‘cult of statistical significance’ (Ziliak and McCloskey 2008). However, they work in an environment where statistically significant, apparently novel results are more likely to attract interest, which encourages ‘HARK-ing’, or hypothesizing after the results are known (Kerr 1998).
● Accidental: Not all irreproducible results stem from deliberate choices. The ATCC (n.d.) highlights a number of issues around training, resources and supervision that can lead to such results. These include a lack of access to methodological details, raw data and research materials; the use of misidentified, cross-contaminated, or over-passaged cell lines and microorganisms; an inability to manage complex datasets; and poor research practices and experimental design.
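The ‘instrumental’ mechanism above can be sketched with a short simulation. The setup below (1,000 hypothetical studies of a fair coin, so the null hypothesis is true in every one) is our own illustration, not drawn from the literature cited in this submission: even with no real effects anywhere, selective reporting of the ‘significant’ studies would still yield a steady stream of publishable false positives.

```python
import random
from math import comb

random.seed(42)  # fixed seed so the illustration is repeatable

N_STUDIES, N_FLIPS, ALPHA = 1000, 100, 0.05  # illustrative parameters

def two_sided_p(heads, n=N_FLIPS):
    """Exact two-sided binomial p-value against a fair coin (p = 0.5)."""
    dev = abs(heads - n / 2)
    return sum(comb(n, j) for j in range(n + 1)
               if abs(j - n / 2) >= dev) / 2 ** n

# Count how many true-null studies nevertheless cross the significance bar.
significant = sum(
    two_sided_p(sum(random.random() < 0.5 for _ in range(N_FLIPS))) < ALPHA
    for _ in range(N_STUDIES)
)
print(f"{significant} of {N_STUDIES} true-null studies 'significant' at p < {ALPHA}")
```

Roughly 3-5% of these null studies come out ‘significant’ by chance alone; if journals publish mainly that slice, the published record overstates the evidence, and HARKing (framing a hypothesis around whichever result happened to clear the bar) compounds the effect.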
Question 3: What is the role of the following in addressing the reproducibility crisis?
● Research funders, including public funding bodies
Research funders can play a central part in addressing the crisis, and many are already taking important steps to do so. The open access and open data movements have helped increase transparency in the research process, and the introduction of ‘data management plans’ (DMPs) by UKRI has enabled others to interrogate findings more fully and attempt to reproduce results.
However, as with publishers, there is a danger of favouring counter-intuitive research proposals in the initial selection process. There is also little appetite for funding studies that attempt to reproduce results, as these are perceived as less novel than original research; this disincentivises researchers from undertaking replication work. It also feeds into how publishers view reproducibility: they often fear that replications will be cited less frequently, lowering a journal’s impact factor, which is calculated from the number of citations to articles published in that journal.
In addition, there is often an unconscious bias in the peer review process towards particular individuals, research groups, universities, fields of research or methodologies, which may mean that some proposals get less rigorous scrutiny. Wellcome and UKRI are taking positive steps to address this in their people and culture strategies, and the increasingly widely adopted San Francisco Declaration on Research Assessment (DORA) identifies a growing ‘momentum toward more sophisticated and meaningful approaches to research evaluation that can now be built upon and adopted by all of the key constituencies involved.’
There is also the problem of enforcement. Although all applicants have to complete DMPs, the checks on whether data has been deposited in appropriate repositories are weak. There is also a need to deposit code as well as data; the latter is of limited value without the former.
● Research institutions and groups
The higher education sector has become increasingly marketised. As a result, research institutions are having to position themselves positively against the competition, including their performance in league tables. This has had a number of unintended consequences, including a push to get research funding and increase citation rates.
This is particularly apparent in the way that institutions position themselves for the Research Excellence Framework (REF). Addressing the perverse incentives that result from this is partly the responsibility of government (see below), but it is also the responsibility of individual institutions to resist these pressures and give people the space and freedom to undertake robust research that may be slow in producing results, citations and grants.
● Individual researchers
As Duvendack et al (2017) point out, ‘while there are increasing calls for journals to improve data sharing and transparency, there is also significant resistance among researchers, as evidenced by opposition to the adoption of a [data access policy] at top finance journals...and the online petition against the data access and research transparency (DA-RT) initiative in political science’ (p48).
Such attitudes are changing, and it is important that individual researchers are willing to change their behaviours to support openness and transparency in their research. We outline some of the ways they can do this in our answer to Q4, below.
● Publishers
As with funders, publishers are central to embedding good practice in reproducibility in their frameworks and processes. In some areas, few journals publish replications or make data and code available (Duvendack et al 2015 & 2017). However, significant changes are happening, such as the pre-registration of research and, in some cases, journals accepting articles for publication before they have been written, based on an outline of the proposed research. This helps to address the issue of ‘publication bias’ discussed above.
It should be recognised that some progress has been made in these areas. For instance, a ‘TOP Factor’ has been created that scores journals on ten different criteria, including availability of data and policies on pre-registration.
● Governments and the need for a unilateral response / action
Government policy and resultant legislation sets the framework in which research takes place, and it is therefore crucial for putting in place measures that will necessitate a change of culture. Most current policies, strategies, white papers and roadmaps, including the R&D Roadmap, Innovation Strategy, Plan for Growth, Integrated Review and Data Saves Lives Strategy, do not address this.
Beyond policy and legislation, the government can set the tone and encourage good behaviour. The current consultation on the future of research assessment needs to recognise the unintended consequences and perverse incentives of the existing system, and consider working with individual universities and research organisations, with publishers and with funders to develop a system that will change the framework and embed a culture that favours transparent and robust research.
Question 4: What policies or schemes could have a positive impact on academia’s approach to reproducible research?
There is a wide range of possible policies, actions, schemes and incentives to help change academia’s approach to reproducible research. These include:
● Funders providing discrete funding for reproduction studies. Some funders are already doing so, such as the International Initiative for Impact Evaluation (3ie), but there needs to be a wider adoption of such policies.
● Funders developing a database of underused software and hardware, which may be necessary for the analysis of specific data as part of a reproduction study;
● Funders and publishers making the reviewing and enforcement of DMPs and DAPs more robust;
● Publishers mandating pre-registration and accepting articles for publication based on an outline of research;
● Publishers employing staff and/or students to routinely run the code on submitted data, as a small number of journals, such as The American Economic Review, already do; some journals, such as The International Journal for Re-Views in Empirical Economics, exist entirely to publish replications.
● Institutions being encouraged to work together to produce common policies and monitoring, and required to integrate open and reproducible research practices into their incentive structures at all career levels. This should be embedded in their research ethics and also involve other staff, including technicians and data managers.
● Individuals changing the way that postgraduate students and early career researchers are trained in research methodologies and publication strategies. The Berkeley Initiative for Transparency in the Social Sciences (BITSS) has developed a textbook intended to train people in undertaking open science, and other resources exist to support those teaching students about replication.
● Government working to improve the scientific literacy of politicians, policymakers and civil servants. Myers and Coffé (2021) highlight the fact that, ‘of the 541 MPs with higher education degrees in the 2015-2017 Parliament, only 93 (17%) held degrees in STEM subjects; for comparison, 46% of UK students in 2019 graduated in STEM subjects.’ Such a grounding in STEM would enable a better understanding of the ‘science’ underlying research findings; without it there is a tendency to accept results at face value and act accordingly. There is a need to ‘embrace uncertainty’ and to accept that results are not necessarily clear cut;
● Representative bodies for research disciplines mandating the use of systematic reviews, such as the Cochrane or Campbell Collaborations, or the Open Synthesis group, to review, rate, synthesise and publish best available evidence.
Question 5: How would establishing a national committee on research integrity under UKRI impact the reproducibility crisis?
This is difficult to answer without knowing the remit and parameters for such a committee. In principle it would be welcomed, and should address key issues around the public funding of reproducible studies. However, the Government should examine existing bodies (such as the UK Reproducibility Network (UKRN), BITSS, or the Replication Network) before establishing a new body, to avoid duplication.
ATCC. No date. “Six factors affecting reproducibility in life science research and how to handle them.” Nature Portfolio [online] Accessed 20 Sept 2021: https://www.atcc.org/the-science/authentication
Baker, M. 2016. “1,500 scientists lift the lid on reproducibility.” Nature 533, 452–454.
Bollen, K., Cacioppo, J. T., Kaplan, R. M., Krosnick, J. A., & Olds, J. L. 2015. Social, behavioral, and economic sciences perspectives on robust and reliable science [Report of the Subcommittee on Replicability in Science Advisory Committee to the National Science Foundation Directorate for Social, Behavioral, and Economic Sciences]. National Science Foundation.
Bornmann, L, and Mutz, R. 2015. “Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references”. Journal of the Association of Information Science and Technology 66 (11): 2215-2222
Camerer, C.F., Dreber, A., Forsell, E., Ho, T., Huber, J., Johannesson, M., …, Wu, H., 2016. Evaluating Replicability of Laboratory Experiments in Economics. Science, 351(6280): 1433-1436.
Chang, A.C. & Li, P., 2015. Is Economics Research Replicable? Sixty Published Papers from Thirteen Journals Say “Usually Not”. Finance and Economics Discussion Series 2015-083. Washington: Board of Governors of the Federal Reserve System. Retrieved from http://dx.doi.org/10.17016/FEDS.2015.083
Dewald, W.G., Thursby, J.G. & Anderson, R.G. 1986. “Replication in Empirical Economics: The Journal of Money, Credit and Banking Project.” American Economic Review 76 (4): 587–603.
Duvendack, Maren et al. 2017. “What Is Meant by ‘Replication’ and Why Does It Encounter Resistance in Economics?” American Economic Review 107 (5): 46–51.
Duvendack, M., Palmer-Jones, R. & Reed, W.R., 2015. Replications in Economics: A Progress Report. Econ Journal Watch, 12(2): 164-191
Hamermesh, Daniel S. 2007. “Viewpoint: Replication in Economics.” Canadian Journal of Economics 40 (3): 715–33.
Myers, J. and Coffé, H. 2021. “MPs with both an educational and occupational background in STEM are the most likely to demonstrate engagement with STEM issues in Parliament.” [online] Impact of Social Sciences. Accessed 24 September 2021. https://blogs.lse.ac.uk/impactofsocialsciences/2021/08/16/mps-with-both-an-educational-and-occupational-background-in-stem-are-the-most-likely-to-demonstrate-engagement-with-stem-issues-in-parliament/
Kerr, Norbert L. 1998. “HARKing: Hypothesizing After the Results Are Known.” Personality and Social Psychology Review 2 (3): 196–217.
Longo, Dan L., and Jeffrey M. Drazen. 2016. “Data Sharing.” New England Journal of Medicine 374: 276–77.
McShane, B. et al. 2017. “Large-Scale Replication Projects in Contemporary Psychological Research.” The American Statistician 73: 99 - 105.
OECD. 2021. “Reducing the precarity of academic research careers.” OECD Science, Technology and Industry Policy Papers. https://doi.org/10.1787/0f8bd468-en
Pesaran, Hashem. 2003. “Introducing a Replication Section.” Journal of Applied Econometrics 18 (1): 111.
Stanley, T. et al. 2018. “What Meta-Analyses Reveal About the Replicability of Psychological Research.” Psychological Bulletin 144: 1325–1346.
Ziliak, S., & McCloskey, D. N. 2008. The cult of statistical significance: How the standard error costs us jobs, justice, and lives. University of Michigan Press.