Watch: Has generative AI already transformed research assessment and will it unlock the REF’s regulatory burden?

As part of our Campaign for Social Science events programme, earlier this week we held a webinar featuring Professor Richard Watermeyer presenting findings from a recent study of how generative AI (GenAI) is impacting and shaping the UK’s Research Excellence Framework (REF) 2029.

Chaired by Rose Stephenson, Director of Policy and Strategy at the Higher Education Policy Institute, this was the second webinar this year in our series, in partnership with the UK Evaluation Society and the Social Research Association, on the theme of how we can evaluate, understand, and manage different aspects and uses of AI as it continues to rapidly change our economy and our society.

Richard is Professor of Education at the University of Bristol and Co-Director of the Centre for Higher Education Transformations. He also led the REF AI research project which was focused on understanding “existing, emerging and prospective applications of, and attitudes towards, the utilisation of GenAI tools for purposes of institutional REF preparations and the development of REF submissions, but also in the context of the REF formal assessment process itself”, in terms of peer review, evaluating outputs, impact case studies and more.

Richard began by outlining the context in which the REF AI research project came about, explaining that there was previously a lack of knowledge of how AI tools are being deployed in relation to the REF, as well as of perceptions of GenAI tools and experiences of their use across UK higher education. He also pointed to concerns around the financial challenges facing many institutions in the higher education sector, and whether the positive effects of the REF may have been exhausted, as well as the discontinuation of other performance-based research funding systems elsewhere in the world, including in Australia and New Zealand.

He said, “The project also emerged as a response to the many concerns […] around the continued fitness of the purpose of the REF […] in terms of its affordability, in terms of the efficiency of its process and term, in the context of it being significantly labour intensive both at the level of institutional level preparations, but also in the context of 700 plus reviewers coming together over a prolonged period of time.”

Richard went on to describe the methods used in the rapid qualitative investigation, which took place between June and September 2025 and comprised 16 in-person institutional visits across a range of Russell Group, post-92, research-intensive and specialist institutions spanning the four nations of the UK. Data was collected through focus groups involving over 200 senior academics and professional services staff with direct responsibility for REF within their institutions; 32 interviews with Pro Vice-Chancellors and institutional REF leads; and a national online survey which received responses from an additional 400 academic and professional services staff working within UK higher education.

Following this, Richard summarised the core findings from the study, beginning with an overview that there was equal parts enthusiasm and anxiety when it came to the use of AI in a REF context.

He said, “There is a pervasive view that AI is coming fast and is already probably ever present, if in subterranean ways or less visible ways should we say, within institutions, but also that institutions are largely unevenly prepared in terms of their response to the integration of these tools for the purpose of research assessment.”

Richard explained that the use of AI tools for the purposes of REF varied dramatically between institutions and that this was strongly linked to levels of confidence in its use, which was often connected to institutional resourcing and whether institutional leaders were champions of the use of these tools.

He said, “Curiously, although there’s a fairly high degree of limited infrastructure and certainly low organisational maturity in apropos of the use of these tools for the REF, we did discover in some institutions the development of already operationalisation of in-house tools for AI that was specifically being utilised for dimensions of the REF, like output review, or for instance, impact evidence gathering and the creation of impact case studies.”

Richard also pointed to findings on the perceived risks identified by participants in the research, such as bias and inaccuracy in GenAI models; data protection risks for staff submitting unpublished research into publicly available AI tools; and a sense that AI would still be used, albeit covertly, by staff if formally discouraged by their institution. He also explored significant differences in resource capacity, in-house expertise and time allocated for REF between institutions, and associated concerns that this creates a disparity in AI capacity and capability between institutions. He commented that their research showed that whilst higher-resourced institutions saw the possibilities that AI could afford in terms of reducing workload by delegating tasks to these tools, lower-resourced institutions did not necessarily have the capacity or resource to explore them. As such, participants in the focus groups highlighted that inequality in the utilisation of AI tools could deepen inequities in preparation for REF or result in a lack of capability to respond to REF demands.

He said, “Less resourced institutions worried much more that GenAI uptake will further deepen existing REF inequalities and institutional variation, not in terms of the quality of the research itself or the research environment or the kinds of impact, but the resources within institutions available to actually submit competitive REF submissions.”

In addition, Richard explored concerns around a lack of guidelines or guardrails for the use of AI in this context and that there was an overall sense of low AI literacy among staff involved in REF tasks.

Richard ended his presentation by summarising the key findings from his research, including the differences in attitudes towards incorporation of GenAI in the REF, and that opposition was strongest among academics, particularly those in Russell Group institutions, those from the humanities, arts and social sciences, and those who were infrequent or non-users of AI. However, despite this opposition, he pointed to the fact that the research showed there is broad consensus that GenAI tools are already being widely used for institutional REF preparations; that it is highly likely, given technological advancement, that REF panellists will make use of GenAI tools in 2029; and that in the future the REF will become fully automated. He also highlighted that there needs to be a recognition that utilisation of AI tools is not without costs, in terms of infrastructure, staff training, and subscriptions.

He said, “Finally, unsurprisingly, and also perhaps quite positively, there’s a sense that despite suggestions that the REF might be fully automated [in the future], for the present time at least, the REF will still require human stewarding and oversight not least in terms of ensuring that the utilisation of these tools remain appropriate.”

A Q&A followed Richard’s presentation, and the full webinar is available to watch below.