By John Holmwood and Stephen McKay
In this analysis of REF 2014, we extend the analysis of disparities across sub-panels that we recently published in a blog for the Times Higher. We show how these outcomes are not ‘facts’ about the relative standing of the social sciences, but artifacts of a flawed exercise.
Notwithstanding its severe limitations and pronounced neo-liberal features set out by Derek Sayer, among others, the UK Research Excellence Framework (REF) matters in its consequences. This is because it is one of the means by which research funding is distributed (with funding, under the current formula, increasingly concentrated on the proportion of 4* activity within a Unit of Assessment). It also makes (and breaks) the reputations of departments, and their institutions more generally, as well as providing judgments of the relative standing of different subject areas. Indeed, it may also contribute to decisions leading to the break-up and even closure of departments, especially in the context of more fragile and fluctuating undergraduate demand.
REF2014 introduced new arrangements for the evaluation of research over those of RAE2008, making comparison of results more difficult. Most importantly, it introduced criteria associated with the impact of research upon users and external, non-academic audiences. This replaced the previously separate esteem indicators (which were absorbed into research environment) and was evaluated via case studies (2 for the first 14 FTE staff submitted, with a further case study for each additional 1-10 FTEs; i.e. 3 for 15-24 FTEs, 4 for 25-34, etc.) and a statement of strategies used to promote impact within the unit of assessment. This had two consequences. The first was to encourage ‘reverse engineering’ of submissions, limiting the number of staff submitted in order to manage the number of case studies required. The second was to give each case study a disproportionate impact upon overall grade point averages when compared to outputs.
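The case-study requirement described above is a simple step function of submitted FTE staff. As a sketch (the function name is ours, not HEFCE's), it can be written as:

```python
def case_studies_required(fte: float) -> int:
    """Impact case studies required for a REF2014 submission,
    per the rule described above: 2 for the first (up to) 14.99 FTE,
    then one more for each further band of up to 10 FTE
    (3 for 15-24, 4 for 25-34, and so on)."""
    if fte < 15:
        return 2
    return 3 + int((fte - 15) // 10)
```

This step structure is what makes ‘reverse engineering’ attractive: a unit sitting just above a threshold (say, 15.5 FTE) can drop a member of staff from the submission and owe one fewer case study.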
REF2014 also merged some Panels (reducing their number from 67 to 36) and created a new system of Main Panels and Sub-Panels. The former had the responsibility of overseeing the application of common criteria and practices by the different subject Sub-Panels under their remit. In this context, Sub-Panels functioned as an aggregate of markers of outputs, impact case studies and research and impact environments. They had little role in reflecting upon the overall pattern of outcomes, a role that was undertaken by the Main Panel (as in the previous RAE2008, when Main Panels were first introduced). There were extensive mechanisms and test exercises in place to ‘calibrate’ judgements across panel members and across Sub-Panels prior to the marking exercise beginning. However, very little was done to check that this calibration was maintained in practice and, it would seem, there was little checking of the results for consistency.
The social sciences submitted to Main Panel C and were organized as a number of ‘core’ disciplinary sub-panels (Economics, Politics and International Studies, Sociology) and ‘interdisciplinary’ applied areas (Social Work and Social Policy; Education; Business and Management Studies; Sport and Exercise Sciences; Leisure and Tourism), as well as ‘hybrid’ sub-panels where the ‘core’ subject was combined with its associated applied area (for example, Anthropology and Development Studies; Geography, Environmental Studies and Archaeology; and Architecture, Built Environment and Planning, with Law a somewhat separate case). An initial proposal to merge Sociology with Social Work and Social Policy was blocked by the latter rather late in the day.
In the analysis that follows, we will concentrate on the relation between the Sociology and Social Work and Social Policy Sub-Panels, although we will also present data from other Sub-Panels within Main Panel C. It is our contention that Main Panel C failed to maintain consistency in approach across the different sub-panels.
For a number of commentators, including one of the present authors (Holmwood 2010), there are particular concerns for sociology as a discipline in a context where Social Work and Social Policy is also available as an alternative subject panel for submission. Indeed, the number of UofAs submitting to the Sociology panel had declined from 67 in RAE1992, to 61 in RAE1996, 48 in RAE2001, 39 in RAE2008, with a further fall to 29 in REF2014. From 2008 to 2014 there was a reduction in FTE staff from 927 to 704. Social Work and Social Policy were only combined from 2008, with separate panels for 2001 and earlier (when there were 958 social policy staff in 47 submissions and 383 social work staff in 30 submissions), so a consistent time-series of submissions is not possible. There was a slight fall from 68 in RAE2008 to 62 in REF2014, partly because of multiple submissions by institutions in 2008 compared with 2014. There were 1243 FTE Social Work and Social Policy staff submitted in 2008, rising to 1302 in 2014.
This decline in Sociology submissions is largely accounted for by the withdrawal of those Units of Assessment that had appeared in the bottom part of the Sociology rank order in RAE2008. In consequence, Sociology submissions in REF2014 were disproportionately from research-intensive institutions that expect to do well in overall rankings of institutions across all their submissions (and, indeed, generally did so). Yet our analysis shows that Sociology did significantly worse than other social science disciplines and, even where UofAs ranked highly within the Sociology rankings, they did so with lower grade point average scores and grade profiles than other high-ranking subjects within their local institution. Social Work and Social Policy did considerably better and was one of the highest-ranking subjects overall, and not only among the social sciences, partly because of the treatment of its highest-rated submissions. This contrasts with its performance in RAE2001, when it was one of the lowest-performing subjects in the social sciences (see McKay 2006). Politics seems to have been the class laggard in the 2008 exercise, perhaps somewhat rectified in 2014.
It seems that there was ‘learning’ by Panels across exercises. Indeed, there was a general inflation of grades between RAE2008 and REF2014, which HEFCE justifies by arguing that international citations have increased across the same period. They also claim that there is consistency across Panels and Sub-Panels. The implication is that different scores represent real differences in ‘quality’ across the different subjects, something HEFCE sought to ensure through its calibration exercise. HEFCE is content with consistency being demonstrated through a comparison of the average proportion of 3* and 4* scores, without any discussion of whether there are any differences in the types of institutions submitted to different panels. Moreover, their analysis merges 3*/4* scores, and the real discrepancy is in terms of differences in allocating 4* scores. This is also the category that shows the biggest proportionate increase across the 2008 and 2014 exercises, rising by 42% for outputs (impact case study scores, of course, are new for 2014). It is also the category with the greatest financial return, with the general increase in scores meaning there is likely to be a further tightening of the formula and, possibly, even greater concentration toward 4* (that formula was originally 1:3:7 for 2*/3*/4* scores, later changed to 1:3:9 and then to 0:3:9).
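The effect of the successive tightenings of the funding formula can be illustrated with a small calculation. The two profiles below are hypothetical (ours, not drawn from REF data), but they show why each revision of the weights amplifies the financial return to 4* activity:

```python
def weighted_volume(profile, weights):
    """Quality-weighted score for a unit's (2*, 3*, 4*) grade profile
    under a given set of funding weights."""
    return sum(p * w for p, w in zip(profile, weights))

profile_a = (0.2, 0.5, 0.3)   # hypothetical unit with 30% at 4*
profile_b = (0.2, 0.3, 0.5)   # hypothetical unit with 50% at 4*

# The three successive formulas mentioned above, as (2*, 3*, 4*) weights.
for weights in [(1, 3, 7), (1, 3, 9), (0, 3, 9)]:
    a = weighted_volume(profile_a, weights)
    b = weighted_volume(profile_b, weights)
    print(weights, round(b / a, 2))  # funding advantage of the 4*-heavy unit
```

On these hypothetical profiles, the 4*-heavy unit's relative advantage rises from roughly 1.21 under 1:3:7 to roughly 1.29 under 0:3:9, which is why differential stringency in awarding 4* across Sub-Panels translates directly into differential funding.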
Despite this being a crucial issue for HEFCE oversight, the different Sub-Panels of Main Panel C have very different patterns with regard to scoring 4* output, with Social Work and Social Policy showing a significant increase, Politics and International Studies a moderate increase and Sociology very little increase (Anthropology and Development Studies seems to show grade deflation, but comparisons are harder to make because of the aggregation of Anthropology and Development Studies Sub-Panels compared with 2008).
The following charts show percentages of 4* outcomes for the top 20 UofAs in each of the Sub-Panels for Social Work and Social Policy, Politics and International Studies, and Sociology in RAE 2008 and REF 2014. We follow the RAE reporting convention of expressing results to the nearest 5%, and hence the ‘stepped’ nature of the 2008 lines.
Of course, we are not comparing like with like when we take the top 20 UofAs in each Sub-Panel. In the case of Social Work and Social Policy, that comprises about one third of the overall number submitted, compared with two thirds of the Sociology submissions. Equally, any other number (or indeed taking all) would also be somewhat arbitrary. But what matters is the slope of the curves in each case and especially the gap between the curves for 2008 and 2014.
These differences are unlikely to be accounted for by differences in types of publication or authorship practices. Data from REF2014 have only just become available, and early analysis shows that the proportion of monographs was lowest in Business and Management Studies and Sport and Exercise Sciences, each with just 1% of outputs as monographs, while the corresponding figure for Sociology was 13% and Social Policy and Social Work 9%, with Anthropology at 11% and Politics and International Studies at 18% (each showing a slight fall on 2008 from 17%, 12%, 15% and 22% respectively). Book chapters, as distinct from journal articles, were 9% for Social Policy and Social Work and Sociology alike, with Politics and International Studies at 10%, but 16% for Anthropology and 1% for Business and Management Studies and Sport and Exercise Sciences. 84% of Sport and Exercise Sciences and 80% of Business and Management Studies outputs were jointly authored, compared with 58% for Social Work and Social Policy, 40% for Sociology, 40% for Anthropology and 30% for Politics and International Studies (there were marginal increases in joint authorship in Sociology and Social Policy and Social Work, up from 56% and 37% respectively in 2008, and more significant increases for Anthropology and Politics and International Studies, up from 29% and 22% respectively).
HEFCE data also show that the average grade profile for Sociology and for Social Work and Social Policy is very similar. The data for Sociology show the following proportions at 4* for overall, outputs, impact, and environment (in that order): 27, 19.7, 43.2, 35.1. The proportions for Social Work and Social Policy are: 27, 19.4, 43.8, 36.9. But this is not comparing like for like, once we take account of the different institution profiles of the two submissions, with Sociology having a higher proportion of submissions from research-intensive institutions and, therefore, an expectation of higher scores (an expectation realized for other subjects at those institutions). Within Sociology, 13/29 submissions were from the Russell Group, covering 52% of staff; a further 7% were at Essex, surely a candidate for any ‘elite’ social science grouping (e.g. the top recipient of ESRC funding in 2013-14 by some margin). In Social Work and Social Policy, 15/62 submissions were from the Russell Group, covering 36% of staff. For Panel C the average was 48% of staff (ranging from 67% in Economics to 15% in Sport and Exercise Sciences, Leisure and Tourism).
The following graphs show another way of describing the differences across Subject Sub-Panels. Sociology (along with Anthropology) provides the most ‘concentrated’ distribution of grades when compared with other subjects.
Another way of looking at this is to calculate the Gini coefficients for a particular outcome, and we look at the overall proportion of 4* awarded. Economics and Econometrics is the most ‘inegalitarian’ at 0.41, followed by Social Work and Social Policy at 0.38, Law at 0.34, Politics and International Studies at 0.33, and Business and Management Studies at 0.32. Sociology is the ‘outlier’ at 0.22 (with Anthropology also very equal, at 0.20). Put simply, the problem is less that the scores are more concentrated than those of other Sub-Panels, since that might be expected given the concentration of research-intensive institutions, but that the scores should be shifted over to the right.
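For readers who want to check figures of this kind, the Gini coefficient can be computed from a list of units' 4* proportions with a standard formula (a sketch; the function here is ours, not the exact routine we used):

```python
def gini(values):
    """Gini coefficient of a list of non-negative values, via the
    standard sorted-rank formula:
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    where x_1 <= ... <= x_n. Returns 0 for perfect equality,
    approaching 1 as one unit holds everything."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n
```

Applied to the 4* proportions of a Sub-Panel's submissions, a higher value means the 4* scores are more unequally spread across units, which is the sense in which we call Economics ‘inegalitarian’ and Sociology an ‘outlier’.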
Insofar as there was ‘grade inflation’ in Sociology, it was at the lower end. Among the 16 UofAs with 20% or more at 4* in 2008 that were still submitting to this panel in 2014, the average change was just +1%. The comparable group for Social Work and Social Policy is smaller, at 10, but its average change was +19%. There was also more ‘churning’ of scores in Sociology: 5 of the 16 went down, while none of the similar group in Social Work and Social Policy shared this fate.
In REF2014 a number of submissions achieved ‘perfect’ scores for either impact or environment – specifically 100% at 4*. This was less common for Impact than for Environment, as, after all, the former was a new topic for REF panels. It seems that some panels were quite reticent about awarding these perfect scores, with others less so. Some 9.3% of Social Work and Social Policy staff worked within units that achieved 100% at 4* for impact, which was more than twice as high a proportion as any other submission to Panel C. Indeed, it was the highest of any REF unit of assessment (Area Studies was next at 7%; then down to 5.5%). For Sociology it was 2.8%. Our suggestion is that the ‘scoring tendencies’ evident with regard to outputs were carried over into other elements of the REF, such that the differential ‘inflation’ identified with regard to outputs between 2008 and 2014 is implicit in impact and other scores.
Within Panel C, 12.8% of staff worked in units with a perfect score for research environment. This ranged from close to one-third of staff in Education (a subject with a notoriously low ESRC success rate, at 4% compared with 23% for Sociology and for Social Policy, according to the ESRC Annual Report and Accounts for 2013-14), down to none within the Architecture sub-panel. Social Work and Social Policy again did rather well (23%) and Sociology again fared poorly (7%). All of LSE’s Social Work and Social Policy research environment is world-leading, but only half of its Sociology research environment. All of Oxford’s Social Work and Social Policy research environment is world-leading, but only 62.5% of its Sociology. Yet ESRC’s peer evaluation of research at end of grant in 2013-14 assigned rather better scores to Sociology than to Social Policy (11% outstanding, 46% very good, 39% good and 4% satisfactory for Sociology, with Social Policy scoring, respectively, 8%, 46%, 23% and 23%).
As we have already commented, submissions to the Sociology Sub-Panel declined from 39 to 29, with some UofAs deciding to submit to another Panel, most usually Social Work and Social Policy (though some sociologists were submitted to other Sub-Panels, or previous submissions were distributed across Sub-Panels). The consequence is that they are likely to be better rewarded for giving up a primary definition of themselves as Sociology, but only where they did better than the average (since lower-ranked UofAs tend to be scored less well by the more ‘inegalitarian’ Social Work and Social Policy Sub-Panel). Moreover, those UofAs that remained with the Sociology Panel and emerged as highly ranked by them are now likely to receive less funding than if they had submitted to Social Work and Social Policy (assuming that the funding formula remains unchanged or continues to favour 4* outcomes). The deeper paradox is that interdisciplinary, applied areas, such as Social Work and Social Policy (or Education for that matter), that use the theories and methods developed in their associated core disciplines, achieve better outcomes than those disciplines.
But it is not simply that high-ranking (for us, highly-rated) Sociology UofAs will receive less funding. They will also share in the reputational consequences for Sociology in general, as a relatively low-scoring subject area. So the top two Social Work and Social Policy submissions (using the criterion of proportion of 4* scores) are in the top 5 of all submissions (along with two Economics departments), while Manchester Sociology (1st in Sociology), for example, is 245th overall, further indicating a problem at Main Panel C.
HEFCE’s argument that there are no Sub-Panel variations and that grade inflation reflects changes in international standing is wrong once we disaggregate 4* from 3* and look more deeply at distributions and types of institutions submitting to each panel. If it goes unchallenged it will reinforce the impression of Sociology as a discipline of low international standing relative to other subject areas. This is significant, especially where the data derived from the REF are used increasingly by universities to manage ‘performance’, both in the aggregate and individually.
We do not believe that it is possible to achieve ‘calibrated’ measures of scoring items across Sub-Panels without supporting data and a process to check judgments at the end of the process as well as at the beginning. We do not believe that the issue is simply that the Sociology Sub-Panel marked ‘too harshly’, since that depends upon knowing how other Sub-Panels marked in practice, and that is only available at the end of the process. It is evident that the Sociology Sub-Panel (and that of Anthropology and Development Studies) produced outcomes that may be characterized as a form of strong ‘egalitarian solidarity’. However, the form that ‘solidarity’ took was problematic, and it was up to the Main Panel to secure comparability of results. It is worrying that it did not notice the very evident disparities. We emphasise that this is the role the system of Main Panels was supposed to provide. Our colleagues (and friends) on REF sub-panels clearly worked hard to try to deliver judgements against the relevant criteria, but they did so without direct access to the scoring ‘styles’ of other sub-panels.
In our opinion, the REF results reflect scoring differences that it was the role of the Main Panel to resolve. This, after all, was David Eastwood’s promise when Chief Executive of HEFCE: “I believe we can build a system in which action replays will generally show the REF got it right.” Our action replay suggests that the HEFCE ‘ref’ should have raised the flag! We might cite W. I. Thomas: what people believe to be real is real in its consequences. The REF 2014 outcomes are not ‘facts’ about the relative standing of the social sciences, but artifacts of a flawed exercise. The problem for sociology as a discipline arises if they are unchallenged in an increasingly instrumentalised academy that denies alternatives, one where, “… The simple facts of life are such/ they cannot be removed.”
CHORUS: “… As REF2014 goes by”
Stephen McKay is Distinguished Professor in Social Research in the College of Social Science at the University of Lincoln.
Originally posted 26th January 2015