Measuring the right things
Should we adjust the data to make it look fairer?
In the previous section I suggested that Ofsted was right to downplay the use of CVA data in coming to a judgment about how well pupils in a school perform. There is evidence to show that CVA can overcompensate for multiple factors and suggest that a school has very high value-added in relation to the perceived ’disadvantage’ of its pupils. I don’t know why this happens because I am not a statistician, but I can’t imagine for a minute that a simple mathematical formula could account for how multiple disadvantage factors could inter-relate under all circumstances.
It seems to me to be a good principle that we should always see the raw data first. If we wish to place some ‘adjustment’ on the raw data to compensate for an external effect then let us see this in addition so we can weigh it up for ourselves. We seem to have put aside our sense of estimation here in favour of a magic formula. It was magic because with the introduction of CVA so many schools suddenly looked much better than previously without apparently having done anything different.
CVA was a worthy attempt to recognise the work of schools working with high numbers of pupils “for whom success is harder to achieve” – to quote Dr. John Dunford on this subject. However, CVA was flawed both conceptually and mathematically and it is right that inspectors will place less emphasis on it when evaluating the work of schools. A study by Bristol University concluded that the CVA formula was “misleading, and at worst dishonest”.
In our work with Local Authorities, we have seen how the CVA formula often significantly overcompensated where there were multiple factors operating. At the time when CVA use was at its peak, the TES reported a very high correlation between the CVA scores of a school and the judgement on Leadership. Put simply, many headteachers of schools whose pupils would have done a lot better at other schools in the area received inflated judgements about their leadership. The problem here was an over-reliance on one piece of data.
So how should CVA measures be used?
The CVA score for a school is an indicator of the extent to which a school has groups of pupils for whom success will be harder to achieve. If we examine the groups which make up this calculation we will know which groups of pupils come from an area of high deprivation, which are from ethnic categories that at a national aggregate level do less well than other groups, and so on. These groups constitute the ‘specialism’ for the school, in the sense that the school will need to have developed approaches for ensuring that these pupils have every opportunity to succeed in spite of their inherent disadvantage. CVA now becomes very useful in pointing out where to look to see if the school is making this specialist provision, and then judging how effective it is.

Previously, CVA was being used as some sort of ‘excuse’: a way of saying to a school, ‘because you have disadvantaged groups we will raise your scores so we can compare you with schools that don’t have these groups’. This was not right, because it suggests that doing nothing to help these groups is acceptable. Now CVA becomes a challenge to schools to prove how they are specialising in ensuring that these groups do not underachieve, as national aggregated data suggests that they might.

This is the right approach, because there are many examples of schools which do bring success to their pupils despite the presence of background factors which may have made their journey more difficult. These schools provide the yardstick by which other schools should be measured. In fact, these schools should be the ones that appear at the very top of any league table of good schools – ahead of schools in leafy middle-class areas that may have far less to do to bring success to their pupils.
The TES of 13.08.10 reports that ‘ASCL wants the Government to set up a group of statistical researchers and school leaders to find ways to improve the CVA measure’. The main aim of this group, I would surmise, will be to seek measures that Ofsted can use to compare unlike schools more fairly. The most useful thing that a knowledge of the profile of a school population can provide is a composite picture of its inherent ‘specialisms’, as well as a measure of the degree of challenge that a school will face in achieving success compared to any other school. The Value Added to some groups of learners will be harder fought for than others.

It may be, too, that nationally few schools achieve comparable success with any particular group despite their best efforts. If this is the case then falling short should not be simplistically seen as a criticism of the school or the quality of its teaching force. It may, however, point to a need to ensure that such schools have the resources, specialist support and knowledge of best practice to take on such challenges with a higher expectation of success.

Currently, as I write, there is a furore on the other side of the Atlantic, where the US Government has fallen for the mistake of judging teachers (and even sacking some of them) on the basis of their students’ test scores. I would like to think that in the UK we are able to separate provision from outcome, and to judge the quality of each separately, whilst knowing that the relationship between the two is complex and certainly not constant when comparing schools working in very different circumstances.
(Since writing this in late 2010, the trend in the UK has been to continue to follow the US model in several respects. For example the rise in Academies has paralleled the increase in Charter schools. Importantly, although there has been an increase in focus on the importance of teaching, its evaluation is sensibly based on an established range of criteria that places emphasis upon its impact on the progress made by pupils.)
Attainment - the current measure of a school’s success
The threshold figure of 5 A-C grades at GCSE including English and maths signifies the level of attainment that qualifies pupils to continue to Level 3 courses and higher education. Achieving this threshold level of attainment is equivalent to issuing a passport for access to the next stage of education. Not achieving this target means that pupils are more likely to fall into the category of not in education, employment or training (NEET). It is for this reason that Attainment levels form the primary judgement about the accomplishments of schools. Around 1 in 9 pupils currently fall into the category of NEET. Making up the shortfall in qualifications after leaving school at the end of year 11 has a very low success rate, so it is really a one-off opportunity for most pupils. Although attainment is currently the main measure of a school’s success, many argue that this single measure insufficiently recognises the breadth of the work of schools. There is also a downside that is worthy of a section of its own: this monocular focus on attainment can distort the system, so markedly in some circumstances that it can compromise the value of the educational process. It is such an important issue that it is worth returning to in the next section.
How can Achievement be recognised?
Beyond the judgment about the percentage of 5 A-C grades lies a range of considerations about the quality of provision in that school. Attainment is shown in the exam results and levels gained by pupils. Achievement is a less precise concept: it brings into the analysis of a school’s standards a consideration of whether pupils’ attainment was good in terms of the progress made and the quality of school provision. Schools that get really good results in relation to the prior achievements and expectations of their pupils deserve particular recognition. But how will this work? In the 2009 evaluation schedule, ‘Achievement’ is said to be ‘likely to be good when attainment is at least average and progress is good’. This isn’t a precise definition as such. Inspectors are asked to find out “how well pupils make progress relative to their starting points, using contextual value added and other value added measures, including whether there is any significant variation between groups of pupils (for example, minority ethnic groups, groups with different prior attainment, gender groups, gifted and talented groups, pupils speaking English as an additional language), making clear whether there is any underachievement generally or among particular groups who could be doing better”.
Achievement will therefore be a subjective decision arrived at by weighing up a range of evidence, including attainment, progress and variation, and using a knowledge of what other schools achieve in similar circumstances. Even with just a four point scale for grading the judgment on achievement, there is going to be plenty of scope for argument over which side of the Good/Satisfactory border the judgment should lie.
A range of evidence gives a bigger picture
The focus for a judgment on Achievement will be what happens in the classroom, supported by evidence of progress over time and the quality of experience offered to pupils during their time at school. Ofsted provides (in the 2009 schedule) very clear guidance on judging the quality of learning and progress. We note that part of the Ofsted evaluation involves making a judgment about any ’significant variation’ between groups of pupils or ‘particular groups who could be doing better’. This is difficult for an observer who doesn’t know the pupils well. There will be information on the attainment of broad categories in the FFT data and RAISEonline, but a spot classroom observation is only likely to confirm in a general sense whether the learning is personalised and effective for all pupils. We need to judge schools on a range of evidence. One thing we haven’t put much emphasis on up to now is ‘Variation’, even though we know that it is ‘Education’s Biggest Challenge’ – according to the National College. We can often see differences in pupil performance, but schools don’t yet use a common, systematic, analytical approach to finding out which factors impact on the learning of any particular group of pupils.
Variation is the new CVA
The success of all pupils ought to be the starting point for assessing how good a school is; but the other important factor is whether school provision is consistent for all groups of learners. Knowledge about variation is an important quality assurance issue. Schools need to know about variation just as Marks and Spencer needs to know that all its sausages are of good quality. The National College had a two-year project on In-School Variation, and the TDA/NC produced an excellent paper called ‘Making effective practice standard practice’ which offers practical guidance on how schools can reduce negative variation.
Variation is the spread of results about an average. If a pupil gets A or B grades in all their subjects, but in one subject they get a D grade, then we would think that the child slipped up somehow. If we find that lots of pupils got a D grade in that same subject we might suspect that the teaching slipped up. Following up on variation in patterns of group performance is an emerging approach to school improvement, made possible by tools that provide more flexibility than crunching data in Excel.
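The ‘slipped up’ check described here can be sketched in a few lines of code. This is a minimal illustration, assuming a hypothetical grade-to-points mapping and a threshold of two points; neither is an official scale.

```python
# Sketch of the variation idea: flag subjects where a pupil's grade sits
# well below their own average. GRADE_POINTS and the default threshold
# are illustrative assumptions, not an official scale.
GRADE_POINTS = {"A*": 8, "A": 7, "B": 6, "C": 5, "D": 4, "E": 3, "F": 2, "G": 1}

def flag_slips(results, threshold=2):
    """Return subjects where the pupil's points fall `threshold` or more
    below their mean across all subjects."""
    points = {subject: GRADE_POINTS[grade] for subject, grade in results.items()}
    avg = sum(points.values()) / len(points)
    return [subject for subject, p in points.items() if avg - p >= threshold]

pupil = {"English": "A", "Maths": "B", "Science": "A", "History": "D"}
print(flag_slips(pupil))  # the D in History stands out against the A/B grades
```

The same calculation applied to every pupil in a subject is what turns a pupil-level slip into a teaching-level question.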
Variation provides the basis for an investigative approach to raising standards. The principle here is that a pupil demonstrating A-C range attainment in one subject has the potential, with the right teaching, to achieve in that range in any other of their subjects. A pupil attaining less than a ‘C’ grade in either English or maths, but ‘B’ or ‘C’ grades in other subjects, is falling short of that important qualifying target for entry to Level 3 courses and better future prospects. They have demonstrated that they can attain this success in other subjects, so it is fair to think that they could do so in English and mathematics too, with a bit of extra help.
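That reasoning can be expressed as a simple filter over a cohort's results. This is a sketch under the assumption that A*-C attainment elsewhere counts as evidence of potential; the grade labels and pupil data are illustrative.

```python
GOOD = {"A*", "A", "B", "C"}  # the qualifying A*-C range (illustrative labels)

def english_maths_targets(pupils):
    """Pupils below a C in English or maths, but within the A*-C range in
    all their other subjects: candidates for targeted extra help."""
    targets = []
    for name, grades in pupils.items():
        core_short = any(grades.get(s) not in GOOD for s in ("English", "Maths"))
        others = [g for s, g in grades.items() if s not in ("English", "Maths")]
        if core_short and others and all(g in GOOD for g in others):
            targets.append(name)
    return targets

cohort = {
    "Ana": {"English": "D", "Maths": "C", "Science": "B", "History": "C"},
    "Ben": {"English": "B", "Maths": "A", "Science": "B", "History": "A"},
}
print(english_maths_targets(cohort))  # Ana shows the range elsewhere but misses in English
```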
The importance of this as an approach is that it recognises that there is variability to be found on both sides of the Teaching-Learning equation. Most schools’ approach to raising achievement is based on looking at the variation in pupil performance; they assume that ‘teaching’ is a constant in the equation. The obvious response to this is to ask them first to prove that teaching is a constant, i.e. that pupils will do as well no matter which teacher they have in any subject. The aim of a school’s CPD programme is, of course, to ensure that this is the case. But does it succeed?
Managing the Performance Profile for the school
We can examine a ‘performance profile’ for the school by looking at the range of average grades achieved across a school. In examining this profile, there are two goals for schools.
One is to raise the curve for pupils whose grade total lies below the A-C threshold.
The other is to note how much the distribution of grades varies from one subject to the next. The distribution of grades gives a visual indicator of the relationship between teaching and learning. Low variation suggests a tight relationship between teaching and learning. High negative variation indicates that some groups of pupils are falling short in a subject compared to their performance in other subjects. High positive variation can indicate that the teaching is having the effect of ‘pulling up’ the achievements of some learners from their average. These are very generalised descriptions, however. Explaining these distribution patterns, and the metrics that accompany their analysis, is the key to recognising and improving group achievement.
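A crude version of such a performance profile can be computed by comparing each subject's mean points with the whole-school mean. The sketch below assumes the same hypothetical points scale as before and results held as (pupil, subject, grade) tuples.

```python
from statistics import mean

# Illustrative grade-to-points mapping, not an official scale.
GRADE_POINTS = {"A*": 8, "A": 7, "B": 6, "C": 5, "D": 4, "E": 3, "F": 2, "G": 1}

def subject_profile(results):
    """results: iterable of (pupil, subject, grade) tuples.
    Returns {subject: (mean points, deviation from whole-school mean)} --
    a rough picture of which subjects sit above or below the profile."""
    by_subject = {}
    for _pupil, subject, grade in results:
        by_subject.setdefault(subject, []).append(GRADE_POINTS[grade])
    overall = mean(p for pts in by_subject.values() for p in pts)
    return {s: (round(mean(pts), 2), round(mean(pts) - overall, 2))
            for s, pts in by_subject.items()}
```

A subject with a strongly negative deviation is exactly the prompt for the investigative follow-up described above, not a verdict in itself.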
The answers to school improvement don’t always lie in the big data sets
Schools that largely leave the evaluation of their performance to Ofsted are the ones that will be least sure about what an inspection report will say. The closer that a school’s systems for self-evaluation align with those that Ofsted uses, the more certain it can be about how it would be judged to be doing. But Ofsted’s data on a school is based on a comparison with large, remote, aggregate data sets, and RAISEonline data will form the main basis for an inspection report. Under the previous inspection format, which involved limited time in school, the TES found a very close relationship between a school’s RAISEonline data and the main judgments in its Ofsted report. Now that inspectors will spend more time in school, that should change. The extra time is due to be focused on seeing what happens in lessons, but it should also be spent looking at what the school believes about itself, and the extent to which it takes responsibility for its own standards.
Local contextual circumstances form the basis for research
Schools are sitting on the most important source of information about school effectiveness.
This is the ‘local contextual school information’ about pupils - as opposed to the broad, national comparative data which Ofsted uses. Local contextual information is what is known, or can be found out, about factors which influence the learning of groups or individual pupils. Furthermore, unlike the Ofsted categories (SEN, ethnicity, etc) which don’t necessarily change, local contextual circumstances can be influenced, once we know about their effects. This has been the basis for developing our toolkit for school leaders. Examples of local contextual circumstances are things like: pupils who attend the breakfast club, pupils who went on a vocational study visit, pupils from classes which do ‘Brain Gym’ exercises, pupils from families that have books on display at home, summer-born pupils, left-handers, pupils whose parents never attend parents’ evenings, pupils who went on the History Battlefields trip in Y10, and so on. The tools that we have developed support teachers undertaking research into the influence of local contextual circumstances upon the performance of pupils, by allowing such groups to be defined and their relative performance viewed across different subjects and classes.
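Our toolkit itself cannot be reproduced here, but the underlying comparison is straightforward. This sketch assumes results held as (pupil, subject, points) tuples and a locally defined group such as breakfast-club attendees; all names and data are illustrative.

```python
from statistics import mean

def group_gap(results, group):
    """results: iterable of (pupil, subject, points); group: set of pupils
    in some locally defined contextual category. Returns, per subject, the
    gap between the group's mean points and everyone else's."""
    in_group, out_group = {}, {}
    for pupil, subject, points in results:
        bucket = in_group if pupil in group else out_group
        bucket.setdefault(subject, []).append(points)
    # Only report subjects where both the group and the rest have results.
    return {s: round(mean(in_group[s]) - mean(out_group[s]), 2)
            for s in in_group if s in out_group}
```

A consistent negative gap for a group across subjects is a prompt for investigation rather than a conclusion; group sizes in a single school are small, so any pattern needs following up in the classroom.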
Schools that promote their own teacher-led research into factors which influence teaching and learning, and monitor the impact of intervention, will be one step ahead of Ofsted in having high quality, up-to-date, performance information. Significantly, the hard data that Ofsted have on a school is mostly retrospective, and based on the previous year’s Y11 pupils. Ofsted doesn’t have evidence on the current year 11 performance except that which they collect during the brief time that they are in school. Yet evidence about current standards can be much more important than evidence about last year’s standards, especially in an improving school. A school that provides high quality standardised information about the current performance of Y11, especially about progress and variation, is producing evidence that will be influential on the judgments that inspectors will make about current standards and leadership.
I will next write some thoughts on measuring learning and how data systems can provide a window into that mechanism. But the problems caused by setting high targets for attainment, and the distortion that can result from this, is also an issue that needs to be returned to.