In this blog last week I explored the (rather flimsy) evidence base available to the developers of the original Troubled Families Programme (TFP) and the potential for “theory of change” approaches to provide useful insights in developing future policy. This week I return to the formal TFP evaluation and look at the lessons we can learn in terms of the timing and data quality issues involved.
The first secret of great evaluation: timing
The experience of the last Labour Government is very instructive here. New Labour were strong advocates of evidence-based policy making, and in particular were committed to extensive use of policy evaluation. Evaluated pilots were completed across a wide range of policy areas, including welfare, early years, employment, health and crime. This included summative evaluations of pilots' outcomes and formative evaluations whilst the pilots were underway, attempting to answer the questions "Does this work?" and "How does this work best?"
Ian Sanderson provided a useful overview of Labour's experience at the end of its first five years in power[i]. He found that one of the critical issues in producing great evaluations (as for great comedy) is timing. Particularly for complex and deep-rooted issues (such as troubled families), it can take a significant time for even the best programmes to have an impact. We now know the median time a family remained on the TFP was around 15 months.
It can also take significant time for projects to reach the "steady state" conditions under which they would operate when fully implemented. Testing whether there are significant effects can require long-term, in-depth analysis. This doesn't fit well with the agenda of politicians or managers looking to learn quickly and sometimes to prove a point.
Nutley and Homel’s review[ii] of lessons from New Labour’s Crime Reduction Programme found that “projects generally ran for 12 months and they were just starting to get into their stride when the projects and their evaluations came to an end” (p.19).
In the case of the Troubled Families Programme, the programme started in April 2012, and most of the national data used in the evaluation relates to the 2013-14 financial year. Data on exclusions covered only those starting in the first three months of the programme, whereas data on offending, benefits and employment covered families starting in the first ten months of roll-out.
We know that 70% of the families were still part-way through their engagement with the TFP when their “outcomes” were counted, and around half were still engaged six months later.
It’s now accepted by DCLG that the formal evaluation was run too quickly and for too short a time. There just wasn’t time to demonstrate significant impacts on many outcomes.
The second secret: data quality
Another major element of effective evaluation is the availability of reliable data. Here the independent evaluation had an incredibly difficult job to do. The progress the evaluators made is impressive – for the first time matching a wide range of national data sets, local intelligence and qualitative surveys. But ultimately the data underpinning the evaluation is in places of poor quality.
The evaluation couldn’t access data on anti-social behaviour from national data sets, as this is not recorded by the police. This is unfortunate given that the strongest evidence on the effectiveness of TFP-like (Family Intervention) programmes in the past concerns reducing crime and anti-social behaviour[iii].
A substantial portion of the data came from the 152 local authorities. This data was more up to date (October 2015), although only 56 of the councils provided data – which enabled matching for around one quarter of TFP families. The evaluation report acknowledges that this data was "of variable quality". For example, the spread of academy schools, which have no duty to co-operate, meant there are significant gaps in school attendance data. This will be a serious problem for future evaluations unless academies' engagement with the wider public service system is assured.
In summary, the TFP evaluation covered too short a period and, despite heroic efforts by DCLG and the evaluators, was based on data of very variable quality and completeness.
Next time we will explore the "impact" evaluation in more detail – looking at how designing a more experimental approach into this and future programmes could yield more robust conclusions about what works where.
[i] Sanderson, Ian. “Evaluation, policy learning and evidence‐based policy making.” Public administration 80.1 (2002): 1-22.
[ii] Nutley, Sandra, and Peter Homel. “Delivering evidence-based policy and practice: Lessons from the implementation of the UK Crime Reduction Programme.” Evidence & Policy: A Journal of Research, Debate and Practice 2.1 (2006): 5-26.
[iii] DfE, “Monitoring and evaluation of family intervention services and projects between February 2007 and March 2011”, 2011, available at: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/184031/DFE-RR174.pdf
Jason Lowther is a senior fellow at INLOGOV. His research focuses on public service reform and the use of “evidence” by public agencies. Previously he led Birmingham City Council’s corporate strategy function, worked for the Audit Commission as national value for money lead, for HSBC in credit and risk management, and for the Metropolitan Police as an internal management consultant. He tweets as @jasonlowther