Research analysis

Therapist effectiveness: implications for accountability and patient care.

Kraus D.R., Castonguay L., Boswell J.F. et al.
Psychotherapy Research: 2011, 21(3), p. 267–276.
1 in 6 US therapists (mainly not specialising in substance use) typically ended up with clients whose substance use problems were significantly worse than when they started therapy, an indication perhaps that social workers and mental health counsellors find these issues especially hard to deal with.

Summary

Some counsellors and therapists achieve on average outstanding results, while others leave many patients worse than when therapy started. While this is known to happen, how great this variation is in normal practice and what proportion of therapists fit in these categories is unclear. Across all types of patients including those being treated for substance use, the featured study is the first to assess the pervasiveness of positive versus harmful therapist effects in normal practice. It also asks whether therapists tend to be good/bad across the board, or have strengths or weaknesses with respect to some types of problems but not others.

The data for this study came from clinicians or clinics who had contracted an outcomes management company to process assessment and outcome data from patients as a way of monitoring their performance. From this dataset were selected records on adult outpatients which included a standard pre-therapy assessment of wellbeing and functioning in 12 domains: work functioning, sexual functioning, social conflict, depression, panic (somatic anxiety), psychosis, suicidal ideation, violence, mania, sleep, substance abuse, and quality of life. The records also had to include a repeat assessment near the sixteenth week of treatment, by when most improvements will normally have become apparent. Of these 15,217 patients seen by 3222 therapists, the sample was further limited to therapists with at least 10 patients and to just the first 10 patients of each. The final dataset included 6960 patients and 696 therapists, the latter being quite similar to the larger group of 3222 clinicians. Both patients and therapists were mainly female and the therapists were primarily social workers and mental health counsellors. Just 5% were licensed drug and alcohol counsellors.

For each patient it was calculated whether in each of the 12 domains they had reliably improved (ie, more than could be attributed to assessment error), reliably deteriorated, or were somewhere in between, neither definitely improved nor definitely worse. These patient progress assessments were then used to assess the therapists. In each domain, 'effective' therapists were defined as those whose average patient reliably improved on that measure, 'harmful' therapists as those whose average patient reliably deteriorated. In between ('unclassifiable/ineffective') were those whose patients on average neither improved nor deteriorated or who had too few patients with such problems for an assessment to be made.
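As a rough illustration of the two-step classification described above, the following Python sketch first derives a threshold for 'reliable' change and then labels a therapist by their average patient. It assumes the Jacobson-Truax reliable change criterion as a stand-in, since this summary does not state the formula the study used. The function names, the sign convention (positive change means improvement) and the `min_patients` cut-off are all hypothetical.

```python
import math
from statistics import mean


def reliable_change_threshold(sd_baseline, reliability, z=1.96):
    """Smallest change unlikely to be due to assessment error alone.

    Assumption: Jacobson-Truax criterion, not confirmed as the study's method.
    """
    se_measurement = sd_baseline * math.sqrt(1 - reliability)
    se_difference = math.sqrt(2) * se_measurement
    return z * se_difference


def classify_patient(change, threshold):
    """Label one patient's pre-to-post change (positive = improvement)."""
    if change >= threshold:
        return "reliably improved"
    if change <= -threshold:
        return "reliably deteriorated"
    return "no reliable change"


def classify_therapist(patient_changes, threshold, min_patients=5):
    """Label a therapist in one domain by their average patient.

    min_patients is a hypothetical cut-off; the summary says only that
    therapists with too few relevant patients could not be assessed.
    """
    if len(patient_changes) < min_patients:
        return "unclassifiable/ineffective"
    average = classify_patient(mean(patient_changes), threshold)
    if average == "reliably improved":
        return "effective"
    if average == "reliably deteriorated":
        return "harmful"
    return "unclassifiable/ineffective"


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the study.
    threshold = reliable_change_threshold(sd_baseline=10, reliability=0.9)
    print(classify_therapist([10, 12, 9, 11, 8], threshold))
    print(classify_therapist([-10, -12, -9, -11, -8], threshold))
```

Under this scheme a therapist counts as harmful in a domain only when the mean change across their patients falls beyond the error margin, which is the stricter of the two deterioration criteria the authors report.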

Main findings

Of the 696 therapists, the proportion assessed as effective (ie, their average patient reliably improved) ranged from a low of 29% in treating sexual dysfunction to a high of 67% in treating symptoms of depression. Exactly half were effective in treating substance abuse. In contrast, at 16% substance abuse (along with violence) topped the ranking of the proportion of therapists assessed as harmful. Bottom of the range at 3% was treating depression. In the treatment of substance abuse, therapists overall achieved on average a medium degree of improvement in their patients (an effect size of 0.47), but effective therapists on average recorded a very large positive effect (effect size of 1.14) and harmful therapists a large negative effect (effect size of -0.98).

Effect size is a standard way of expressing the magnitude of a difference (eg, between outcomes in control and intervention groups) applicable to most quantitative data. It enables different measures taken in different studies to be compared or (in meta-analyses) combined, and is based on expressing the difference in the average outcomes between control and experimental groups as a proportion of how much the outcome varies across both groups. The most common statistic used to quantify this difference is called Cohen's d. Conventionally this is considered to indicate a small effect when no greater than 0.2, a medium effect when around 0.5, and a large effect when at least 0.8.
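The effect size definition given above can be sketched in Python as Cohen's d with a pooled standard deviation. This is a minimal illustration of the general statistic, not the study's own calculation, which this summary does not describe. The function names are hypothetical, and matching a value to the nearest conventional benchmark (0.2, 0.5, 0.8) is an assumption made here for illustration.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)


def nearest_benchmark(d):
    """Match |d| to the closest conventional benchmark.

    Assumption: nearest-benchmark matching is an illustrative choice,
    not a rule taken from the study.
    """
    benchmarks = {"small": 0.2, "medium": 0.5, "large": 0.8}
    return min(benchmarks, key=lambda label: abs(abs(d) - benchmarks[label]))


if __name__ == "__main__":
    # Effect sizes reported for substance abuse in the featured study.
    for d in (0.47, 1.14, -0.98):
        print(d, nearest_benchmark(d))
```

The sign carries the direction of change, so the harmful therapists' value of -0.98 is as large in magnitude as a conventionally 'large' benefit, but in the direction of deterioration.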

The next question addressed was whether therapists who were effective (or the reverse) in one domain also tended to be the same in others. Generally this was only modestly the case; often therapists excelled at relieving one type of problem but failed with others. With respect to substance abuse, the correlation between how high a therapist ranked in this domain and how they ranked in others ranged from near zero up to (for suicidal ideation) a modest 0.24, including just 0.11 for quality of life and 0.10 for work functioning.
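To give a feel for how weak these associations are, a correlation can be squared to give the proportion of variation two rankings share. The short Python sketch below applies this standard conversion to the correlations quoted above. The function name is hypothetical, and the calculation is offered as an aid to interpretation rather than as the analysis the authors ran.

```python
def shared_variance_percent(r):
    """Percentage of variation shared by two measures correlated at r."""
    return round(100 * r ** 2, 1)


# Correlations with substance abuse rankings quoted in the text.
reported = {
    "suicidal ideation": 0.24,
    "quality of life": 0.11,
    "work functioning": 0.10,
}

if __name__ == "__main__":
    for domain, r in reported.items():
        print(f"{domain}: r = {r:.2f}, shared variance about "
              f"{shared_variance_percent(r)}%")
```

Even the highest correlation, 0.24, corresponds to only about 6% of shared variation, so knowing how a therapist ranks on substance abuse says little about how they rank elsewhere.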

The authors' conclusions

On average, the findings from this study suggest that therapists are quite effective, but these global findings mask tremendous variability in therapist performance and in the symptom types they effectively address. In particular, the data indicate that harmful therapists are more widespread than previously thought. Depending on the symptom type, the average patient of 11% to 38% of therapists ended initial treatment worse off (but some within the margin of measurement error) than when they started, including 20% whose average patients left more suicidal and 36% more violent. On the more stringent criterion of reliable deterioration beyond the margin for error, again depending on the symptom type (substance abuse and violence topped the list), the average patient of up to 16% of therapists ended initial treatment significantly worse, justifying the label 'harmful' in these cases.

The study also found preliminary evidence that therapist effectiveness is not a global construct; therapists skilled in one domain may be harmful in another. Just 1% to 9% of the variation in how therapists rank in each domain can be accounted for by their global competence. The bulk of the variation between therapists is symptom-specific. No therapist in this study was found effective in every clinical domain.

These findings can be set against the common finding that therapists overestimate their performance. The contrast suggests that standards of ethical practice may require therapists routinely to measure their outcomes and focus their practices where they are most likely to succeed, or obtain supervision or continuing education to improve in weaker areas. From the patient's point of view, an ideal system would enable them to find appropriate therapists not just in terms of gender, ethnicity or other currently used variables, but also their track record of helping patients with similar issues.

The findings also have important training implications, including the provision of regular and systematic feedback to students and trainees about their impact on different aspects of their clients' functioning. Solid evidence that clients tend to get worse on specific aspects of their functioning should prompt the trainee and their supervisor to consider remedial strategies.

The importance of these implications mandates an awareness of the limitations of the study on which they were based. Notably, rather than being based on a random national sample of therapists, the contributors were a convenience sample who were concerned enough (or whose employers were) about being aware of their performance to pay for their client outcome data to routinely be analysed.

Findings commentary

Importantly this study found, not that a high proportion of therapists were globally harmful, but that a substantial minority had patients who got worse in some areas of their wellbeing or functioning. Though the featured study was unable to pinpoint what made some therapists counterproductive, this issue has been addressed by experts including some of the authors of the featured study. Among the candidates are inflexible application of guidelines and techniques, inappropriate use of techniques which arouse anxiety or resistance, lack of awareness of when things are going wrong and of insight into the causes, inadequate familiarisation with the client's strengths and vulnerabilities, and failure to establish a solid therapeutic relationship. Given the therapist's injunction to above all do no harm, the training implications are profound, and given findings that some therapists are simply unsuitable by nature, so too are the implications for staff recruitment and retention. As the authors acknowledge, such implications mandate a thorough probe of whether the findings can be relied on. While there are the concerns detailed below, these do not undermine the study as an indicative if not definitive assessment of the extent of harmful practice, nor the implications the authors draw for therapist selection and training.

At 10% to 15%, previous estimates of deterioration among clients seeking help with substance use problems exceeded the 3% to 10% range reported for psychotherapy in general. In the featured study, deterioration in substance use problems was one of the two most common ways therapists seemed to harm their patients, a finding which may reflect lack of training and difficulties faced by some generic mental health and social work practitioners in addressing substance use. Non-specialist workers vary in effectiveness even in the very brief encounters characterising alcohol or drug interventions with people not seeking help at all, but identified through screening in general medical services or by other methods (see for example: 1 2 3).

It is also well established that specialist substance use counsellors and therapists differ in effectiveness. Some of the reasons for these differences have been explored in Findings' Manners Matter series, devoted to the importance of sensitivity, helpfulness, and the systematic implementation of a personal, welcoming response. Since those reviews the most wide-ranging investigation ever of the organisational health of British treatment services has found that staff working in an atmosphere of support, respect, and concern for their development, tended to have clients who also felt understood, respected, supported and helped – a finding which supports the possibility that the featured study might partly reflect organisational variation, not just differences between individual therapists.

Also from the UK, another study has thrown up the intriguing possibility that non-conformist drug workers who value hedonism and stimulation help marginalised problem drug users most because their values match those of their clients. In line with the featured study's finding that therapists are often good with some problems but not others, it seems likely that such workers, while doing well with drug use problems and clients, would not so readily help more socially conventional mental health clients.

Do the findings stand up to scrutiny?

The representativeness of the samples of patients and therapists is the major limitation of the study as a barometer of the national US picture. Only the records of patients who stayed in treatment for around 16 weeks were included in the analysis, the results of which might conceivably have differed if early drop-outs had been included. The major way in which the 3222 therapists in the starting sample were whittled down to 696 was the elimination of those with fewer than ten recorded patients. Though the retained sample of therapists seemed generally typical of the full sample, it seems likely that there were some systematic differences which meant they saw more patients, retained them for long enough for them to be included in the analysis, or were more diligent in documenting these patients than the other nearly four fifths of therapists not retained in the analysis. But these influences seem most likely to have resulted in an under- rather than an over-estimation of harmful practice.

The fact that deterioration was not uniform across all symptom groupings raises the question of the importance of these issues to the patient. There is no indication in the study of which were the problems which led the patient to seek help and/or constituted their primary diagnosis, or which were less focal and severe issues on which some deterioration might be considered a price worth paying. Similarly, it is not known whether the problem areas therapists failed on were those they were employed and/or aiming to deal with. It would, for example, be of great concern if the patients of a substance misuse counsellor – presumably seeking and expecting help with severe substance use problems – generally got even worse in their substance use problem scores, but perhaps less of a concern if this happened with depressed people seeking mental health counselling and whose substance use, though more of a problem than before, remained unremarkable, especially if at the same time their core concern had been effectively treated. The caveat that rather than therapy causing deterioration, some of the patients might have got worse (and perhaps more so) even without therapy does not explain why deterioration in some aspects of their patients' welfare was characteristic of some therapists. It can also be countered by the speculation that some clients who improved might have done even better without therapy.

Another limitation is that the categorisation of therapists was made on the basis of just ten patients each, though in many cases this will have been most or all of their relevant recorded caseload. It is also unclear whether the findings reflected variation between therapists, or variation between clinics or whole service-provider organisations. To some degree they may bear witness to the impact of excellent versus poor management and therapeutic environments rather than excellent versus poor therapists. Finally, the 'outcomes' assessed by the study were in-treatment progress rather than sustained post-treatment recovery.

Another concern is that the lead author is or was associated with a company which stands to benefit from the monitoring implications of the findings. However, similar implications have been drawn (1 2) by independent academics and therapists. The implication that the performance of psychotherapists, and with it the welfare of their patients, would benefit from in-treatment feedback on how the client is doing and on their relationship with the therapist has been confirmed in studies which have randomly allocated patients to feedback-based versus non-feedback-based therapy. Benefits were most apparent in preventing patients who were doing poorly from going on to end up significantly worse than when they started; in the featured study's terms, this also prevented their therapists from falling into the 'harmful' category.

Last revised 19 October 2011

