Quality concerns with routine alcohol screening in VA clinical settings.

Bradley K.A., Lapham G.T., Hawkins E.J. et al.
Journal of General Internal Medicine: 2011, 26(3), p. 299–306.
In the US health care service for ex-military personnel, 61% of patients who screened positive when sent a postal survey did not do so when the same questions were asked by their clinics, casting doubt on the validity of the test in routine practice in a service where the emphasis was more on the quantity than the quality of screening.

Summary The US 'VA' health care service for ex-military personnel implemented routine screening for risky drinking in 2004, and since 2006 has required that the Alcohol Use Disorders Identification Test-Consumption Questions (AUDIT-C) be used for screening. In the UK the questions are:
How often do you have a drink containing alcohol?
How many units of alcohol do you drink on a typical day when you are drinking?
How often have you had [6 or more units if female; 8 or more if male] on a single occasion in the last year?
How this is done is left to individual facilities or networks (eg, at triage, by primary care providers, paper questionnaires, etc). Over 90% of VA outpatients nationwide are screened with the AUDIT-C.

AUDIT-C has been validated when administered by interviewers and when completed on mailed questionnaires and the results shared with primary care providers, but little is known about its performance when implemented in routine clinical care. The featured study aimed to evaluate the quality of screening in the VA from 2006–2008 by taking advantage of the overlap between data from randomly sampled outpatients including their AUDIT-C screening results from their clinics, and that provided by patients who after an outpatient visit responded (55% did) to a confidential patient satisfaction survey which included the same questions.

Altogether 6861 patients (17% women) had AUDIT-C scores from both sources not more than 90 days apart, making it possible to see if they corresponded or were 'discordant', meaning that one screen indicated alcohol 'misuse' (defined for the study as scores of five or more out of 12, the threshold for offering counselling in the VA system from October 2007) but the other did not. Incidentally, the study was also able (among other things) to assess whether a new version of the computerised prompt which most services used to promote screening made any difference to its accuracy; from January 2008, this prompted providers to ask the screening questions exactly as written, non-judgmentally, and in a private setting.
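The definition of discordance above can be illustrated with a minimal sketch. The threshold of five out of 12 comes from the study as summarised here; the function names and labels are hypothetical and are not taken from the study's own analysis.

```python
# Illustrative sketch only: names and labels are assumptions, not the study's code.
MISUSE_THRESHOLD = 5   # score of five or more counted as 'misuse' in the study
MAX_SCORE = 12         # AUDIT-C scores run from 0 to 12


def is_misuse(score: int) -> bool:
    """Return True if an AUDIT-C score meets the study's misuse threshold."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError(f"AUDIT-C score must be between 0 and {MAX_SCORE}")
    return score >= MISUSE_THRESHOLD


def classify_pair(clinic_score: int, survey_score: int) -> str:
    """Compare one patient's clinic and survey screens."""
    clinic_positive = is_misuse(clinic_score)
    survey_positive = is_misuse(survey_score)
    if clinic_positive == survey_positive:
        return "concordant"
    if survey_positive:
        return "discordant: survey positive only"
    return "discordant: clinic positive only"


print(classify_pair(clinic_score=3, survey_score=7))
```

A patient scoring 3 at the clinic and 7 on the survey falls on opposite sides of the threshold, so the pair is discordant in the direction the study found most common.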

Main findings

The key finding was that 61% of patients who screened positive when sent a postal survey did not do so when the same questions were asked as part of their routine care at their VA clinics, part of a pattern of results which indicated that patients were more likely to acknowledge heavy drinking when not in front of their doctors or nurses and when assured of confidentiality.

Discordance rates among patients screening positive

Though just 8% of patients had discordant scores in the two AUDIT-C sources, this was because the majority (87%) scored below the alcohol misuse threshold in both. Altogether 765 patients gave answers indicative of alcohol misuse in the confidential survey but just 390 when questioned at their VA clinics. Of the 765 confidential-screen alcohol misusers, 468 gave sufficiently different answers at their clinics to score below the misuse threshold. Assuming their confidential scores were accurate, this means clinic screening would have failed to identify 61% as possibly in need of counselling. In contrast, over three quarters (297 of 390) of the patients who answered as misusers at their clinics also did so on the confidential surveys.
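As a check on the arithmetic, the counts quoted above are enough to rebuild the full two-by-two table of clinic against survey results. The sketch below uses only figures given in this entry (6861 patients, 765 survey positives, 390 clinic positives, 468 survey-positive but clinic-negative); the variable names are illustrative.

```python
# Counts as reported in this entry; the remaining cells are derived from them.
total = 6861
survey_pos = 765
clinic_pos = 390
survey_pos_clinic_neg = 468

both_pos = survey_pos - survey_pos_clinic_neg          # positive on both screens
clinic_pos_survey_neg = clinic_pos - both_pos          # positive at clinic only
discordant = survey_pos_clinic_neg + clinic_pos_survey_neg
both_neg = total - both_pos - discordant               # negative on both screens


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


print(f"Both positive: {both_pos}")
print(f"Missed by clinic screening: {pct(survey_pos_clinic_neg, survey_pos):.0f}%")
print(f"Clinic positives confirmed by survey: {pct(both_pos, clinic_pos):.0f}%")
print(f"Discordant overall: {pct(discordant, total):.0f}%")
print(f"Negative on both: {pct(both_neg, total):.0f}%")
```

The derived cells (297 positive on both, 93 positive at the clinic only, 6003 negative on both) reproduce the reported 61%, 'over three quarters', 8% and 87% figures.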

Another indication that the survey recorded more drinking than was acknowledged at the clinics was that over a fifth (22%) of the patients who told their clinics that they had not drunk at all in the past year said they had drunk in the confidential survey. Again, at 9% the reverse (survey non-drinkers telling their clinics that they had drunk) was much less common.

The pattern of discordance from the clinic screen results looked like the result of random or normal fluctuations, but discordance from survey results was much more common among the heavier drinkers, taking the form of their not scoring as misusers at their clinics. Even among the heaviest drinkers according to their survey answers, 47% did not score as misusers at all at their clinics, meaning that each must have given answers there which resulted in AUDIT-C scores at least four and as many as eight (out of 12) points lower.

In some local VA networks none of the patients who scored as misusers in the confidential survey did so at their clinics; in others the proportion who did not was lower, ranging down to 43%. The revised prompt from January 2008 for clinicians to ask screening questions verbatim, non-judgmentally, and in private did not result in any statistically significant improvements in concordance with survey results. When all the factors available to the study had been taken into account, clinic results differing from survey results was significantly more likely among patients who scored as misusers on the survey, were black, or were patients at some local VA networks rather than others. Not influential were age, gender, socioeconomic variables, and mental health diagnoses. Not significant was whether the survey had come before or after clinic screening, suggesting that 'natural' resolution of heavy drinking could not account for the findings.

The authors' conclusions

Results indicate that validated questionnaires like AUDIT-C do not of themselves ensure the quality of clinical screening; quality should still be monitored. The VA's targets for rates of alcohol screening create incentives for documenting screening results, but do not provide incentives for high quality screening that identifies patients in need of advice. Setting very high target rates could contribute to lower quality by encouraging providers to document screening when they did not have the time to ask screening questions verbatim and in private. Future research must address the need for performance measures which not only incentivise providers to screen, but also to identify alcohol misuse.

In particular, 61% of patients who screened positive for alcohol misuse on a mailed survey screened negative when screened clinically, despite use of the same validated screening questionnaire. Some discordance is to be expected due to differences in settings and methods and normal fluctuations. However, the observed discordance cannot be accounted for by these factors.

Desire to present oneself in a good light socially, avoid perceived stigma, and/or to avoid discussing drinking with one's health carers, probably accounted for some of the discordance in the form of minimising one's drinking. Over twice as many patients who screened positive in a confidential mailed survey had discordant results compared to patients with positive clinical screens (61% v. 24%). If the latter is an indication of the 'expected' discordance rate, then the extra 37% discordance is an indication of how many more patients present themselves as moderate or non-drinkers when facing their clinicians or when they know clinicians will see their answers.
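The comparison of the two discordance rates can be worked through as a short sketch. The 468 of 765 figure is quoted in the findings; the 93 of 390 figure is derived here as the 390 clinic positives minus the 297 who were also positive on the survey. Function and variable names are illustrative.

```python
def discordance_rate(discordant: int, positives: int) -> float:
    """Percentage of screen-positive patients whose other screen was negative."""
    return 100 * discordant / positives


# Survey positives who screened negative at their clinics.
survey_rate = discordance_rate(468, 765)
# Clinic positives who screened negative on the survey (390 - 297 = 93, derived).
clinic_rate = discordance_rate(390 - 297, 390)

excess = survey_rate - clinic_rate   # percentage points beyond the 'expected' rate
ratio = survey_rate / clinic_rate

print(f"Survey-positive discordance: {survey_rate:.0f}%")
print(f"Clinic-positive discordance: {clinic_rate:.0f}%")
print(f"Excess discordance: {excess:.0f} percentage points")
print(f"Ratio: {ratio:.1f}")
```

On these counts the gap is about 37 percentage points and the ratio is roughly 2.6, consistent with the authors' 'over twice as many'.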

That black patients were more likely than other groups to present 'better' at the clinic than in the survey might reflect greater social desirability bias among these patients, although it could also reflect bias due to differences in the way AUDIT-C was interpreted and/or administered across racial/ethnic subgroups. Variation in discordance across VA networks suggests that institutional factors also played their part. Anecdotally, there is considerable variability across sites in the privacy of screening. Differences in training and/or decisions about who conducts screening might also contribute to variability. Patients who returned the VA's outpatient satisfaction survey and could be included in the featured study were older and drank less than non-respondents. Since discordance was greatest among alcohol misusers, this means that across all patients the proportion of misusers not identified by clinic screening may be even greater than the 61% recorded in this study.

Commentary As reviewers have observed, research has largely evaluated screening in highly controlled studies when research staff administer and score the evaluated screening test and the test against which it is being benchmarked. In contrast, we know comparatively little about screening in real-world clinical settings. This study takes us further down that road, and implies that previous studies may have been falsely reassuring about the performance of screening tests in routine practice, in particular how well they identify risky drinkers when their answers could (or are thought to) have consequences for the respondent which they wish to avoid. Regardless of which was the more accurate set of answers to AUDIT-C, the results remind us that the context in which responses are made can have a great influence and should be borne in mind when devising screening strategies.

The social desirability of minimising one's drinking in a medical context seems a plausible explanation for the findings, along with the desire to avoid further treatment or referral to substance use services. This is likely to have been potentiated or aggravated by the lack of privacy in some clinics to which the authors refer, which may have contributed to the substantial inter-clinic variation; clearly questionnaires completed on paper and in private may predispose to responses biased differently to those asked out loud in a public reception area. How much difference consistent assurance of privacy would have made is unclear; there may still have been a residual desirability bias.

The study was conducted in a health care system which in US terms comes close to the principles of the British national health service, with the notable exception that it is for ex-military, meaning also that it sees few women, though for this study they were over-sampled to increase the numbers.

What do we really know about screening?

Studies in which the screening test being evaluated and the benchmark (and usually more detailed) test are both completed in similar circumstances (eg, both in a surgery and seen by clinicians, administered anonymously, or in confidence by researchers) cannot tell how many risky drinkers would be missed by the evaluated test in routine practice. For instance, a US study using a confidential survey found single questions about either how often or how much someone had drunk correctly placed over 80% of drinkers in problem and non-problem categories – seemingly reassuring, but only when the questions are being asked in a research context where the answers carry no or minimal consequences for the respondents. The high chance that answers will otherwise be at least partly self-serving is why researchers have developed screening instruments which are not obviously about drinking or drug use.

A review of the AUDIT test published in 2007 which included mainly primary care studies also raised the issue of the validity of a test whose results are "entirely a function of the respondent's ability and willingness to provide accurate information on his or her use of alcohol and its effects". The reviewers' doubts had been raised by a study of transport sector employees attending a company health service for a routine health examination, in the course of which they were offered an alcohol screening test. Together with blood tests, 22% of patients screened as risky drinkers on one or other test, but AUDIT alone would have identified only half of these.

The English SIPS trial of screening and brief intervention in primary care took in only patients who had already screened positive in the short tests being evaluated and compared these with results from the full AUDIT questionnaire. That there was good agreement between the measures cannot in these circumstances tell us how many patients were missed because they did not initially screen positive.

Another study conducted in Welsh primary care practices did not suffer from the same limitation, yet still found the AUDIT questionnaire an accurate way to identify hazardous drinkers. This study did sample patients who initially scored negative on an AUDIT administered by research nurses but scored by practice nurses, and tested whether these results matched those from a more extended assessment conducted by researchers. In all 69% of patients found by the extended assessment to be drinking in a hazardous manner had also scored this way on the AUDIT, suggesting that even when practice nurses see the results, the test is moderately good at identifying drinkers who might benefit from advice. However, in this study under a fifth (18%) of AUDIT-negative patients invited to join the study did so, leaving the AUDIT unvalidated on over four fifths of patients who answered its questions in a way which indicated they had no appreciable drinking problems.

VA brief intervention studies

Also in the Effectiveness Bank is a test of using computerised prompts to remind VA clinicians to counsel patients who screen positive for risky drinking. Although this found low counselling rates and no improvement in drinking where the reminders were implemented, another implementation of the same system in VA clinics did find high counselling rates and some indication of drinking reductions. Early results from this study were reported in a review of performance measurement options for VA alcohol screening and brief intervention systems. Also available is an overview of issues and findings in respect of implementation of similar systems in the VA network nationally. In the Effectiveness Bank too are a review of what impedes or promotes the implementation of brief alcohol interventions by the VA research team, and another conducted for Britain's National Institute for Health and Clinical Excellence. The latter analysis includes extended commentary on the UK situation, partially replicated in a 'hot topic' entry discussing whether brief alcohol interventions really can deliver population-wide health gains.

Thanks for their comments on this entry in draft to Robert Patton of the National Addiction Centre in London, England. Commentators bear no responsibility for the text including the interpretations and any remaining errors.

Last revised 07 February 2013. First uploaded 03 February 2013

