With a push to make hospitals and doctors more accountable for health care quality, more attention must be paid to the accuracy and reliability of measures used to evaluate caregivers, says a prominent Johns Hopkins patient safety expert.
Writing in the April issue of the journal Health Affairs, Peter J. Pronovost, M.D., Ph.D., a professor of anesthesiology and critical care medicine at the Johns Hopkins University School of Medicine, argues that as the desire to evaluate and improve health care intensifies, there remains little consensus as to which measures are scientifically valid and accurate assessments of quality. This risks misinforming patients, who may make decisions based on metrics that poorly reflect the care hospitals actually provide, he says. It may also mean that hospitals judged to be better than they are ultimately fail to make improvements.
“There is bipartisan support behind efforts to start paying for value rather than volume,” Pronovost says. “This is great, but we act as if there’s a whole library of reliable outcome measures for us to use and the fact is that serious work needs to be done to create them. We can’t shrink from doing this science. We need to be guided by it.”
Acknowledging that substantial shortcomings in the quality of care persist, causing needless patient harm and increasing health care costs, Pronovost says fixes can’t be put in place until rigorous scientific data show exactly where systems are broken and until hard comparative evidence points to what types of repairs work best.
In the absence of such safety and efficacy science, he says, there will remain little consensus among hospitals and physicians about the best methods to judge quality or improvement. For example, he notes that overall hospital death rates are an imperfect reflection of quality of care, but in some cases they are the only measures used.
Pronovost, writing with Richard Lilford, Ph.D., an epidemiologist at the University of Birmingham in England, also points to research that compared four different measurement services used to assess the same data from the same hospitals to determine in-hospital mortality. Forty-three percent of hospitals that showed higher-than-expected mortality by one commercially available metric showed lower-than-expected mortality by another.
Pronovost and Lilford call for the creation of an independent agency, the equivalent of a Securities and Exchange Commission for health care, to create rational and standardized outcome measures similar to the accounting rules the SEC creates for businesses. “The goal is to make the process of determining quality standard and transparent, and make data meaningful for consumers and usable by clinicians, ultimately improving patient outcomes,” Pronovost says.
Doctors support the use of outcome measures if they are valid and reliable enough to enable conclusions to be drawn about the quality of care, Pronovost says. Too often, he says, they aren’t. Hospitals, he notes, once were being fined for hospital bloodstream infections after government regulators screened billing claims for codes signifying infections to calculate infection rates. “That measurement gets it right only one in four times — 25 percent of the time,” he says. “Clinicians have never used that data because they thought that it was useless, because it was useless.”
Now, government regulators make judgments based on Centers for Disease Control and Prevention data, which include lab tests, temperature readings and other signs and symptoms of infection — far more accurate measures, he says.
Meanwhile, Pronovost says some states penalize institutions for what they deem to be preventable complications that patients develop during their hospital stays. But, he says, hospitals don’t know exactly what they are being judged on because those states use a proprietary algorithm created by a private company to determine which hospitals are “successful” and which ones should be sanctioned. Clinicians and the public know neither how accurate the measures are nor how they were calculated.
“The process should be transparent and reproducible; instead it’s a black box,” he says. “We don’t know if it gets it 5 percent right or 95 percent. We ought to know how imperfect it is.”
In a second article also published in Health Affairs, Pronovost, along with Jill A. Marsteller, Ph.D., of the Johns Hopkins Bloomberg School of Public Health, and Christine A. Goeschel, Sc.D., R.N., M.P.A., M.P.S., of the Department of Anesthesiology and Critical Care Medicine, presents a case study of a success story in measuring outcomes: central line-associated bloodstream infections.
The group outlines the Hopkins-led effort to virtually eliminate bloodstream infections in intensive-care units throughout Michigan, a program that has reduced the number of deaths and is now being implemented in nearly every state. Infection rates have fallen dramatically in ICUs where Pronovost’s cockpit-like checklist and culture-of-safety program were implemented. This is one of the few, perhaps the only, national success stories documenting measurable improvements in patient outcomes. Data from the first 22 states that have participated in the program for over a year suggest infections have been reduced by approximately 40 percent.
“There are precious few outcome measures deemed valid by clinicians,” he says. “This is one of them.”
Stephanie Desmon; 410-955-8665