March 15, 2014 - By Steven E. Greer, MD
In 2009, Atul Gawande, MD, MPH, and his large international team published an observational study in the New England Journal of Medicine (NEJM) that showed a significant reduction in deaths and “complications” after non-cardiac surgery. The World Health Organization (WHO) created the checklist used in the NEJM paper. After this non-randomized, non-controlled, observational study was published, entire nations adopted the surgical checklist system.
Now, in 2014, a population study drawing from Ontario surgical patient data, published in the NEJM, showed no significant benefit from the widespread adoption of the same WHO surgical safety checklist that Dr. Gawande popularized. This study was also observational, but it was stronger than the 2009 Gawande study in that it included the entire population within a region.
What went wrong? Are surgical checklists in the real world ineffective, or was the Ontario study flawed?
Dr. Gawande posted his thoughts about the recent study in a blog post. Critical of the Ontario study, he wrote, “So what to make of the Ontario finding that three months after government-mandated adoption the drop in mortality rates failed to achieve significance? Well, I don’t honestly know. I wish the Ontario study were better. But it’s very hard to conclude anything from it.”
Dr. Gawande claims that the Ontario study's statistical power was inadequate. “For one, it was underpowered. They measured a 9% reduction in deaths (from 0.71% to 0.65%). But with only three months of data and lots of cases that have virtually no mortality (20% were ophthalmology, for instance), the study didn’t have sufficient sample size to tell if this reduction was significant or not. (The p value for the rate difference was 0.07 in the paper—see Table 2—so it was trending toward significant.)”
The Ontario authors dispute that, stating, “Our inability to replicate these large effects cannot be explained by inadequate power. Our study included more than 200,000 surgical procedures in 101 hospitals. Ontario hospitals implemented surgical checklists between June 2008 and September 2010.”
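The disagreement over power comes down to arithmetic that can be sketched in a few lines. The Python below is a rough illustration, not a reanalysis of the Ontario data. It uses the two mortality rates quoted above (0.71% and 0.65%) and assumes, purely for illustration, that the "more than 200,000" procedures split evenly into 100,000 before and 100,000 after checklist adoption. That split, the function names, and the unadjusted two-proportion z-test are all assumptions of this sketch, so its p-value should not be expected to match the 0.07 reported in the paper.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def relative_reduction(rate_before, rate_after):
    """Relative drop in a rate, e.g. 0.10 means a 10% reduction."""
    return (rate_before - rate_after) / rate_before


def two_proportion_z_test(rate_before, rate_after, n_before, n_after):
    """Unadjusted two-sided z-test for a difference between two proportions."""
    pooled = (rate_before * n_before + rate_after * n_after) / (n_before + n_after)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_before + 1.0 / n_after))
    z = (rate_before - rate_after) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return z, p_value


def approx_power(rate_before, rate_after, n_before, n_after, z_crit=1.959964):
    """Approximate power of a two-sided test at alpha = 0.05.

    Uses the normal approximation and ignores the negligible chance of
    rejecting in the wrong direction.
    """
    se = math.sqrt(
        rate_before * (1.0 - rate_before) / n_before
        + rate_after * (1.0 - rate_after) / n_after
    )
    return normal_cdf(abs(rate_before - rate_after) / se - z_crit)


# Rates quoted in the article; the 100,000 / 100,000 split is HYPOTHETICAL.
BEFORE, AFTER = 0.0071, 0.0065
N_BEFORE = N_AFTER = 100_000

z, p = two_proportion_z_test(BEFORE, AFTER, N_BEFORE, N_AFTER)
print("relative reduction:", round(relative_reduction(BEFORE, AFTER), 4))
print("z statistic:", round(z, 3), "two-sided p:", round(p, 3))
print("power for the small observed effect:",
      round(approx_power(BEFORE, AFTER, N_BEFORE, N_AFTER), 3))

# HYPOTHETICAL large effect: mortality cut in half.
print("power if mortality were halved:",
      round(approx_power(BEFORE, BEFORE / 2, N_BEFORE, N_AFTER), 3))
```

With the rounded percentages, the relative reduction works out to about 8.5%, so the 9% in the quote presumably reflects unrounded figures. Under these assumptions, the sketch suggests both sides can be partly right: a sample of this size has limited power to detect a reduction as small as the one observed, yet it would almost certainly detect a large effect such as a halving of mortality.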
Dr. Gawande also criticizes the Ontario study by writing, “Second, measuring results just three months after a government mandate and a weak implementation program—involving no team training, local adaptation of the checklist, or tracking of adoption—would mean that many, many surgical teams were simply not using the checklist.” In saying so, he inadvertently makes the case against checklists, because in his own WHO study, published in 2009, the checklists were not implemented in isolation. They were part of a comprehensive surgical training program, which confounds the results he reported even more.
But neither the Ontario authors nor Dr. Gawande hit upon the biggest flaw of the studies that found significant reductions in adverse events from using the checklist. The data collected are simply unreliable. Studies like the 2009 one, conducted across many different small countries, have been shown to be fraudulent due to systemic corruption.
In the 2009 Gawande study, the investigators collected data from eight countries (Canada, India, Jordan, New Zealand, Philippines, Tanzania, England, and the United States). It is now known that India and Third World countries are magnets for corruption in clinical trials. The outcomes measured, such as surgical complications and death, are so rare that all it would have taken was for one bad principal investigator defrauding the trial system to tip the scales into statistical significance.
Even if the Gawande WHO trial had been conducted solely in the U.S., it failed to blind the doctors in the trial, who were the same ones adjudicating the “complications”. It is basic clinical trial “101”, so to speak, to know that doctors, by human nature, want their trials to turn out well.
Complicating all forms of hospital safety studies is the fact that some human somewhere has to be the one to record the patient temperature for “fever”, or adjudicate a bacterial count in the hospital lab to diagnose a blood sample as “infected”, and so on. Some doctor or nurse has to diagnose a deep vein thrombosis or renal failure (components of the official “complication” list), which are subjective calls.
When an entire ICU floor, for example, is highly motivated and rewarded for having no infections, then there is extreme pressure to simply not collect data that reveal adverse outcomes. This happens everywhere. Police departments have been caught reporting artificially low crime figures because their careers are based on lowering crime.
The 2014 NEJM paper by the Ontario team has a more robust design than the Gawande observational paper, but there still has never been a properly controlled, randomized, double-blinded clinical trial studying hospital safety checklists. The question is not whether a surgical team, slowed down and forced to be prudent by a checklist, would be safer. The question is whether that is pie-in-the-sky dreaming.
In the real world, surgeons think they are infallible, and are also pressured by hospitals to be fast in the OR. In the real world, surgeons associate “checklists” with one man, Atul Gawande, and petty emotions of jealousy or resentment are evoked, triggering passive-aggressive resistance to checklists.
In the real world, checklists might be as feasible as phasing out fee-for-service. In other words, surgical checklists might not be worth the paper they are printed on.