Resilience, Part Eleven (Questions)

“We, as a community, need to identify the core values of our field.”

The talk’s theme was health care, and it had begun with a history of medical error.  Human error, it turns out, is a relatively recent phenomenon in medicine.  For years, the presenter noted, the term in use had been ‘risk’, and risk had been framed as the price we pay for medical progress.  Alongside this ran a shared understanding that the dangers of new methods had to be accepted, that benefits and risks were inseparable, and that loss rates were, in any case, remarkably low.  These views had persisted even after the 1974 publication of Ivan Illich’s Limits to Medicine, when the author’s comparison of medical morbidity rates to those of traffic and industrial accidents was considered an attack on the field.  But slowly, beginning in the late 1980s, this belief began to shift.

The shift, the ER physician and researcher explained, had begun for several reasons.  The first, of course, was the profession’s desire to improve.  Another was the annexation of processes from other domains: TQM from business, CRM from aviation, root cause analysis methods from engineering and design.  But, the presenter theorized, one of the main drivers of the shift from ‘risk’ to ‘error’ was the increasing industrialization of medicine, and the resulting move from physician-led hospitals, seen as a greater good, to MBA-led institutions and their focus on shareholder value.  “Technocratic, ‘scientific-bureaucratic’ managers” (as he called them) strove to use standardization to improve scheduling and economic efficiency, and used ‘error’ to enhance their authority by undermining clinical expertise.

But, he continued, hospital operations are surprisingly complex.  These operations, for the most part, succeed; and succeed in no small part due to the everyday adaptations of doctors, surgeons, nurses and other expert caregivers.  These experts, he suggested, know how to improve the system, and health professionals should find a way to maintain control of healthcare and enable the evolution of improved practices.  Key to this, he stated, is that health professionals, as a community, need to identify the core values of their field.

I perked up in my seat. I had been a safety professional for ten years, and this was the first time, other than the occasional professional group’s mission statement, that I had considered what the core values of the safety field might be.  To do no harm?  To maximize good?  To advance the field?  Do they include day-to-day actions, such as individual courage, or treating others with dignity?  And how do these values apply in fields such as mining, military operations, or autonomous systems (including artificial intelligence), where proper use of the product fundamentally and permanently alters the environment?  My mind swirled in delight.

After a quick break and a talk on defining the resilience engineering (RE) problem space and developing a framework for RE tools, we moved on to an open forum to share our views on the Symposium.  The week’s presentations had generated many trains of thought:  How do we engineer systems to be resilient?  Do we need control of our systems for them to be resilient?  How does resilience work in systems that are already in place?  How do changes to existing systems affect resilience?  How can we take advantage of existing resilience?  We batted these questions around, and more, in a thoroughly engaging conversation.  Then came my favorite idea, from a lion of the field: “If we, as system designers, are relying on emergency procedures as the last control between system instability and failure, we need to give more respect to our operators.”

For those not familiar with the design and build processes, there is a structure used when safety analyses identify issues with a design.  The structure, known as the system safety order of precedence, states that when a hazard is identified during the design process, the preferred option is to revise the design to eliminate the risk.  If this can’t be done, the risk should be reduced by selecting the least risky design alternative, or by designing in redundancies or fail-safe features that reduce the probability the hazard will occur.  When this still does not eliminate the risk, barriers or controls designed to limit its spread or escalation are included, along with periodic checks in the operating instructions to ensure these features are working effectively.  If design changes or safety devices cannot be counted on to effectively reduce a risk (or are considered impractical), warning signals, placards, training, and routine and emergency procedures are implemented to counter the shortcomings of the design.  At that point, the adaptability of front-line operators becomes the key design feature counted on to prevent disaster.  Usually this strategy is successful, but in some cases it is not.  And when a front-line operator (or a team of front-line operators) cannot adapt at the pace required to fill gaps in engineering or design imagination, a program manager’s budget, or an operational schedule, they, and not the underlying design or operating processes, are blamed for the accident.  So I sat, grateful that someone else had said out loud the thought I had been pondering for the last few years.
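For readers who like to see the hierarchy written down, here is a minimal sketch in Python of the order of precedence as I described it above.  The tier names, and the Mitigation and select_mitigations helpers, are my own illustrative labels rather than any standard’s official wording; the point is only that warnings, training, and procedures sit at the bottom of the list, reached when everything above them has been judged infeasible.

```python
# A minimal sketch of the system safety order of precedence described above.
# Tier names and the Mitigation/select_mitigations structures are illustrative,
# not the official wording of any particular standard.
from dataclasses import dataclass
from enum import IntEnum


class Precedence(IntEnum):
    """Lower value = more preferred way of dealing with an identified hazard."""
    ELIMINATE_BY_DESIGN = 1      # revise the design so the hazard no longer exists
    REDUCE_BY_DESIGN = 2         # least-risky alternative, redundancy, fail-safe features
    SAFETY_DEVICES = 3           # barriers/controls that limit spread or escalation
    WARNINGS_AND_PROCEDURES = 4  # placards, training, routine and emergency procedures


@dataclass
class Mitigation:
    description: str
    tier: Precedence
    feasible: bool  # e.g. ruled out by budget, schedule, or physics


def select_mitigations(candidates: list[Mitigation]) -> list[Mitigation]:
    """Return the feasible mitigations at the most-preferred tier available.

    If nothing above tier 4 survives, we are in exactly the situation the speaker
    warned about: procedures and operator adaptability become the last control
    between system instability and failure.
    """
    feasible = [m for m in candidates if m.feasible]
    if not feasible:
        return []
    best_tier = min(m.tier for m in feasible)
    return [m for m in feasible if m.tier == best_tier]


if __name__ == "__main__":
    options = [
        Mitigation("Redesign the layout to remove the hazard", Precedence.ELIMINATE_BY_DESIGN, feasible=False),
        Mitigation("Add a fail-safe relief device", Precedence.REDUCE_BY_DESIGN, feasible=False),
        Mitigation("Interlock to contain escalation", Precedence.SAFETY_DEVICES, feasible=False),
        Mitigation("Emergency procedure plus operator training", Precedence.WARNINGS_AND_PROCEDURES, feasible=True),
    ]
    for m in select_mitigations(options):
        print(f"Tier {int(m.tier)}: {m.description}")
```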

Unfortunately, there were no quick and easy answers.  We spoke of adding buffers to increase flexibility, the efficiency-thoroughness trade-off, and the notion of optimizing a system for recovery.  We drifted back to problem domains, then considered the perspective of solution domains.  Someone pointed out that to solve these questions, or even to earn the opportunity to solve them, we would need to increase the credibility of resilience engineering as a field.  And then our time was up, and the symposium was over.  With a promise to meet again, we broke for lunch in the courtyard one last time.
