Week 18: Resilience, Part Three (First Morning)

Some people idolize sports figures, watching every game, memorizing stats, wearing replicas of their jerseys or clothing with their team’s emblems.  Others follow politicians or entertainers, following them in the media, keeping up with their projects, hanging on their every word.  I, on the other hand, idolize human factors researchers, poring over their books and journal articles and following their most recent work.  This morning, in Lisbon, I found myself surrounded by my version of rock stars.

The opening plenary was an engaging discussion of how the language of safety (defined as the absence of risk) and its negative connotations have, over time, limited the ‘operating space’ of safety practitioners, with resilience presented as one possible antidote.  The morning break (during which I lurked on a conversation between David Woods and Sidney Dekker, two of my favorites, swoon) was followed by lectures from practitioners who had applied the tools in the field.   The program included a lecture by Atsfumi Yoshikawa, a senior engineer with the Japan Atomic Energy Agency who was on site at Fukushima Dai-ichi during the 2011 earthquake.

For those who may not remember, Fukushima Dai-ichi is a nuclear power plant located on the east coast of Japan.  Positioned in an idyllic fishing region 160 miles (260 km) north of Tokyo, the plant consisted of six boiling-water reactors. Prior to the earthquake, three of the six reactors had been shut down in preparation for refueling.  After the earthquake, a 9.0 temblor centered 45 miles off the coast, the three operational reactors also shut down (scrammed), leaving the plant unable to generate the power required to run the coolant pumps.  The emergency diesel generators kicked on, but these were located in low-lying areas and all failed shortly afterward, inundated by the subsequent tsunami*. This left the plant with no way to circulate the water needed to dissipate the fuel rods’ heat, and if cooling could not be re-established the rods would have become hot enough to melt in a matter of days.  Dr. Yoshikawa’s presentation was a first-hand account of the days and weeks immediately after the earthquake and tsunami.

Dr. Yoshikawa began with an overview of the circumstances, then went directly to their experiences on the ground.  They had known it was bad, so they asked for volunteers to remain behind to try to recover the plant.  Everyone who wanted to leave was given the opportunity to do so, and he had been surprised by how many stayed.  He described the psychological conditions they were working under: no one on the team expected to survive, and their first act as a team was to take photos of each other for their families, all the while not knowing whether their own families had survived the fifteen-foot-high wall of water.  He reported that since no one had considered a failure of this magnitude, the available checklists were useless, and he and his team were left to improvise a response.  He told of the difficult physical environment: work had to be performed in cumbersome protective suits, and tasks had to be planned around short exposure intervals to limit radiation risk, all with limited access to food, water, and sanitation.   He even admitted (bravely, to my mind) to disobeying his superiors and ordering fire trucks to spray water directly on the facility, the very act that prevented a meltdown.

But he did not stop there.  As both an engineer and a practitioner, he was in a unique position to reflect on the assumptions and decisions that led to the situation he and his colleagues faced.  He described how the conditions far exceeded any worst-case scenarios reviewed during safety analyses, analyses that in some cases he had chaired. He shared that the traditional view of safety, defined as freedom from unacceptable risk, had led them to discount the flexibility and resilience human operators add during contingencies as they respond to degrading (or in their case, failed) system conditions.  He admitted his team’s successful resolution of previous events had led them to believe the system was more resilient than it was, or ever could be.  He reviewed the ‘iceberg’ model (the idea that most threats reside unseen below the surface of everyday operations) and suggested that during emergencies, threats multiply and activity intensifies, creating a set of required tasks that may exceed what can be handled and increasing the potential for failure. He also observed that disasters show social systems for what they really are. He ended his presentation with an apology, to his superiors, to his countrymen, and to the citizens of the world, for allowing this event to occur at all.  We were all blown away by his humility and honesty, and once he had finished we sat in hushed silence.

So how can you follow that?  You can’t, so we broke for lunch.

More soon!

*This flooding scenario had been seen at hospitals during Katrina and would play out again during Sandy and Irene.  It is my understanding that since Sandy, hospitals and other critical infrastructure have been encouraged to relocate their back-up power sources to higher ground.  If you are curious how this is proceeding in your community, I encourage you to contact your Emergency Manager.

Week 18: Resilience, Part Two (Lisbon)

I wasn’t expecting so many surfboards.  To be honest, I wasn’t expecting surfboards at all.  But as soon as I saw them they made sense; Lisbon is a sheltered Atlantic port north of Gibraltar, of course they would have great waves.  Alas, no surfing for me, I was here for the Resilience Engineering Symposium.  But to be honest, at this point I was feeling anything but resilient.

At some point in transit I realized I had not planned this trip well.  In fact, I recognized I had not been making good decisions for a while, possibly due to my aggressive travel schedule, extended encounter with high elevations, or the resulting fatigue from either or both.  Long story short: by the time I reached Lisbon I had gotten eight hours of sleep over the previous four days and I was beat.  Even better: I had left myself no daylight to recon the Symposium’s location (a small school in the suburbs).  Fortunately I did have a nice room in a nice hotel.  I pulled the blinds and hit the sack.

The idea to attend this conference had come to me during a hike in New Mexico.  It had been a new (to me) trail, east and north of where I usually hiked, through piñon groves nestled along the slopes and canyons below an upscale housing development.  “Don’t the resilience guys have their conference this year?  Is that something I would want to do?”  I was excited, because this was the first thing even remotely professional that I’d wanted to do since leaving Connecticut.  Back at the lodge, I clandestinely broke internet silence to check the location and dates.  Woo Hoo, there was still time for me to register and book travel!  I pondered this thought during the next week (silent retreat), and once back on the road (read: experiencing sea level oxygen for the first time in weeks) I made the arrangements.

Over the years, I’ve learned there are many different factions within the safety community.  There are the System Safety traditionalists, who use tools (such as FMEAs and fault trees) developed in the 1950s to support the mechanistic systems of that age, deconstructing and analyzing systems and their hazards at the component, subsystem, and system levels. There is the Human Reliability Analysis (HRA) cohort who, following Three Mile Island, created methods to include human performance factors in safety analyses. (These are often dismissed by the traditionalists because they present performance ranges rather than precise failure probabilities.) The Shuttle Challenger and Chernobyl accidents brought us ‘Normal Accident Theory’, the premise that humans induce instability in systems, so to increase safety designers need to control (or design out) human inputs to the greatest extent possible.  In counterpoint, the High Reliability community believes it is systems that are inherently unstable, that human operators perform near-constant local adaptations to ward off disaster, and that we should develop methods and structures to understand and support these interventions.  Westrum raised the concept of organizational safety cultures spanning a spectrum from generative to bureaucratic to pathological. Crew resource management taught us to improve our communications, and Rochlin explained that safety is a social construct, an agreed-upon balance between production and protection that a community takes on over time. Then there is another group, one that suggests we should not design systems only ‘to the scene of the accident’ but beyond it: not just designing out or mitigating known hazards, but also including elements to sustain or recover operations in the event of unexpected disruption.  These are the Resilience Engineers, the cats I had traveled to Portugal to meet.

So I woke the next morning (not quite) bright-eyed and bushy-tailed (is it a coincidence the Portuguese term for ‘wake-up call’ is ‘despertar’?), eager for the day ahead of me.  I got up, brushed my teeth, did my hair, and put on some ‘grown-up clothes’.  After breakfast, I made the trek down the cobblestone sidewalks (in princess heels, whose idea was THAT?) to the Metro station where, after a quick stop for some Euros, I found the Yellow Line and hopped on a train.  Three stops and I was off… to a neighborhood that looked nothing like the one I had scouted on Google Maps.  After some queries I was back on the train, this time six stops in the opposite direction.  Once at the top of the steps it looked like the right spot.  But I had trouble orienting myself, and after wandering in what looked like the proper direction for what felt like the proper length of time, I was hopelessly lost.  But no fear: there was a lobby, with what looked like a front desk. I stepped in, and what did I see?  Violet and white Symposium posters!  Somehow, despite my best efforts, I had reached my destination!

Once badged, I entered the auditorium and found a spot towards the back.  On stage was a short, thin, greying man describing fascinating things:  the negative language and frames used by many safety professionals, quantum weirdness, and accommodating variability within a system.  Heaven!  And this was just the beginning.

Well, it was a busy three days so I will take a break here.  More soon!