Abstract
Warnings about the risks of literal-minded automation—a system that can’t tell if its model of the world is the world it is actually in—have been sounded for over 70 years. The risk is that a system will do the “right” thing—its actions are appropriate given its model of the world, but it is actually in a different world—producing unexpected/unintended behavior and potentially harmful effects. This risk—wrong, strong, and silent automation—looms larger today as our ability to deploy increasingly autonomous systems and delegate greater authority to such systems expands. It already produces incidents, outages of valued services, financial losses, and fatal accidents across different settings. This paper explores this general and out-of-control risk by examining a pair of fatal aviation accidents which revolved around wrong, strong and silent automation.
Literal-Minded Machines
Warnings about the limits of automata (algorithms embodied to carry out activities on their own when authorized by another party) began as soon as scientific progress led to tooling that enabled widespread development and deployment of automata into dynamic and risky worlds. Wiener (1950) highlighted the risks of literal-minded machines: systems that can’t tell if their model of the world is the world they are actually in. As a result, the system will do the “right” thing, in the sense that its actions are appropriate given its model of the world, even though it is actually in a different world, producing unexpected/unintended behavior and potentially harmful effects (Woods & Hollnagel, 2006, chapters 10/11).
Technology advances our ability to deploy increasingly autonomous systems and our willingness to delegate greater authority to these systems (to expand the scope of authority where the system acts on its own). Risks associated with literal-minded automated systems and sub-systems loom larger every day, already having caused incidents, outages, financial losses, and fatal accidents across different industries. We see this in, for example, vehicle path control, medication infusion, plan following, stock trading, and vehicle-to-vehicle separation.
When assumed and actual worlds do not match, automated systems will misbehave, taking actions that are inappropriate and possibly dangerous. Some supervisory role has to recognize the behavior as inappropriate or dangerous and intervene by re-directing the automation. Stepping from a monitoring role into an active role in a developing non-normal or abnormal situation is a difficult shift. If the behavior and configuration of automated systems is opaque (What is it doing? What will it do next? What is the configuration currently controlling key parameters or processes?), and if the mechanisms for re-directing parts of the suite of automation are clumsy, then the integrated system has a built-in vulnerability to breakdowns where the automation is strong, silent, and wrong.
Of course, automata can function alone, but limits like literal-mindedness arise and complexity factors produce surprising challenges that exceed an automaton’s competence envelope (Maguire, 2024a, 2024b). Suites of automata can be designed to coordinate with supervisory roles to produce a more robust and resilient multi-agent integrated system (Eraslan et al., 2020; Farjadian et al., 2021; Morey et al., 2020). Suites of automata can also be designed to be cooperating agents in a shared activity space with other human and machine agents when disruptions occur and spread (Johnson et al., 2014, 2017; Maguire, 2024a, 2024b). How this is done is beyond the scope of this paper. Why this knowledge is used so little remains a troubling technical, psychological, and social problem debated elsewhere (Woods, 2021).
Strong, Silent, Difficult to Direct Automation in Aviation
The adaptive pattern above played out in the introduction and expansion of flight deck automation in the 1980s. The effects and changes over time have been well studied in research led substantially by NASA (supplemental material on Flight Deck Automation Studies). Driven by reports of pilot difficulties, incidents, and accidents involving modern (or modernized), highly automated flight decks, this research yielded insights into what has become known as “mode awareness” and “automation surprises.” These denote a breakdown in human-machine coordination that could be traced to the new “strong, silent, difficult to direct” suite of automated sub-systems on the flight deck (Woods & Sarter, 2000, p. 329).
“Strong” is used to refer to the authority delegated to an automated (sub-)system. In the suite of automated systems on modern aircraft, the envelope protection sub-system is “strong” in that it has the authority to take over control of the aircraft from the flight crew if its inputs/internal model suggest the flight is exceeding a safe operating envelope. In this case, the delegation is from aircraft automation design and industry layers, not onboard pilots. Traditionally in aviation this is called control authority: which automated sub-system is driving which aspect of the aircraft, flight path, and flight plan (for example, thrust, pitch, heading, altitude, and changes in these). Other automated sub-systems’ scope of authority for control of path, plan, power, etc. varies based on instructions or directions from the flight crew (via tactical modes or more advanced modes where the automation will fly a maneuver on its own or in part). Within the pilot-delegated scope of authority, the suite of automation will do things on its own. One example that has been a contributor to incidents and accidents is “indirect mode” changes, where automation changes the configuration of the automation, going beyond the specific pilot supervisory input to the automation. This issue led to the discovery of mode awareness as a contributor to automation surprises (Sarter & Woods, 1995).
“Silent” is used to refer to low observability: the form and quality of the feedback between human (and machine) agents about how the automation is configured to fly the aircraft (transitions in flight path control, flight maneuvers, and flight plan) and how the aircraft actually behaves. Observability is about the ability to see events and activities ahead in the near future, especially transitions (external events, internal configuration changes). As Earl Wiener famously put it in 1989, observability is how smoothly human supervisors of automation can answer the questions: What’s it doing now? What is it going to do next? When unexpected events occur (in automation configuration/control, aircraft behavior, or external events), observability also covers how smoothly human supervisors can answer the questions: Why did it do that? How did we get into that mode? Past research has shown that observability of the behavior and configuration changes of the automation suite can be low, and that deploying high autonomy/high authority automata requires more sophisticated design for observability (Sarter, 2002).
“Difficult to direct” refers to how smoothly the design allows the flight crew to modify automated system configuration/behavior as conditions and priorities change given the tempo of operations, that is, how the flight crew manages the automation as the automation controls the aircraft (for example, instruct, program, configure, override, and change scope of authority). Re-directability refers to how smoothly human supervisors of automation can answer questions such as: How do I get it to do what I want? How do I stop it from doing this? Effective supervisory control requires mechanisms for re-directing automation prior to reversion to direct/manual control (note: as automation grows more powerful and central to operations here and elsewhere, reversion to manual control becomes more difficult or even impossible).
When Sensor Inputs to Strong, Silent, Difficult to Direct Automation Go Bad
The ways in which literal-minded automata contribute to incidents and accidents can arise from many sources (supplemental material on Sample Accidents). One is hidden interdependencies in software (e.g., as can be seen in radiation mis-administrations, medication mis-administrations via automated infusion devices, and runaway automation in financial trading). Another is bad inputs to automated systems operating with high authority. Several commercial airliner accidents in the past decade have been linked directly to automated system sensor failures. A compelling example is formed by the twin Boeing 737 MAX accidents, which together killed 346 people in 2018 and 2019.
In the latest modification to Boeing’s 737 series of aircraft, a maneuvering characteristics augmentation system (MCAS) was added to the existing suite of automated systems and given high command authority over pitch-up excursions (designed to bring pitch back down to a normal range via control of the horizontal stabilizer). Another system already existed to control pitch via the elevator. The design evolved to act twice as strongly as the human pilot: for every pilot input gain on the elevator, MCAS makes double that gain in the opposite direction on the horizontal stabilizer. MCAS also adjusts the stabilizer at a rate faster than can be countered by pilots using electric trim switches on the control column.
Repetitive activation was built into MCAS, which represents persistent high control authority: while it could deactivate after 10 seconds, it would reactivate 5 seconds later. As long as MCAS received sensor input in its trigger range, it would continue to reactivate, trying to push the nose of the aircraft down. Before the first accident in 2018 (Lion Air Flight 610), pilots did not know about the addition or operation of MCAS. MCAS was not described by name in manuals or training.
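To make the persistence of this cycle concrete, the following is a minimal toy sketch in Python. It is not the MCAS control law. Only the 10-second activation and 5-second pause come from the description above; the function name, the one-reading-per-second timing, and the trigger threshold are assumptions made purely for illustration.

```python
ACTIVE_SECONDS = 10      # from the text: an activation could last 10 seconds
PAUSE_SECONDS = 5        # from the text: reactivation followed 5 seconds later
TRIGGER_AOA_DEG = 14.0   # hypothetical threshold, NOT a real MCAS parameter


def activation_timeline(aoa_readings, trigger=TRIGGER_AOA_DEG):
    """Return one boolean per second: True while the toy controller
    commands nose-down trim.

    aoa_readings holds one (possibly faulty) sensor value per second.
    The controller is literal-minded: it never questions the reading.
    """
    timeline = []
    active_left = 0   # seconds remaining in the current activation
    pause_left = 0    # seconds remaining before reactivation is allowed
    for aoa in aoa_readings:
        if active_left > 0:
            active_left -= 1
            timeline.append(True)
            if active_left == 0:
                pause_left = PAUSE_SECONDS
        elif pause_left > 0:
            pause_left -= 1
            timeline.append(False)
        elif aoa > trigger:
            # Start a new activation; this second counts as the first one.
            active_left = ACTIVE_SECONDS - 1
            timeline.append(True)
        else:
            timeline.append(False)
    return timeline


# A sensor stuck at a high value keeps the cycle going indefinitely.
stuck_sensor = [20.0] * 40
commands = activation_timeline(stuck_sensor)
```

With a reading stuck inside the trigger range, the toy controller commands nose-down trim for two out of every three seconds, for as long as the bad input persists. Nothing in the loop asks whether the reading is plausible.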
MCAS control software operated based on input from Angle of Attack (AOA) sensors, which measure the angle between the wing and the oncoming airflow. However, MCAS used only one sensor input. Individual AOA sensors, like all sensors (especially those in harsh environments), can have reliability issues. Normally sensor design adds some mix of redundancy, diversity, and software checks to ensure accurate inputs, detect when inputs may be inaccurate, and inform supervisors of the need to intervene, redirect, or take over. But MCAS was not part of the safety case for the modified 737 MAX; why would the extra steps for extra reliability matter? [Note: Boeing was actively minimizing the chance MCAS would become part of the safety case or need extra testing for safety implications. However, the same assumption has arisen before other accidents occurred, with evidence of a risk downplayed or ignored for items not on the safety list (for example, the Columbia Space Shuttle accident).]
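As an illustration of the kind of software check described above, here is a minimal Python sketch of a two-sensor cross-check that withholds a value and raises an alert when the sensors disagree. It is a generic example of redundancy checking, not Boeing's design. The names `AoaCheck` and `cross_check_aoa` and the disagreement tolerance are hypothetical.

```python
from typing import NamedTuple, Optional

DISAGREE_LIMIT_DEG = 5.0  # hypothetical tolerance, chosen only for illustration


class AoaCheck(NamedTuple):
    value: Optional[float]  # validated angle of attack, or None if untrusted
    disagree_alert: bool    # True when supervisors should be told to intervene


def cross_check_aoa(left_deg: float, right_deg: float,
                    limit: float = DISAGREE_LIMIT_DEG) -> AoaCheck:
    """Compare two redundant AOA readings.

    If they differ by more than `limit`, return no usable value and raise
    an alert instead of letting downstream automation act on a bad input.
    Otherwise return the average of the two readings.
    """
    if abs(left_deg - right_deg) > limit:
        return AoaCheck(value=None, disagree_alert=True)
    return AoaCheck(value=(left_deg + right_deg) / 2.0, disagree_alert=False)


agreeing = cross_check_aoa(4.0, 5.0)     # readings agree: average is usable
disagreeing = cross_check_aoa(22.0, 4.0)  # readings disagree: alert, no value
```

The design choice worth noting is that the check has two outputs. It both protects the controller from a bad input and tells the human supervisor that something is wrong, which addresses the "silent" part of the problem as well as the "wrong" part.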
Another widespread assumption was present here and in the reactions to the twin accidents. Engineering considered only the controller design/software as “the MCAS system,” treating sensors, sensor reliability, alerts, and especially supervisory control features as outside it. This confusion about system boundaries and about what constitutes the integrated whole for deployed automation is common for autonomy and AI as well. In this case, the narrow perspective meant many claimed “the automation” didn’t fail. This tendency to narrow what is “the automation or the autonomy” hides the true integrated system and its complexity, overestimates reliability, and underestimates the need for resilient supervisory control (Woods, 2016).
The danger of a literal-minded machine was present: MCAS acted according to its model of the world, which was mismatched to the actual world. In the real world, aircraft attitude was normal; in MCAS’s model, the aircraft was pitching up, which dictated a vigorous response. In the meantime, the situation in the cockpit followed the classic automation surprise pattern, with multiple lines of cognitive work interwoven as the tempo of operations increases, uncertainty is high, and danger is increasing while opportunities for recovery are vanishing (Sarter et al., 1997; Woods & Patterson, 2001; Woods & Sarter, 2000). Given the development/design/testing policy that MCAS operation was not safety-related and that single sensor channel reliability was sufficient, flight crews were not provisioned with knowledge or signals to determine that MCAS was driving the abnormal behavior. Nor did they have knowledge or guidance on how to stop an automatic control system they did not know existed. In the first accident, these decisions meant the flight crew had no help to understand and intervene successfully to counter MCAS’s high command authority as the situation deteriorated.
As seen vividly in the second MCAS-driven accident, Ethiopian Airlines 302, stopping MCAS activation and counteracting the effects of its mis-control was, in part, cumbersome and in part, indirect. Ultimately, the limited alternative means for recovery/regaining control were not strong enough to match the size of the disturbances created by the misbehavior of MCAS—the pilots did not have sufficient control authority to recover without turning MCAS back on. Once re-activated, the cycle of persistent over-control occurred. This “fight for control” between the crew and the misbehaving automation has precedents in other aviation accidents (including one where different parts of the suite of automation controlled the aircraft at cross-purposes—Dutch Safety Board [DSB], 2010).
The recommended procedure, developed after the first accident, called for the pilots to use the manual stabilizer trim wheel to recover control. This was not effective because of the speed of the dive that MCAS had commanded (manual control inputs could not overcome the aerodynamic forces on the horizontal tail). This meant the only option was to switch the electric trim system back on which automatically re-activated MCAS—still programmed to take “strong and persistent” actions given the erroneous input.
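The coupling described here can be shown with a deliberately tiny Python sketch. It is a hypothetical two-state toy, not the real 737 MAX trim architecture. It encodes only the relationship stated above, that switching electric trim back on also re-enabled MCAS, and the class and function names are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyTrimSystem:
    """Hypothetical two-state model of the trim configuration."""
    electric_trim_on: bool = True

    def mcas_can_act(self) -> bool:
        # Assumption taken from the text: MCAS is re-enabled whenever
        # the electric trim system is switched back on.
        return self.electric_trim_on

    def pilot_trim_options(self) -> List[str]:
        # The manual wheel is always present; electric switches need power.
        options = ["manual trim wheel"]
        if self.electric_trim_on:
            options.append("electric trim switches")
        return options


def states_with_electric_trim_but_no_mcas() -> List[ToyTrimSystem]:
    """Search every configuration for one that gives the crew electric
    trim while keeping the misbehaving automation switched off."""
    candidates = [ToyTrimSystem(True), ToyTrimSystem(False)]
    return [s for s in candidates
            if "electric trim switches" in s.pilot_trim_options()
            and not s.mcas_can_act()]


safe_states = states_with_electric_trim_but_no_mcas()  # empty list
```

In this toy model no configuration offers the crew powered trim without also handing authority back to the automation. That is the sense in which the design was difficult to direct: the only redirection mechanism was all or nothing.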
Information about angle of attack sensor problems was also kept from most crews. A Boeing authorized representative (AR) had questioned in 2015 whether MCAS was vulnerable to single AOA sensor failures, but this was dismissed by Boeing test pilots, and the aircraft was nevertheless delivered with MCAS dependent on only a single AOA sensor input. Even if MCAS software had been provisioned to report AOA sensor disagreements, from the beginning of deliveries, 80% of the 737 MAX fleet worldwide was flying around with an inoperable angle of attack disagree alert. Boeing had known this since 2015, but the FAA learned of it only after the first MAX accident (Lion Air).
Fundamentally, the design for supervisory management of MCAS as part of a suite of automated systems for different aspects of flight was virtually non-existent. In other words, the twin 737 MAX accidents are a compelling exemplar of the risks from poor design of supervisory management when high authority/high autonomy automation misbehaves. It should not take 346 lives to get engineering and engineering management to understand the fundamental vulnerability highlighted by these accidents. Why? Because similar events have happened before and because much of the knowledge for joint system design has already been developed and demonstrated (supplemental material on Sample Accidents and, e.g., Johnson et al., 2014, 2017; Schraagen et al., 2022).
Conclusion: Misbehavior of Strong, Silent, Difficult to Direct Automation is a General Risk Out-of-Control
Automated systems with high autonomy and high authority will misbehave when factors combine to create a gap between the internal model of the world and the actual events/context going on in the world where the automation is deployed. This risk is inescapable, and individual incidents or accidents involving misbehavior of strong, silent, difficult to direct automation occur regularly as stakeholders deploy increasingly autonomous systems with high authority in dynamic risky worlds (Woods, 2016). However, the risk is seen as an issue to be handled on a case-by-case basis by designers using tools tailored for a specific sub-system/application. Organizational and financial pressures easily overwhelm engineering teams’ ability to address the risk, as in the 737 MAX accidents (Dekker et al., 2022). Furthermore, fundamental research has shown there are hard limits to the achievable robustness of high authority/high autonomy systems. Systems engineering could expand its scope to engage techniques/models for design for supervisory management as part of joint activity of systems of multiple automated and human roles (Johnson et al., 2014, 2017). In addition, methods for designing resilient layered control have been developed recently (Eraslan et al., 2020; Farjadian et al., 2021; Woods & Balkin, 2018).
Equally, stakeholders regularly discount the systemic and organizational lessons from these breakdowns (Woods et al., 2010). Some see each individual incident or accident involving misbehavior of strong, silent, difficult to direct automation as simply a basic design failure with a direct engineering solution (MCAS design could have used standard methods for sensor redundancy/diversity). In hindsight, other analysts determine there was some path available to escape the deteriorating situation and conclude, therefore, that no systemic lessons or changes are needed (as Boeing continues to assert on 737 MAX accidents). Ironically, these analysts operate with knowledge/time/resources unavailable to supervisors responsible for a risky system. Supervisors in the scene face uncertainties, overload, and pressures that after-the-fact analysts miss or underplay.
The solution is straightforward. More robust and resilient control is indeed possible if and only if stakeholders recognize the risk as fundamental and expand the systems engineering concepts/techniques. All parties engaged in deploying high authority/high autonomy systems/sub-systems should prioritize misbehavior of strong, silent, difficult to direct automation as a top level failure mode in their risk analyses, testing programs, and assurance processes.
Unless and until this shift is normative, formally and informally, the race to deploy high authority/high autonomy systems will be accompanied by incidents/accidents driven by misbehavior of strong, silent, difficult to direct automation. Underlying science has shown that the risk arises from architecture/design decisions and has shown the way to more resilient system architectures (Woods, 2018, 2024; Nakahira et al., 2021). There are pragmatic means available to achieve gains from high authority/high autonomy systems while also provisioning safeguards against wrong, silent, and strong automation. But utilizing the knowledge requires a substantial shift at organizational levels to re-balance/re-prioritize the trade-off between maximizing short-term gains while discounting longer-term risks of autonomy. The short-term financial costs of the joint systems and resilience engineering methods do not outweigh the long-term risks, as exemplified by massive financial losses in some cases (e.g., Knight Capital case in supplemental material on Sample Accidents) and by the deaths of 346 people in the twin Boeing 737 MAX accidents.
Supplemental Material
Supplemental Material for Wrong, Strong, and Silent: What Happens when Automated Systems With High Autonomy and High Authority Misbehave? by Sidney W. A. Dekker and David D. Woods in Journal of Cognitive Engineering and Decision Making.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
References
