When an Automation Fails in the System, Who Hears? A Response to Skraaning and Jamieson

Abstract

The concept of the “lumberjack effect” addresses important concerns regarding the role of automation (and the impact of automation failure) in complex engineering systems such as aviation, power plant control rooms, and others. Conflicts between “narrow” and “broad” definitions of automation failures uncover logical inconsistencies regarding human-automation interaction processes in maintaining system performance. However, the “lumberjack effect” debate obscures the essential consideration that automation tools exist as components in complex systems with multiple levels of abstraction hierarchies (Rasmussen). A sole focus on automation failure conflates the automation components with the overall system function. Elaborating operating conditions associated with human-automation interaction conflicts and failures supports an improved approach to failure mode and effects analysis techniques for improving system safety. Detection of system failure modes and operating conditions, which is poorly quantified in other risk priority determinations, is an important contribution at the overall system analysis level, not an isolated emphasis on the automation itself.

Keywords

human automation interaction topics cognitive systems engineering human system integration failure modes and effects analysis accident analysis errors mental models system dynamic analysis methods abstraction hierarchies

Introduction

Skraaning and Jamieson’s paper (SJ: Skraaning Jr & Jamieson, 2023) addressing “narrow” and “broad” views of automation failure both provides an important set of insights and exposes some troubling concerns regarding cognitive human factors engineering approaches to human-automation interaction (HAI). An ongoing debate regarding these aspects of automation failure centers around the metaphoric “lumberjack effect”: with more complex and higher levels of automation rather than human task allocation (the “taller the trees”), the impacts of automation failure lead to more consequential and catastrophic outcomes (the “harder they fall”) (Hauptman & McNeese, 2022).

One potentially problematic element of this “lumberjack effect” debate is whether the focus on “automation failure” is the appropriate level of analysis and critique for these elements of system design, analysis, and intervention. This is not to say that automation failures do not occur, or that they are not problematic in their effects on complex systems operations. Further, there are many aspects of complex systems operations where elimination of automation, or restrictions of automated system functions to purely human-controlled teleoperations, would render these systems completely inoperable. Consider, as an exemplar instance, the performance of automated systems in spaceflight exploration and science missions (Caldwell, 2023). Earth-based distributed supervisory team coordination and information flow in human spaceflight operations (let alone fully autonomous and robotic missions) fundamentally requires multiple automated systems to perform accurately and appropriately, and to effectively provide information to support distributed human expertise, for such missions to be accomplished at all (Caldwell, 2005a, 2005b; Caldwell et al., 2007; Caldwell & Onken, 2012).

Spaceflight exploration quintessentially involves extending human awareness, information flow, and task performance into previously unknown domains. Humans do not have the capability to perform these extensions without the benefit of complex automation systems operating in novel and unexpected conditions. The debate about the lumberjack effect has its origins here, in the design phase of the automation systems themselves, and may be laid at the feet of the (human) designers of those systems. Engineering systems operating in novel conditions cannot be designed to be infinitely robust, or capable of anticipating all possible operating conditions (Caldwell, 2014). An automation system with a fixed and limited level of data processing, component response capability, and performance reliability is necessarily limited in its operational range over time. Operating envelopes regarding system component designs and performance capabilities begin with the system-as-designed, with any associated assumptions of potential risks of failure and mechanisms for how to respond to them during the performance of the system-as-implemented (Liu et al., 2013; Sharma & Srivastava, 2018).

An explicit aspect of spaceflight event/fault detection, isolation, and recovery (EDIR/FDIR) anomaly resolution is that engineering components, sensors reporting the state and function of those components, telemetry information flows between components and monitoring systems, and supervisory control interfaces to those monitoring systems can all be sources of a reported anomaly and need to be assessed for correct functioning (Caldwell et al., 2007; Onken & Caldwell, 2009, 2011). Such risk analysis approaches are intended to examine system operating capabilities and responses to potential failures throughout the system development and operational cycles from initial design through operational life and into final retirement (Buede, 2000; Fiksel, 2003; Maier, 1998; NASA, 2007; Suranto, 2015). Automation designer hubris, not the automation system itself, is responsible for any designs that require an automation system to autonomously, completely, and independently function in all combinations of dynamic and unaddressed environmental conditions or system configurations in any complex task setting. In this sense, the height of the tree is not the level of the automation itself, but the hubris of the design engineer and the brittleness of the resulting design to respond to dynamic operating challenges. This issue of system design and risk analysis will be addressed again later in this response.

Logical Versus Empirical Disconfirmation

Two types of hypothesis testing and knowledge advancement are common in the sciences and engineering. Most work in areas of cognitive psychology, human factors engineering, and other branches of engineering operates on the basis of empirical disconfirmation: we set up a hypothesis, collect data, and try to determine if the hypothesis is likely supported by the data collected. Logical disconfirmation can be a much stronger form of knowledge advancement. If we propose an explanation of the world, and it can be logically shown to be inconsistent with or contradictory to other explanations about the same phenomena at the same scale of analysis, we know we have a significant problem where some component of the explanations must be false. This is a fascinating, and potentially powerful, aspect of the SJ discussion. Both the “narrow” and “broad” definitions of automation failure presented by Wickens and critiqued by Skraaning and Jamieson have very concerning logical weaknesses. A broad (liberal) definition of automation failure, which suggests that any misalignment of the operator’s mental model of automated system performance with the actual design or implementation of the automation qualifies as automation failure (SJ, second manuscript page, second column), is exceptionally suspect. The very process of human learning can be defined as a refinement of cognitive models and associations of condition, cause, and effect through increased task-relevant experience. By definition, the “mistakes” class of human error analysis represents conditions where the operator performs an action based on an incorrect understanding of system state or the effect of that action; errors of omission or commission can also result from operator misunderstandings or lack of knowledge about the true system state (Reason, 1988, 1990; Swain, 1990).

A novice student pilot’s failure to understand how an aircraft autopilot operates, or the difference between descent rate and descent angle, is not a failure of the autopilot automation. Even an experienced pilot may be exposed to a new automation technology for the first time, and use an incorrect cognitive model based on how previous technologies have functioned in their prior experience. This would not be seen as automation failure, but as negative transfer of training (Grossman & Salas, 2011; Liu et al., 2008).

Similarly, a very narrow (conservative) definition of automation failure as interpreted (SJ, second manuscript page, first column) seeks to draw hard distinctions between the actual task operations and the “support” features of those task operations (such as information presentation, state projection, or decision support options). This type of component distinction seems to reject the design and implementation of human supervisory control systems and any version of the Sheridan “level of automation” hierarchy supporting system interfaces between human interactive systems and task interactive systems, that is, levels 2–9 on the 1–10 scale (Sheridan, 1987, 1992; Sheridan, 2011; Sheridan et al., 1978). It becomes a logical impossibility to suggest that automation components without their related support and information flow systems represent the full automation system unless and until the engineering system as a whole functions continuously at a Sheridan level 10 automation (i.e., no human input or interaction at all during system operations).

Such results imply that neither the narrow nor the broad definition of automation failure, as discussed, is useful in our study of complex human-automation interactions (HAI). A fundamental defect in the lumberjack effect debate is confusion about how the automation being addressed relates to the system as it operates, and about the functions that such a system is designed to perform, or is in reality capable of performing, in order to support desired goals.

The Automation is Not the System

As described above, automation is an essential contributor to spaceflight exploration, whether or not the humans themselves are in space (Caldwell, 2023). However essential it may be, though, the purpose of spaceflight automation, ranging from German clock drives to on-board state determination and error correction for radio telescope observatories, is not just the creation or operation of the automation itself. In other words, the spaceflight automation is not the entire spaceflight mission, nor does it exist as a standalone entity decontextualized from the purpose of the system. Here, the analysis of any complex system’s automation components must be viewed in terms of the abstraction hierarchy of purpose, function, and physical form, as introduced by Rasmussen (Rasmussen, 1985).

Conceptually, the allocation of functions to automation has long been seen as a potentially valuable method for increasing system safety and reliability. For many decades since Fitts, the question has always been how, where, and when to allocate those functions (Fitts et al., 1951; Fitts & Posner, 1967). Rasmussen’s emphasis on abstraction hierarchies and ecological displays of system operations has been a specific focus on increasing both the transparency of the system’s operation and the elaboration of “maps” of the linkages of the system’s physical forms and component interactions to the purpose and function of the engineering system and its operations in the world (Lind, 2003; Rasmussen, 1985, 1988; Sheridan, 2017; Vicente & Rasmussen, 1992). The power plant examples at the basis of the SJ work directly relate to the abstraction hierarchy work performed by Rasmussen and others (Lind, 2003; Rasmussen, 1988; Reising, 2000; Vicente & Rasmussen, 1992). There is a very substantial problem, though, in confusing the performance of a specific version (“physical form”) of an automation implementation with the purpose or abstract function of an automation subsystem of a complex engineering system (such as a power plant, aircraft, or spacecraft). Further, the strict restriction on which inputs, outputs, and processes are within scope (as implied by the narrow definition of automation failure) is even more problematic when relating a single specific form to a general abstract function of automation.

Unfortunately, this discussion of the abstraction hierarchy of specific automation version implementations and their effects on system operations seems to be missing (at least to this author). In fact, there is a broader consideration of how crucial terms are used in order to support such analyses. It seems without question that the purpose of human factors engineering analysis and research on HAI is to improve the overall reliability of complex, high risk systems that utilize complex HAI, not simply to decompose and point-optimize specific automation components. The distinction of system analysis and component function, however, can be elaborated in the differences in terminology use regarding the concept of “mode” in the SJ and Wickens work, compared to other treatments of systems engineering and risk analysis.

Modes and Modes

A text search of the word “mode” in SJ (Skraaning Jr & Jamieson, 2023) helps to illuminate one of the concerns of an excessively limited view of the discipline when addressing system failure (especially when compared to the authors’ considerations of automation failure). Four instances of the term exist in the paper (with a fifth in the reference section, within the title of the Sarter and Woods paper (Sarter & Woods, 1995) on “mode error” in supervisory control systems). Each of these four in-text uses addresses an aspect of “mode” in the sense of configurations, programs or settings intended to execute a particular type of aviation function (autopilot, final approach flare, or throttle position hold). Sarter’s later work on the topic (Sarter, 2008) indicates that “[m]ode awareness refers to an operator’s knowledge and understanding of the current and future automation configuration, including its status, targets, and behavior” [pg. 506], implying that the aviation “mode” of interest is the functional configuration of the automation-as-implemented (whether or not it faithfully executes the capabilities of automation-as-designed).

It is surprising, then, that a paper addressing fatal accidents in the aviation context does not have any reference to the rich historical tradition, born and developed in the aerospace industry, of risk evaluation or risk assessment known as failure modes and effects analysis (FMEA) (Liu et al., 2013; Sharma & Srivastava, 2018). In the FMEA-common usage, “mode” is not simply a particular automation configuration, but an overall combination of features, functions, and conditions that can or will (or in a post-hoc analysis, did) lead to an overall system failure (not simply the failure of a single component). Failure mode in this sense is more analogous to the overall pattern of holes allowing a situation to progress to a catastrophe in Reason’s “Swiss cheese” model of error (Reason, 1988, 1990), not the characteristics or locations of specific holes.

Also lost in the SJ discussion of modes is a broader consideration of causal and contributing factors in the determination of (human) error and system failures. Many error causation models have been developed since the original Swain conceptualizations (Swain, 1990); major weaknesses in a number of these models include excessive emphasis on a single source of fault, or a single contributing causal factor. The second issue, of single versus multiple causal factors, is addressed in a discussion of the “pinball versus pachinko” model of error causation analysis (Caldwell, 2008; Thomadsen et al., 2003). Is any automation failure addressed in the lumberjack model cases really just a single-cause failure, or a more complex set of interactions of components (including humans), interfaces, and understanding of context (including that of designers and operators)? The answer to such questions is probably not found simply in the discussion of the difference between narrow and broad interpretations of automation failure or empirical assessments of specific automation configurations.

Toward a Discussion of Contexts and Risk Impacts of Human-Automation Interactions

Despite these critiques, this author sees a potentially significant and novel contribution to the systems safety and FMEA literature arising from the lumberjack effect debate. An important aspect of quantifying results of FMEA analysis is the development of “risk priority numbers” (RPNs) addressing probability of the failure mode, severity of the failure, and likelihood of detection of the failure mode as it is progressing (Liu et al., 2013). A frequent critique of RPN calculations is that the determination of failure mode detectability is challenging at best. This is, however, one of the important elements actually provided by the lumberjack effect studies and debate: how well can designers anticipate, or operators and the automation itself detect, the combination of conditions and contexts that represent an impending failure mode (not to be confused with automation mode)? Both simulator studies and retrospective analyses of sentinel incidents (not just fatal accidents, but near misses with operator reports) of the type that SJ and Wickens describe help us to understand where and how our understanding is limited and condition recognition and recovery are impaired.
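To make the role of detectability concrete, the three RPN factors named above can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from this paper: it assumes the common FMEA convention of rating occurrence, severity, and detection on ordinal 1 to 10 scales (with a higher detection rating meaning the failure mode is harder to detect) and multiplying the three ratings. The `FailureMode` class, the `rank_by_rpn` helper, and the two example failure modes are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One hypothetical failure mode with assumed ordinal 1-10 ratings."""
    name: str
    occurrence: int  # 1 = rare, 10 = almost certain (assumed scale)
    severity: int    # 1 = negligible, 10 = catastrophic (assumed scale)
    detection: int   # 1 = almost surely detected, 10 = almost never detected

    def __post_init__(self) -> None:
        # Reject ratings outside the assumed 1-10 ordinal scale.
        for field_name in ("occurrence", "severity", "detection"):
            rating = getattr(self, field_name)
            if not 1 <= rating <= 10:
                raise ValueError(f"{field_name} must be in 1..10, got {rating}")

    @property
    def rpn(self) -> int:
        # Conventional multiplicative risk priority number (assumption).
        return self.occurrence * self.severity * self.detection


def rank_by_rpn(modes):
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


# Hypothetical example values, not data from any study.
modes = [
    FailureMode("actuator hard fault with alarm", occurrence=3, severity=8, detection=2),
    FailureMode("sensor drift masked by automation", occurrence=3, severity=8, detection=9),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.name}: RPN = {mode.rpn}")
```

In this sketch the two failure modes share the same occurrence and severity ratings, so their ranking is driven entirely by the detection rating. That is the factor the text identifies as hardest to determine, and the one that simulator studies and incident analyses of human-automation interaction could help to estimate.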

The results of such studies and analyses would not assess the strength of a purported lumberjack effect, but something more crucial to overall system safety analysis. In essence, the analysis provides the first steps towards a quantitative determination of the Contextualized Risk Impacts on Safety and Performance of Human-Automation Response Dynamics (“CRISP-HARD”). Such “CRISP-HARD” metrics not only provide important insights to the FMEA RPN determination of failure mode likelihood of detection, but also support processes of troubleshooting during real time EDIR/FDIR anomaly resolution.

Conclusion

The lumberjack effect debate between Skraaning and Jamieson, and Wickens and colleagues, regarding automation failure represents, in this author’s view, a potentially valuable consideration of HAI, but at an incorrect level of systems safety analysis. The specific forms of the automation components for power plants, aerospace missions, or other engineering systems are not to be studied or addressed separately from the lifecycle of task and function allocations of the broader functions and purposes of the systems of which the automation is a part. Our studies of when and how automation capabilities are limited in different performance contexts, and how these limitations are hidden from designers or operators, can provide important contributions to the overall assessment and improvement of system safety. However, these contributions require awareness of the abstraction hierarchy and context of the system and its failure modes, not an isolated discussion of the automation as a singular focus.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Barrett S. Caldwell

Barrett S. Caldwell, PhD is a Professor in Industrial Engineering (and Aeronautics & Astronautics) at Purdue. His PhD (Univ. of California, Davis, 1990) is in Social Psychology; his two BS degrees are from MIT (1985), representing an interdisciplinary integration of Aeronautics and Astronautics with Humanities. Prof. Caldwell’s research team, known as the Group Performance Environments Research (GROUPER) Laboratory, examines and improves how people get, share, and use information well. GROUPER research highlights human factors engineering approaches to information flow, task coordination, and team performance in settings from healthcare to spaceflight to STEM education. Prof. Caldwell has authored over 200 scientific publications and graduated 20 PhD and over 35 MS thesis students as advisor or co-advisor.

References

Buede, D. M. (2000). The engineering design of systems. John Wiley & Sons, Inc.

Caldwell, B. S. (2005a). Analysis and modeling of information flow and distributed expertise in space-related operations. Acta Astronautica, 56(9-12), 996–1004. https://doi.org/10.1016/j.actaastro.2005.01.027

Caldwell, B. S. (2005b). Multi-team dynamics and distributed expertise in mission operations. Aviation Space & Environmental Medicine, 76(6), B145–B153. https://pubmed.ncbi.nlm.nih.gov/15943207

Caldwell, B. S. (2008). Tools for developing a quality management program: Human factors and systems engineering tools. International Journal of Radiation Oncology, Biology, Physics, 71(1 Suppl), S191–S194. https://doi.org/10.1016/j.ijrobp.2007.06.083

Caldwell, B. S. (2014). Cognitive challenges to resilience dynamics in managing large-scale event response. Journal of Cognitive Engineering and Decision Making, 8(4), 318–329. https://doi.org/10.1177/1555343414546220

Caldwell, B. S. (2023). Space exploration and astronomy automation. In Springer handbook of automation (pp. 1139–1157). Springer.

Caldwell, B. S., Byrd, K. S., Onken, J. C., & Roberts, M. C. (2007). Flight controller information technology use for task coordination in mission operations. International Association for the Advancement of Space Safety.

Caldwell, B. S., & Onken, J. D. (2012). Simulation and human factors in modeling of spaceflight mission control teams. International Symposium on Resilient Control Systems.

Fiksel (2003). Designing resilient, sustainable systems. Environmental Science & Technology, 37(23), 5330–5339. https://doi.org/10.1021/es0344819

Fitts, P. M., & Posner, M. I. (1967). Human performance. Brooks/Cole.

Fitts, P. M., Viteles, M. S., Barr, N. L., Brimhall, D. R., Finch, Gardner, & Stevens, S. S. (1951). Human engineering for an effective air-navigation and traffic-control system, and appendixes 1 thru 3. Ohio State Univ Research Foundation Columbus.

Grossman & Salas (2011). The transfer of training: What really matters. International Journal of Training and Development, 15(2), 103–120. https://doi.org/10.1111/j.1468-2419.2011.00373.x

Hauptman, A. I., & McNeese, N. J. (2022). Overcoming the lumberjack effect through adaptive autonomy. In Proceedings of the human factors and ergonomics society annual meeting. Sage.

Lind (2003). Making sense of the abstraction hierarchy in the power plant domain. Cognition, Technology & Work, 5(2), 67–81. https://doi.org/10.1007/s10111-002-0109-4

Liu, McSorley, Blickensderfer, Vincenzi, D. A., & Macchiarella, N. D. (2008). Transfer of training. In Human factors in simulation and training (pp. 109–124). CRC Press.

Liu, H.-C., & Liu (2013). Risk evaluation approaches in failure mode and effects analysis: A literature review. Expert Systems with Applications, 40(2), 828–838. https://doi.org/10.1016/j.eswa.2012.08.010

Maier, M. W. (1998). Architecting principles for systems of systems. Systems Engineering, 1(4), 267–284. https://doi.org/10.1002/(sici)1520-6858(1998)1:4<267::aid-sys3>3.0.co;2-d

NASA. (2007). NASA systems engineering handbook (SP 2007-6105, Rev 1). NASA. Retrieved from https://www.nasa.gov/sites/default/files/atoms/files/nasa_systems_engineering_handbook.pdf

Onken, J. D., & Caldwell, B. S. (2009). Towards information coordination and reduced team size in space flight mission operations. Proceedings of the Human Factors and Ergonomics Society - Annual Meeting, 53(1), 101–105. https://doi.org/10.1177/154193120905300122

Onken, J. D., & Caldwell, B. S. (2011). Problem solving in expert teams: Functional models and task processes. In Proceedings of the human factors and ergonomics society 55th annual meeting -- 2011. Sage.

Rasmussen (1985). The role of hierarchical knowledge representation in decisionmaking and system management. IEEE Transactions on Systems, Man, and Cybernetics, 15(2), 234–243. https://doi.org/10.1109/TSMC.1985.6313353

Rasmussen (1988). Human factors in high-risk systems. In Conference Record for 1988 IEEE conference on human factors and power plants. IEEE.

Reason (1988). Framework models of human performance and error: A consumer guide. In Goodstein, L. P., Anderson, H. B., & Olsen, S. E. (Eds.), Tasks, errors, and mental models (pp. 35–49). Taylor & Francis.

Reason (1990). Human error. Cambridge University Press.

Reising, D. V. C. (2000). The abstraction hierarchy and its extension beyond process control. Proceedings of the Human Factors and Ergonomics Society - Annual Meeting, 44(1), 194–197. https://doi.org/10.1177/154193120004400152

Sarter (2008). Investigating mode errors on automated flight decks: Illustrating the problem-driven, cumulative, and interdisciplinary nature of human factors research. Human Factors, 50(3), 506–510. https://doi.org/10.1518/001872008X312233

Sarter, N. B., & Woods, D. D. (1995). How in the world did we ever get into that mode? Mode error and awareness in supervisory control. Human Factors: The Journal of the Human Factors and Ergonomics Society, 37(1), 5–19. https://doi.org/10.1518/001872095779049516

Sharma, K. D., & Srivastava (2018). Failure mode and effect analysis (FMEA) implementation: A literature review. J Adv Res Aeronaut Space Sci, 5(1-2), 1–17.

Sheridan, Verplank, & Brooks (1978). Human and computer control of undersea teleoperators. Man-Machines Systems Lab.

Sheridan, T. B. (1987). Supervisory control. In Salvendy (Ed.), Handbook of human factors (pp. 1243–1268). John Wiley and Sons.

Sheridan, T. B. (1992). Telerobotics, automation, and human supervisory control. MIT Press.

Sheridan, T. B. (2011). Adaptive automation, level of automation, allocation authority, supervisory control, and adaptive control: Distinctions and modes of adaptation. IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 41(4), 662–667. https://doi.org/10.1109/tsmca.2010.2093888

Sheridan, T. B. (2017). Musings on models and the genius of Jens Rasmussen. Applied Ergonomics, 59(Pt B), 598–601. https://doi.org/10.1016/j.apergo.2015.10.015

Skraaning & Jamieson, G. A. (2023). The failure to grasp automation failure. Journal of Cognitive Engineering and Decision Making, Article 15553434231189375. https://doi.org/10.1177/15553434231189375

Suranto (2015). Systems engineering; Why is it important? International conference on information technology and business applications. Palembang, Indonesia.

Swain, A. D. (1990). Human reliability analysis: Need, status, trends and limitations. Reliability Engineering & System Safety, 29(3), 301–313. https://doi.org/10.1016/0951-8320(90)90013-d

Thomadsen, B. R., Lin, S.-W., Laemmrich, Waller, Cheng, Caldwell, B. S., Rankin, & Stitt (2003). Analysis of treatment delivery errors in brachytherapy using formal risk analysis techniques. International Journal of Radiation Oncology, Biology, Physics, 57(5), 1492–1508. https://doi.org/10.1016/s0360-3016(03)01622-5

Vicente & Rasmussen (1992). Ecological interface design: Theoretical foundations. IEEE Transactions on Systems, Man, and Cybernetics, 22(4), 589–606. https://doi.org/10.1109/21.156574