The Structure of the Aviation System – Aviation as Hierarchical Decision-Making
Although I have proposed that a system comprises physical entities - people, organisations, technology - assembled to deliver a performance, in this section I want to map out how that system behaves. The system is a framework for making decisions and the behaviour of that system reflects the manner in which those decisions are enacted. Leveson’s hierarchy describes the different types of decision that are required for aviation to work. Because different actors reserve to themselves the right to make decisions in specific areas, the system is rendered tangible. For example, whereas an individual makes decisions about job-related demands, a company has the right to make decisions about who to employ and what aircraft to buy. I have identified four levels of decision-making in my model but we also need to recognise that they are all embedded in a broader environment. The environment comprises those factors that shape activity but are not, in return, shaped by that activity. Environmental factors include the climate and weather, economic cycles and geopolitics. The four levels are as follows:
Level 1: Production. This is the lowest level and represents the individual in the workspace. This is where skills, tools and processes come together to generate work. Fundamentally, Level 1 describes how an individual engages with the world in order to achieve a task-related goal. I refer to it as the level of production and decisions are typically those that result in individual goal-directed action.
Level 2: Collaboration. At this level, we are recognising that work is distributed across space and time but, more important, it also involves many players. Level 2 is concerned with all those actors collaborating as part of a task. They may be on board the aircraft or off-board. The off-board crew forms and disperses according to needs dictated by the phase of flight and could be considered a virtual team that supports the aircraft throughout its trajectory. Decisions at this level are typically those associated with the organisation and conduct of work, with agreeing solutions and the allocation of responsibilities though not necessarily the actual ‘doing’. Level 2 exerts control over Level 1 typically through social rules, formal job descriptions and group authority structures.
Level 3: Exploitation. At this level, we are concerned with both the profit-making and not-for-profit entities represented by airlines, maintenance organisations, ground handling companies, air navigation service providers, equipment manufacturers, etc. The role of the commercial entity is to generate returns on investment through the use of aircraft, crew and other resources. In the case of not-for-profit organisations, capability has to be provided within a budget. Level 3 is where decisions are made about the organisation of production. Control is exercised through commercial and employment contracts, codes of conduct, policies and procedures.
Level 4: Facilitation. The goal of decision-making at this level is to facilitate activity through the provision of tools of production, capital assets, regulations for the conduct of work, etc. Regulatory authorities define the manner in which companies can deliver services. Control over the lower levels is through granting approvals and permissions: we are concerned with constraint.
Of course, a ‘system’ is a disembodied entity: it does not ‘make’ decisions. Individuals make decisions. The key point is that, at each level, individuals make those decisions to achieve a goal that falls within the area of responsibility of that level. Remembering that this hierarchy simply describes a nested structure as proposed by Rasmussen and Svedung, applied to aviation, what is of more importance is how the system functions.
Behaviour within a System
Each level in the system is focussed on a different set of goals and, as we move up the hierarchy, decisions are further removed from the point of action. The configuration of the elements of the system shapes behaviour. Furthermore, systems are not static structures fixed in space and time: they are dynamic, fluid entities constructed to solve problems of production. As we have seen, the structural aspects of a system comprise the component parts, the tools and processes and the formal linkages between those parts. These structural aspects provide the framework to support operations and are important but the observed outcomes flow from individuals trying to achieve goals while working within the constraints of the system. For example, the lack of a set of scales to weigh passengers at Pelee Island Airport did not cause the aircraft to crash. The need to complete a weight and balance calculation as part of the aircraft dispatch process represents a solution to satisfy a decision made by the regulator about the safe conduct of operations. On the day, the pilot created an alternative solution to an operational problem. The ability of a system to cope with the inherent variability in normal operations reflects its level of resilience. Hollnagel has defined resilience as:
The intrinsic ability of a system to adjust its functioning prior to, during or following changes and disturbances so that it can sustain required operations under both expected and unexpected conditions.
(Hollnagel et al., 2011)
Resilience, then, is about the recognition of potential disturbances, developing adequate responses and restoring performance predictability. It is about retaining and regaining control of events. Predictability is an essential property of a system from an organisation’s perspective while unpredictable outcomes jeopardise viability. Change and disturbance need not be catastrophic to bring about a significant breakdown.
Action takes place in a bounded space in that activities in dynamic processes typically have a point, under normal circumstances, by which they must be completed. The concept of boundaries, though, is problematic. Woltjer (2008) comments that ‘no one knows where exactly the absolute boundaries of safe intervals lie’. Mendonca (2008) suggests that systems will have multiple boundaries and that measures of margin might include limits of performance (effort) and the borders between organisations (entry/exit points). Broadly speaking, a boundary reflects the freedom of action a system possesses. A boundary may be defined in absolute terms, such as the speed at which an aircraft will stall, or in relative terms, such as society’s willingness to tolerate certain types of behaviour. A boundary acts as a significant constraint on performance and represents the point at which operations are no longer viable. If we follow Mendonca’s argument, however, we can also identify interfaces in a system that are transactional boundaries between operational entities. Interfaces exist between collaborating individuals and between different work teams. For example, it is well understood in aviation that the flight deck door is a significant, if metaphorical, interface between the flight deck and the cabin crew. Boundaries are edge states that demand attention and a failure to act effectively in relation to a boundary or an interface will increase the risk of failure. The management of boundaries and interfaces requires effort.
Woods (2006) proposes two properties of system performance in relation to boundaries: margin and tolerance. Margin describes ‘how closely or how precarious the system is currently operating to a performance boundary’ while tolerance describes ‘how the system behaves near a boundary’. Margins, then, represent the zone approaching the point at which a system can transition from a desired to an undesired state because task demands overwhelm the control structures. Some authors talk about margins being ‘eaten up’ or experiencing ‘slow erosion’. I would argue that the margin is the point at which a need to intervene to achieve the desired goal can be identified. If we think of declarative knowledge as constraint sets, then the point at which the probability of violating a constraint is discernible marks the distal edge of the margin. Margins are zones in which discrepant signals emerge.
Tolerance has been interpreted as the nature of failure at a margin: is the system brittle or is it capable of graceful degradation (Woltjer, 2008)? I want to add a third state, which is that of inertia. Galbraith (1954), in his study of the Great Depression, talks of ‘the bezzle’. In all financial systems there is an element of corruption, the true extent of which is usually only revealed when the system comes under stress. In a financial crash, every penny is needed and only then do you realise that some have gone missing or have been embezzled. In better times, while graft may be suspected, there is little incentive to invest the additional effort needed for its eradication. An inert failure, then, is a mode in which the system still functions, albeit in a recognisable suboptimal state, but it usually takes an additional shock to bring about failure.
Mendonca proposes that tolerance is not simply the nature of performance but how that performance is achieved. He suggests that measurement will need to capture process-level descriptions of behaviour. For example, Neerincx (2003) describes a three-dimensional model of cognitive task load that includes information processing, task-set switching and percentage of time occupied by a task. If a system’s margin represents the time and space available in which to complete an actual task then tolerance, in this formulation, would be manifested in the ways in which operators manage task switching, the manner in which they track the time available and their ability to process and manipulate information to maintain control. I prefer to reserve tolerance for the characteristic of failure and, instead, describe performance in the margin in terms of the efficacy of the operator’s action in anticipating change, sustaining the system’s status and restoring stability.
What emerges from this discussion is that competence in normal operations must cope, primarily, with situations where signals might be tentative or obscure but small changes can have a significant negative impact. Behaviours associated with information seeking and signal interpretation, and with efficient load shedding and task modification, are needed to contain operations and prevent a potential breach of a boundary condition.
Before we apply this formulation of a system to some examples, I want to summarise its key features. A system comprises people and technology assembled to achieve a goal. Action flows from decisions about how to configure technology and organise work to achieve the desired goal. A goal will have an associated set of constraints that must be satisfied if the goal is to be successfully accomplished. Action takes place in a bounded volume of space and time and a failure to act effectively in relation to these boundaries will increase the risk of a degraded outcome or, even, failure. Boundaries have associated margins in which signals relating to the status of the task become apparent. Under normal circumstances, these signals simply relate to the approaching state change as we shift from one goal to the next. Occasionally, signals relate to a need to intervene to restore system status. The actions of agents involved in managing the system can be assessed in terms of their efficacy in relation to accomplishing the goal. Failure at the boundary can take one of three forms: graceful, brittle or inert. Having mapped out the key features, I want to go back to the previous discussion of the Pelee Island accident and look at the elements through the lens of a systems model.