Assessment of Performance

In this final chapter, I want to look at some of the issues associated with assessing competence. In the previous chapter, I looked at some examples of crew resource management (CRM) assessment frameworks in order to explore the idea of competence as implemented in these tools. I made the point that a competence model is not an assessment tool, but using these frameworks as a starting point raised the question of which aspects of behaviour could be captured. The approaches illustrated were all based on the idea that performance needs to be observed in the workplace and that the assessment tool is simply a framework for capturing significant samples of behaviour. Observable outputs are, of course, simply the result of hidden cognitive processes and, thus, a comprehensive appraisal of overall competence is probably not possible through workplace observation alone. We may need a different approach.

In this chapter, I look at the problem of assessment from the perspective of reliability: the data collected must stand up to scrutiny, otherwise the conclusions we draw cannot guarantee that crew are fit for purpose. I will look at the main sources of unreliability: the assessment tool, the method of grading, the assessors, the mechanics of the process and the context of assessment. I will also look at some of the statistical methods we can use to validate data collection tools. Before we get into the detail, there is a fundamental issue that needs to be addressed: what are the differing needs of the various stakeholders in the process?

Why Assess?

Ultimately, we assess performance because the regulations in force in our jurisdiction require us to. However, this simple demand creates quite a complex problem and the answers to the question ‘why assess?’ could shape both what we assess and what we use the data for. There are four key stakeholders in competence assessment. First, the individual being assessed wants to know about their progress in training or their performance against a standard, which is usually some organisationally-mandated benchmark of ‘acceptable’ proficiency. Second, the employer, the airline, wants to know if crews are fit for purpose. Can all crew be used for all possible tasks or are there any constraints on their use? Constraints represent an enforced underutilisation of an asset and, therefore, forgone returns. Third, the training system wants to know if the training inputs are achieving the goal of producing a competent workforce. Finally, the regulator wants to know if pilots are legal to fly. These four requirements are usually met by a single process: observation on a check ride or in a simulator event using an assessment scheme that is probably not optimised for any specific stakeholder. A failure to recognise that these differences exist in the first place will result in schemes that serve no real purpose other than to satisfy the very first requirement mentioned above: we do it because we have to.

Several terms are commonly used in assessment. Summative assessment refers to progress against an agreed standard. For example, an Instrument Rating is usually a profile flown to specified limits laid down by the regulator. Formative assessment refers to measurement that supports individual personal progress. We may want to establish the extent to which a trainee has mastered a particular aircraft system, for example. A diagnostic test might be set at the start of training to establish the extent of a trainee’s prior knowledge and, therefore, what additional training they need. Criterion-referenced assessment uses an external benchmark against which to compare a score. In the case of the Instrument Rating, it must be flown to standards laid down in regulations. Norm-referenced assessment looks at performance against a reference group. A prospective command candidate could be assessed against a group of recently upgraded captains. The implications that flow from this discussion are that, as well as establishing a benchmark of acceptable performance, we need to consider if the benchmark needs to accommodate differences in training and experience. For example, should new-hire first officers (FOs) be assessed against the same standard as senior captains? We also need to consider how these differences affect the feedback we give to candidates. Do we simply offer an absolute result (pass or fail) or is it better to frame feedback in developmental terms?
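The difference between criterion-referenced and norm-referenced assessment can be made concrete with a small sketch. What follows is only an illustration under stated assumptions: the 0 to 100 scoring scale, the pass mark of 70, the candidate's score and the scores of the reference group are all hypothetical, and the use of a z-score to express standing against the reference group is one possible choice, not a method prescribed by any regulation or assessment scheme.

```python
from statistics import mean, stdev


def criterion_referenced(score, pass_mark):
    """Compare a score with a fixed external benchmark (pass or fail)."""
    return score >= pass_mark


def norm_referenced(score, reference_scores):
    """Express a score as a z-score relative to a reference group.

    A negative value means the candidate scored below the group mean.
    """
    return (score - mean(reference_scores)) / stdev(reference_scores)


# Hypothetical values, chosen only for illustration.
candidate_score = 72
pass_mark = 70
recently_upgraded_captains = [70, 75, 80, 85, 90]

# The same score meets the fixed benchmark...
print(criterion_referenced(candidate_score, pass_mark))

# ...but sits about one standard deviation below the reference group.
print(round(norm_referenced(candidate_score, recently_upgraded_captains), 2))
```

With these made-up numbers, the same performance passes against the fixed standard yet falls below the average of the reference group, which is why the choice of benchmark shapes both the result and the feedback that can be offered.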

Competence assessment is a subjective process in that we are asking one person to express an opinion about another person. More accurately, in an aviation context, it is an act of discrimination. We are choosing between those who are acceptable and those who need additional attention. To make these discriminations, we must first gather evidence against a set of constructs or ‘markers’ and then we must assign a value to the performance.
