Assessing mental toughness—MTQ48

Peter Clough and Keith Earle

We regard establishing validity and reliability as an important, ongoing process in our work on mental toughness.

The various forms of testing we have carried out over the last ten years have provided good support for the MTQ48.

There are two technical aspects of test design that will be discussed in this chapter: reliability and validity.


Reliability is the foundation of the usefulness of a test. If the reliability of a test is too low it cannot be used to explain or predict behaviour, because the variation within the test simply swamps the variation between individuals. If a test had perfect reliability it would always give a person exactly the same score under the same conditions. This is never achieved. If the reliability were zero, test scores would simply be random. In reality we need a test that falls somewhere between these two extremes, and as close to the former as possible.

The most common approach to assessing test reliability is internal consistency. This looks at the way the individual items relate to each other. Items written to measure one aspect of mental toughness, for example challenge, should relate to the other items in that scale more strongly than they relate to items written to assess other aspects of mental toughness. Perhaps the “gold standard” for this type of analysis is Cronbach's alpha. A coefficient of 0.7 or above is widely accepted as the threshold for acceptable reliability.
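To make the calculation concrete, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items in a scale. The function name and the example ratings are hypothetical and purely illustrative; they are not MTQ48 data, and this is not the software used in the MTQ48 analyses.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: one list per respondent, each holding that person's
    scores on the k items of the scale (no missing values assumed).
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")

    # Sample variance of each item across respondents.
    item_columns = list(zip(*responses))
    item_variance_sum = sum(variance(column) for column in item_columns)

    # Sample variance of each respondent's total scale score.
    total_variance = variance([sum(row) for row in responses])

    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings (1-5) from five respondents on a four-item scale.
example_scale = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(example_scale)
print(f"alpha = {alpha:.2f}; meets 0.7 threshold: {alpha >= 0.7}")
```

When the items of a scale rise and fall together across respondents, the variance of the total scores is large relative to the summed item variances and alpha approaches 1; when the items are unrelated, alpha approaches 0.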

Table 1 shows the initial reliabilities of the scales produced when the test was first developed.

It can be seen that all subscales reached the minimum acceptable level. This supports the homogeneity of each subscale and of the MTQ48 as a whole.

Following on from this initial work, the overall internal consistency of the MTQ48 has repeatedly been found to be satisfactory in a number of published research articles (e.g., Kaiseler, Polman & Nicholls, 2009; Dewhurst, Anderson, Cotter, Crust & Clough, 2012). Although there is clearly some variation in the scale reliabilities in the published literature, especially in relation to the emotional control scale, taking them “in the round” it is clear that the test reaches satisfactory levels of reliability.

Table 1. Initial scale reliabilities of the Mental Toughness Questionnaire 48.

Data relating to test-retest reliability are also positive, but less common. Test-retest reliability looks at changes in the test scores. It is clear that mental toughness scores can change, but it is important that these changes are attributable to some form of intervention or identifiable action. Random fluctuations would reduce the predictive power of the questionnaire. In a study carried out with 108 Hull students, the test-retest reliability, as measured by Pearson's correlation coefficient, was high for all scales, with a range from 0.80 for challenge to 0.87 for emotional control. The sample was tested and retested at a six-week interval.
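As an illustration of how a test-retest coefficient of this kind is obtained, the sketch below computes Pearson's correlation between two administrations of the same scale. The function name and the scores are hypothetical and do not come from the study described above.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson's correlation between paired scores.

    first, second: scores for the same respondents, in the same order,
    at the first and second administration of the scale.
    """
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two equally long lists of at least two scores")

    mean_first = sum(first) / n
    mean_second = sum(second) / n

    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_first) * (b - mean_second)
                for a, b in zip(first, second))
    # Square roots of the summed squared deviations.
    spread_first = sqrt(sum((a - mean_first) ** 2 for a in first))
    spread_second = sqrt(sum((b - mean_second) ** 2 for b in second))

    return cross / (spread_first * spread_second)


# Hypothetical scale scores for six respondents, six weeks apart.
time_1 = [3.2, 4.1, 2.8, 3.9, 4.5, 3.0]
time_2 = [3.4, 4.0, 2.9, 3.7, 4.6, 3.3]
print(f"test-retest r = {pearson_r(time_1, time_2):.2f}")
```

A coefficient close to 1 indicates that respondents kept much the same rank order across the two occasions. Note that the correlation is insensitive to a uniform shift in scores, so it reflects stability of relative standing rather than the absence of any change.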


Basically, validity relates to the usefulness of a test: does it measure what it claims to measure? There are many ways of assessing this, but four of the main forms of validity will be discussed in this chapter.

Face validity

Face validity relates to the “feel” of the questionnaire. Simply put: does it look right? A test of mental toughness should appear to be a test of mental toughness. The MTQ48 was carefully designed to have appropriate items, a relatively low reading age and a simple rating system. The MTQ48 is acceptable to a wide range of people and there are seldom any issues about its applicability. End users find it appropriate and can understand the items. Very young children find it difficult, but we have found no issues using it with secondary school students and older respondents. The questionnaire has been translated into many languages and few problems have been identified.

