Research on score differences related to national culture and measurement
Much more research is needed to explore the magnitude of score differences between national culture groups, and between immigrant and non-immigrant groups, on a wider range of commonly used predictors. The work to date has focused primarily on differences in measures of cognitive ability and personality, and even this research has been limited to a handful of countries. Despite calls to increase our understanding of the role of culture in score differences (e.g., Helms-Lorenz et al., 2003), little research in the area of personnel assessment has responded. Future research and theoretical work are needed to explore the cultural factors responsible for these differences. Brouwers and van de Vijver (2015) provide one example of a direction this research could take. They argue that personnel assessment will not advance unless a contextualized view of assessing individual differences is taken. This is consistent with van de Vijver’s (1997) conclusion that locally developed (i.e., contextualized) assessments tend to show smaller score differences than the typical Western-style (i.e., decontextualized) assessment. Brouwers and van de Vijver propose an assessment strategy that takes a more contextualized approach, one that explicitly allows culture into the process of measuring individual differences.
The cross-cultural and cross-national research reviewed here points to the need to understand the degree to which tests and assessments contain culturally specific or non-domain-relevant content, particularly linguistic content. Including such content considerably raises the likelihood of admitting a source of contamination into the measurement process. In the domain of personnel assessment, this is particularly important given that tests may be used on a global scale and with culturally diverse populations within a country. In these cases, a lack of familiarity with content that is unrelated to the construct of interest can create substantial problems for accurately assessing individuals and interpreting test scores. Although best practice recommendations cover many aspects of test content when applied to the development or adaptation of tests for use globally (e.g., Byrne et al., 2009; Ryan & Tippins, 2009), these same recommendations are much less frequently applied when tests are developed for domestic use with globally diverse populations.
Another area for future research is the role that measurement plays in the observed score differences. This research has many possible avenues. One is to explore further how test content that requires previously acquired knowledge unrelated to the test domain affects test performance. Although the initial work offers some insight (Fagan & Holland, 2002, 2007), more research is clearly needed. Another avenue is to examine further ways of creating test items that minimize cultural and non-domain-relevant content. One promising direction is research using items containing non-entrenched tasks (e.g., Sternberg, 1981a, 1981b, 1982a; Sternberg & Gastel, 1989; Tetewsky & Sternberg, 1986). Non-entrenched tasks are those that require solving problems with novel or atypical stimuli or concepts. The core feature of non-entrenched items is that they do not represent the natural state of problems or stimuli in everyday life (Sternberg, 1982b). For example, Sternberg (1981a) described a number of non-entrenched tasks, including one in which individuals use a set of rules presented at the start of the task to determine the physical state of an object (e.g., liquid or solid) and the object’s fictional name (e.g., plin, kwef) as it moves from north to south or south to north on the fictional planet Kryon. Although the initial work suggests that the use of these items can reduce score differences (e.g., Sternberg, 2006), more work is needed to understand how these items work and whether there are boundary conditions on their effectiveness.
More generally, research is needed to articulate the knowledge structures, cognitive processes, and cognitive strategies that are required to solve a test problem, as well as to understand how these processes make items more or less difficult for test-takers (Embretson, 1983). On a related note, theoretical work has also been devoted to understanding how stimulus features of items contribute to item difficulty and affect item performance (e.g., Irvine & Kyllonen, 2002; Lievens & Sackett, 2007).