
Response Times as an Indicator of Data Quality: Associations with Question, Interviewer, and Respondent Characteristics in a Health Survey of Diverse Respondents


Response time (RT), the time elapsing from the beginning of question reading for a given question until the start of the next question, is a potentially important indicator of data quality that can be reliably measured for all questions in a computer-administered survey using a latent timer (i.e., one triggered automatically by moving on to the next question).* In interviewer-administered surveys, RTs index data quality by capturing the entire length of time spent on a question-answer sequence, including interviewer question-asking behaviors and respondent question-answering behaviors. Consequently, longer RTs may indicate longer processing or interaction on the part of the interviewer, respondent, or both.
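The definition above can be made concrete with a minimal sketch. It assumes the survey software logs one onset timestamp (in seconds) per question plus an end-of-interview timestamp; the function name, the log layout, and the example values are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def response_times(
    onsets: List[Tuple[str, float]], interview_end: float
) -> Dict[str, float]:
    """Compute RTs from latent-timer onsets.

    onsets: (question_id, onset_seconds) pairs in administration order.
    interview_end: timestamp closing the last question (assumed to be logged).
    RT for a question = onset of the next question - onset of this question.
    """
    rts: Dict[str, float] = {}
    for i, (qid, start) in enumerate(onsets):
        # The last question has no successor, so it is closed by interview_end.
        next_start = onsets[i + 1][1] if i + 1 < len(onsets) else interview_end
        rts[qid] = next_start - start
    return rts


# Hypothetical log: three questions, interview ends at 31.0 seconds.
example_log = [("Q1", 0.0), ("Q2", 12.5), ("Q3", 20.0)]
print(response_times(example_log, interview_end=31.0))
# {'Q1': 12.5, 'Q2': 7.5, 'Q3': 11.0}
```

Because each RT spans the whole question-answer sequence, this calculation cannot separate interviewer reading time from respondent answering time.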

* RTs are distinct from response latencies (RLs). RLs measure time from the end of question reading to the respondent's answer. RLs have been shown to be associated with, for example, response accuracy (Draisma and Dijkstra 2004) and task difficulty (Garbarski, Schaeffer, and Dykema 2011).

RTs are an indirect measure of data quality; they do not directly measure reliability or validity, and we do not directly observe what factors lengthen the administration time. In addition, RTs that are either too long or too short could signal a problem (Ehlen, Schober, and Conrad 2007). However, studies that link components of RTs (interviewers' question reading and response latencies) to interviewer and respondent behaviors that index data quality strengthen the claim that RTs indicate data quality (Bergmann and Bristle 2019; Draisma and Dijkstra 2004; Olson, Smyth, and Kirchner 2019). In general, researchers tend to consider longer RTs as signaling processing problems for the interviewer, respondent, or both (Couper and Kreuter 2013; Olson and Smyth 2015; Yan and Olson 2013; Yan and Tourangeau 2008).

Previous work demonstrates that RTs are associated with various characteristics of interviewers (where applicable), questions, and respondents in web, telephone, and face-to-face interviews (e.g., Couper and Kreuter 2013; Olson and Smyth 2015; Yan and Tourangeau 2008). We replicate and extend this research by examining how RTs are associated with various question characteristics and several established tools for evaluating questions. We also examine whether increased interviewer experience in the study shortens RTs for questions with characteristics that impact the complexity of the interviewer's task (i.e., interviewer instructions and parenthetical phrases). We examine these relationships in the context of a sample of racially diverse respondents who answered questions about participation in medical research and their health.

Response Times and Question Characteristics

Questions vary in many ways, including their structural features (e.g., number of words or clauses), difficulty (e.g., readability level), response format (e.g., yes/no, ordinal rating scale, open response), topic, and content (Dykema, et al. 2019). RTs have been shown to be related to several question characteristics, including question type (e.g., events and behaviors vs. evaluations), question length, response format, inclusion of instructions, presence of ambiguous terms, and use of fully vs. partially labeled response categories (e.g., Couper and Kreuter 2013; Olson and Smyth 2015; Yan and Tourangeau 2008). Studies of RTs and question characteristics are largely based on observational approaches (see review in Dykema, et al. 2019) in which researchers make use of a survey conducted for another purpose, code specific characteristics of the questions in the survey, and examine the association of those characteristics with RTs. The characteristics examined vary across studies as a function of the types of questions available in the questionnaire and researcher interests. Replication across surveys, topics, and populations is critically important, given that many question characteristics are study-specific and collinear (Schaeffer and Dykema forthcoming).

In this chapter, we examine the association between RTs and question characteristics available in our own observational study. Table 18.1 provides the list of question characteristics and hypotheses. We base our hypotheses on relationships demonstrated in previous research and expectations about whether the characteristic is likely to increase the cognitive processing burden of the respondent, interviewer, or both. Some hypotheses are evident; others require explication. See Online Appendix 18A for background and justification regarding H1a-H1l. We formulate hypotheses under the assumption that other question characteristics are held constant.

TABLE 18.1

Hypotheses about the Effect of Question Characteristics on Response Times

| Hypothesis | Question Characteristic | Effect on RTs |
|------------|-------------------------|---------------|
|            | Number of words | |
|            | Question order | |
|            | Question type | Demographics < events/behaviors < subjective |
|            | Question form | Yes/no < unipolar ordinal, bipolar ordinal, nominal, discrete value |
|            | Definition in the question | |
|            | List-item question | |
|            | Sensitive question | |
|            | Race-related question | |
|            | Battery structure | First in battery > later; first in series > later |
|            | Emphasis in the question | |
|            | Interviewer instructions | |
|            | Parenthetical phrases | |
| H2a, 3a    | Flesch-Kincaid grade level | |
| H2b, 3b    | QUAID problem score | |
| H2c, 3c    | QAS problem score | |
| H2d, 3d    | SQP quality score | |
|            | Interaction of number of interviews by interviewer instructions | |
|            | Interaction of number of interviews by parenthetical phrases | |

Notes: H1a-H1l and H3a-H3d are net of the effects of other question characteristics; H2a-H2d are for bivariate relationships.

In addition to the individual or "ad hoc" question characteristics described above, we also examine the association of several established question evaluation tools with RT, including the Flesch-Kincaid grade level, the Question Understanding Aid (QUAID; Graesser, et al. 2006), the Question Appraisal System (QAS; Willis 2005; Willis and Lessler 1999), and the Survey Quality Predictor (SQP; Saris and Gallhofer 2007) (see Online Appendix 18B). Each tool identifies multiple question characteristics that may be problematic for respondents or interviewers, and the tools can be used to code questions and characteristics from any type of survey. Although the tools differ in their implementation and scope, they can be used to produce a question-level "problem" or "quality" score that indicates the complexity of the question. We expect that more complex questions (as indicated by scores from the established tools) are associated with longer RTs because they are harder for interviewers to read and harder for respondents to answer (Table 18.1, H2a to H2d; H2d is negative because a higher SQP quality score indicates less complexity). Consistent with expectations, Olson and Smyth (2015) reported that questions with higher reading levels (harder to read) took longer to administer. We are not aware of studies that examine the relationship between the other tools and RTs. (Yan and Tourangeau [2008] examined the relationship between individual question characteristics and QUAID, but they did not include QUAID as a predictor of RT.)
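Of the tools named above, the Flesch-Kincaid grade level is the only one defined by a simple closed-form formula: 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below applies that standard formula to a question's text. The syllable counter is a crude vowel-group heuristic chosen for illustration, and the function names are hypothetical; the software the authors actually used to score questions is not described in this section, so scores from this sketch need not match theirs.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: number of vowel groups, at least one.

    This is an illustrative heuristic, not a dictionary-based count.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level of a question's wording."""
    # Sentences are split on terminal punctuation; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


# Hypothetical question wordings, not items from the study's questionnaire.
short_q = "Do you smoke?"
long_q = (
    "During the past twelve months, how frequently did you participate "
    "in institutional medical research investigations?"
)
print(round(flesch_kincaid_grade(short_q), 2))
print(round(flesch_kincaid_grade(long_q), 2))
```

Longer sentences and longer words both raise the score, which is why the grade level partly overlaps with individual characteristics such as number of words.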

Coding individual question characteristics and generating scores using the established tools is time-consuming and can be costly. Thus, whether the individual characteristics and scores from established tools each independently account for variance in RTs or are duplicative of each other is of interest. We evaluate this by examining whether scores from the established tools predict RTs net of individual question characteristics (H3a to H3d in Table 18.1): although some aspects of the characteristics that are coded to produce these scores overlap with individual question characteristics (e.g., question length), they also incorporate features beyond the individual characteristics with potential implications for RTs.

Response Times and Interviewers’ Experience

An important interviewer characteristic to consider in predicting RTs is the interviewer's level of experience. Interviewers appear to increase their pace within an interview (as they gain experience with an individual respondent), within a study (as they gain experience with the particular questionnaire), and across studies (as they become more experienced in general). Their faster speed may arise because they develop shortcuts (e.g., alter questions or decrease standardized practices), become more fluent, head off problems, and so forth (Bergmann and Bristle 2019; Böhme and Stöhr 2014; Holbrook, et al. Chapter 17; Kirchner and Olson 2017; Olson and Peytchev 2007; Olson and Smyth Chapter 20).

In this chapter, we are primarily concerned with within-study experience (i.e., the number of interviews interviewers have conducted). Previous research indicates that the time to complete an entire interview (the aggregate of RTs) and interviewer reading times decrease with the number of interviews completed for a given study (Bergmann and Bristle 2019; Kirchner and Olson 2017; Loosveldt and Beullens 2013; Olson and Peytchev 2007), particularly for inexperienced interviewers (Olson and Peytchev 2007), and accounting for changes in the types of respondents interviewers encounter over the course of the field period (Kirchner and Olson 2017).

We propose that interviewer experience interacts with question characteristics that primarily impact interviewers' task complexity (Olson and Smyth 2015) in predicting RTs because these are the characteristics for which interviewers have the most discretion. In this study, these question characteristics include interviewer instructions and parenthetical phrases. As they complete interviews and become more familiar and comfortable with the questionnaire, we expect interviewers will decrease their attention to and reading of interviewer instructions and be less likely to incorporate discretionary parenthetical phrases. Thus, with increasing interviewer experience (more interviews completed), RTs will decline more rapidly for questions with instructions or parenthetical phrases than without them (H4a and 4b; Table 18.1).
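One way to read H4a as a statement about a regression coefficient is sketched below. This is an illustrative simplification under stated assumptions: the model form, the symbols, and the omission of any multilevel structure are all hypothetical and are not taken from this section, which does not specify the model.

```latex
% RT_{qij}: response time for question q in interview i by interviewer j
% N_{ij}:   number of interviews interviewer j has completed at interview i
% I_q:      1 if question q carries an interviewer instruction, 0 otherwise
\[
\mathrm{RT}_{qij} = \beta_0 + \beta_1 N_{ij} + \beta_2 I_q
  + \beta_3 \,(N_{ij} \times I_q) + \dots + \varepsilon_{qij}
\]
% Change in RT per additional interview completed:
%   questions without instructions: \beta_1
%   questions with instructions:    \beta_1 + \beta_3
% H4a corresponds to \beta_3 < 0.
```

Under this reading, H4b has the same form, with the indicator marking questions that contain parenthetical phrases.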
