
Data and Methods

Data

This study uses data from Wave 3 of the Understanding Society Innovation Panel (IP), a panel used for methodological research to inform the design of the Understanding Society household panel study in the UK (Jäckle et al. 2017). The IP uses a multi-stage probability sample (for more information on the sample design, see Jäckle et al. 2017). For Wave 3, 1,526 eligible households were identified and 1,027 household interviews were completed, a response rate of 67% (AAPOR RR1). All eligible adults (age 16+) in the household were selected to complete an individual, face-to-face, computer-assisted personal interview (CAPI) in which interviewers were instructed to read all questions verbatim. Conditional on the household response rate, the individual response rate was 82%, for a total of 1,621 completed interviews. Average interview length was 37.5 minutes. Selected sections[1] of the interview were audio-recorded with the permission of the respondent (72% consent rate). However, due to procedural and technical difficulties, only 820 interview recordings were available for analysis. The instrument was programmed in Blaise, and the timing file contained timestamps for all interviews.

Behavior coding was used to determine which questions were read verbatim and which were misread. These data serve as the "gold standard" against which the accuracy of the QATT deviation detection methods is tested. To select a subset of the recorded files for behavior coding, two interviews were randomly selected from each of the 80 interviewers. In a few cases, the selected interviews were missing recordings at the section level, leaving only a few recorded questions in the interview. When this happened, an additional interview was randomly selected from the same interviewer to ensure that each interviewer had at least 50 questions that could be coded.[2] This procedure yielded 168 interviews selected for behavior coding. Within the selected interviews, 402 questions were selected for analysis based on the following criteria:

  • Was intended to be read out loud
  • Did not contain "fills"
  • Was administered to both males and females
  • Had one-to-one matching with timing-file questions (i.e., did not loop)
  • Had the same response options for all regions

Due to question routing, not all questions were administered to all respondents. The total sample size for coding and analysis is 10,386 question administrations.
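The two-per-interviewer selection with a top-up rule described above can be sketched as follows. This is a hypothetical reconstruction of the procedure, not the study's actual code; all function and variable names are illustrative.

```python
import random

def select_interviews(interviews_by_interviewer, codeable_counts,
                      min_questions=50, seed=0):
    """Draw two recorded interviews per interviewer, then top up with
    additional random draws until at least `min_questions` codeable
    questions are covered (or the interviewer's pool is exhausted).
    A sketch under assumed rules, not the study's implementation."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    selected = {}
    for interviewer, interviews in interviews_by_interviewer.items():
        pool = list(interviews)
        rng.shuffle(pool)
        picked = pool[:2]
        # Top-up rule: add interviews while question coverage falls short.
        while (sum(codeable_counts[i] for i in picked) < min_questions
               and len(picked) < len(pool)):
            picked.append(pool[len(picked)])
        selected[interviewer] = picked
    return selected

# Toy usage: any two of this interviewer's interviews cover fewer than
# 50 codeable questions, so the top-up rule pulls in the third.
counts = {"a": 10, "b": 15, "c": 20}
chosen = select_interviews({"int01": ["a", "b", "c"]}, counts)
print(sorted(chosen["int01"]))  # ['a', 'b', 'c']
```

The top-up loop mirrors the text's rule of drawing an additional interview from the same interviewer whenever the initial draws leave too few codeable questions.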

The behavior coding was done directly from the audio files (no transcription) by a single coder, building on Cannell, Lawson, and Hausser's (1975) behavior coding scheme. The interviewer's first reading of each question was coded as (a) read verbatim, (b) containing only minor deviations, or (c) containing at least one major deviation. Explicit rules were created to evaluate whether a deviation was minor or major (see Appendix 19A); the primary distinction is the assumption that minor deviations most likely do not change the meaning of the question, whereas major deviations likely do. Coding results show that 34.5% of questions had minor deviations and 13.0% had major deviations.

Dependent Variable and Variables for QATT Detection Methods

Because minor deviations most likely do not change the meaning of the question, the focus in this chapter is on how to best detect major deviations. Thus, the dependent variable, derived from the behavior codes, is coded as 0 = Verbatim/Minor Deviation and 1 = Major Deviation.

Next, variables for each of the QATT detection methods were created. First, the WPS point-estimate thresholds were calculated at 2, 3, and 4 WPS by dividing the total number of words in the question text (not including optional text) by 2, 3, and 4, respectively. Any question duration faster than (i.e., below) the point estimate was flagged as a possible deviation in which words may have been omitted. Next, the WPS range thresholds (2-3 WPS; 2-4 WPS; 1-3 WPS; 1-4 WPS) were calculated using the same procedure. For example, the lower and upper thresholds for a ten-word question at 2-3 WPS would be 3.33 seconds (10 divided by 3) and 5 seconds (10 divided by 2), respectively. Question durations below the lower threshold or above the upper threshold were flagged as possible major deviations. Lastly, the QATT thresholds based on standard deviations were calculated. The mean and standard deviation of the duration for each question were calculated across all interviewers. Four different sets of thresholds were then set at ±0.5, ±1.0, ±1.5, and ±2.0 standard deviations from the mean question duration, and durations outside these thresholds were flagged as possible deviations.
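The three threshold rules above can be sketched as simple functions. This is an illustrative sketch, not the study's code; the helper names are my own, and note that a faster speaking rate (more WPS) implies a shorter duration, so the lower duration bound uses the higher WPS value.

```python
from statistics import mean, stdev

def wps_point_threshold(n_words, wps):
    """Duration (seconds) below which a reading is flagged as a
    possible deviation with omitted words."""
    return n_words / wps

def wps_range_thresholds(n_words, slow_wps, fast_wps):
    """Lower and upper duration bounds for a WPS range: the faster
    rate gives the shorter duration, hence the lower bound."""
    return n_words / fast_wps, n_words / slow_wps

def sd_thresholds(durations, k):
    """Mean ± k standard deviations of one question's durations;
    durations outside these bounds are flagged."""
    m, s = mean(durations), stdev(durations)
    return m - k * s, m + k * s

# The worked example from the text: a ten-word question at 2-3 WPS.
lower, upper = wps_range_thresholds(10, slow_wps=2, fast_wps=3)
print(round(lower, 2), upper)  # 3.33 5.0
```

The same flagging logic applies to all three rules: compare a question administration's observed duration from the timing file against the computed bounds.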

  • [1] A series of experiments (question wording, branching, and show card) were carried out in IP Wave 3, and the corresponding questionnaire sections were recorded (Jäckle et al. 2017).
  • [2] This data set is used in multiple studies, including examinations of question characteristics and interviewer effects. To increase analytic power, a minimum of 50 questions per interviewer was established.