Interviewers are integral to the success of face-to-face surveys because they administer the questions during the interview. During the question-and-answer process, interviewers may reduce bias by clarifying concepts and probing responses effectively (Conrad and Schober 2000; Schober and Conrad 1997). Conversely, interviewers can inflate the variability of estimates through differences in how they ask questions and probe inadequate responses (Kish 1962). The interviewer effect, or the increase in the variance of a sample statistic that is attributable to interviewers, is one way to measure this influence on survey responses. This added variance reduces the ability to detect true differences between groups of respondents. In this chapter, we used multilevel models to estimate interviewer effects on several NHIS outcomes. We also investigated the extent to which interviewer effects vary by question and interviewer characteristics.
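The interviewer's share of response variance is commonly summarized as an intra-interviewer correlation (IIC). As a minimal illustration of the quantity being estimated (not the chapter's actual models), the sketch below recovers that share from simulated, balanced data using a one-way ANOVA variance decomposition, a simpler stand-in for the multilevel models used in the analysis; all names and the simulation settings are invented for the example.

```python
import random
from statistics import mean

def interviewer_iic(groups):
    """Estimate the intra-interviewer correlation from a balanced one-way
    layout: groups[i] holds the responses collected by interviewer i.
    Uses the classical ANOVA (method-of-moments) variance decomposition."""
    k = len(groups)                   # number of interviewers
    n = len(groups[0])                # respondents per interviewer (balanced)
    grand = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    # Between- and within-interviewer mean squares
    msb = n * sum((m - grand) ** 2 for m in group_means) / (k - 1)
    msw = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g) / (k * (n - 1))
    var_b = max((msb - msw) / n, 0.0)  # interviewer variance component
    return var_b / (var_b + msw)       # share of variance due to interviewers

# Simulate 50 interviewers x 20 respondents; true IIC is 0.09/1.09, about 0.08
random.seed(42)
groups = []
for _ in range(50):
    u = random.gauss(0, 0.3)           # interviewer random effect (sd 0.3)
    groups.append([u + random.gauss(0, 1) for _ in range(20)])
iic = interviewer_iic(groups)
print(round(iic, 3))
```

Even an IIC of this modest size multiplies the variance of an estimate by roughly 1 + (n - 1) x IIC when each interviewer works n cases, which is why small interviewer effects matter for precision.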
Question characteristics were shown to be associated with interviewer effects. Questions on complex topics, long questions, and questions that are relatively difficult to read (as measured by the Flesch reading ease score; Flesch 1948) were associated with larger interviewer effects than questions on less complex topics, shorter questions, and questions that are relatively easy to read. Questions with these characteristics are likely to be more difficult for interviewers to administer, leading to more requests for clarification from survey respondents as well as more interviewer intervention to obtain adequate answers. In other words, difficult questions consistently introduce interviewer effects (West and Blom 2017).
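The reading ease score referred to above is computed from a question's sentence, word, and syllable counts. The sketch below implements Flesch's (1948) formula with a rough vowel-group heuristic for syllable counting; the two question texts are invented for illustration and are not NHIS items.

```python
import re

def flesch_reading_ease(text):
    """Flesch (1948) reading ease: 206.835 - 1.015*(words/sentences)
    - 84.6*(syllables/words). Higher scores mean easier text.
    Syllables are approximated by counting vowel groups per word."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(max(len(re.findall(r"[aeiouy]+", w.lower())), 1)
                    for w in words)
    n = len(words)
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)

short_q = "Do you smoke?"
long_q = ("During the past twelve months, approximately how frequently did "
          "you experience difficulty obtaining necessary prescription "
          "medication?")
# The short, plainly worded question scores as much easier to read
print(flesch_reading_ease(short_q) > flesch_reading_ease(long_q))  # True
```

Production readability tools use dictionary-based syllable counts, so scores from this heuristic will differ slightly from published values, but the ordering of easy versus hard questions is typically preserved.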
Interviewer characteristics are also associated with interviewer effects. Interviewers who conduct the interview at a faster pace tend to have larger interviewer effects than those who conduct it at a slower pace. This finding suggests that having interviewers slow down when reading questions may benefit data quality. We also found that interviewers with the highest cooperation rates tend to have the lowest interviewer effects. In one of the few prior studies to address the relationship between interviewer cooperation rates and interviewer variance, Brunton-Smith, Sturgis, and Williams (2012) identified a curvilinear relationship, whereby interviewers with the lowest and highest cooperation rates produced the largest interviewer effects. While more research in this area is necessary, our results suggest that high-performing interviewers can master the tailoring and improvisational skills needed to counter respondent reluctance and gain cooperation at the doorstep, while still adhering to a standardized interviewing protocol once the interview starts.
This research is not without limitations. In order to study interviewer effects in an ongoing survey like the NHIS, we used multilevel models to isolate the effect of the interviewer from the area in which the interviews are conducted. Although this is a commonly used approach, given the cost of implementing interpenetrated samples, it is possible that we have not adequately adjusted for respondent and area effects, leading to a misallocation of some of the variance in our models. In addition, we reported simple bivariate results for the relationships of question and interviewer characteristics with interviewer effects. The reported findings may differ if the simultaneous effects of question and interviewer characteristics on interviewer effects are explored. (For a more thorough description of statistical models that would enable such analyses, see Loosveldt and Wuyts, Chapter 22 in this volume). Finally, we were limited to interviewer measures that could be constructed from the available survey data. While the findings of past research are somewhat mixed (see West and Blom 2017), observable (e.g., age, sex, and race) and unobservable (e.g., attitudes about interviewing) characteristics of interviewers may also play an important role in explaining the effects of interviewers on NHIS estimates.
In conclusion, we found features of questions and characteristics and behaviors of interviewers to be significantly associated with interviewer effects on NHIS estimates. In addition to understanding the relative influence of these characteristics in a multivariable context, future research could leverage other methods to help isolate the underlying causes of the interviewer effects we observed. For example, behavior coding of computer-assisted recorded interviews (CARI) could clarify the interviewer or respondent behaviors, as well as the question features, that lead to interviewer effects. In addition, future work could focus on how best to use results from the multilevel models described herein to identify the interviewers with the largest impact on IICs (hence, as a tool for monitoring interviewer performance). Following a method described by Kreuter (2002), for example, IICs would be estimated for a subset of questions for the entire set of interviewers. Next, the IICs would be re-estimated after dropping all of one interviewer's cases from the data set. If the IIC for a question drops significantly when an interviewer's cases are removed, that interviewer has a substantial impact on the IIC and the interviewer effect for that item. The interviewer in question would be flagged for follow-up and possible retraining. Another approach would be to identify interviewers with extreme predicted values of the random effects (EBLUPs) during a given period of data collection and follow up with those interviewers to understand whether they are struggling with particular questions.
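A leave-one-interviewer-out check of this kind can be sketched in a few lines. This is a toy version under simplifying assumptions: balanced workloads, a one-way ANOVA estimate of the IIC rather than the chapter's multilevel models, and an arbitrary flagging threshold; the function names and the simulated data (which include one deliberately deviant interviewer) are invented for illustration.

```python
import random
from statistics import mean

def anova_iic(groups):
    """ANOVA estimate of the intra-interviewer correlation for a balanced
    layout (each inner list holds one interviewer's responses)."""
    k, n = len(groups), len(groups[0])
    grand = mean(x for g in groups for x in g)
    means = [mean(g) for g in groups]
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    msw = sum((x - m) ** 2
              for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    var_b = max((msb - msw) / n, 0.0)
    return var_b / (var_b + msw)

def flag_influential(groups, threshold=0.5):
    """Leave-one-interviewer-out check in the spirit of Kreuter (2002):
    flag interviewers whose removal cuts the item's IIC by more than
    `threshold` as a fraction of the full-sample IIC (0.5 is arbitrary)."""
    full = anova_iic(groups)
    flagged = []
    for i in range(len(groups)):
        reduced = anova_iic(groups[:i] + groups[i + 1:])
        if full > 0 and (full - reduced) / full > threshold:
            flagged.append(i)
    return full, flagged

# Simulated item: 30 well-behaved interviewers plus one outlier (index 30)
random.seed(7)
groups = [[random.gauss(0, 1) for _ in range(15)] for _ in range(30)]
groups.append([random.gauss(3, 1) for _ in range(15)])  # deviant interviewer
full_iic, flagged = flag_influential(groups)
print(flagged)  # the deviant interviewer should drive the item's IIC
```

In practice the recomputation would use the full multilevel model, and flagged interviewers would be reviewed (e.g., via CARI recordings) before any retraining decision, since a large influence statistic alone does not reveal the behavioral cause.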
Bell, B. A., G. B. Morgan, J. A. Schoenberger, J. D. Kromrey, and J. M. Ferron. 2014. How low can you go? An investigation of the influence of sample size and model complexity on point and interval estimates in two-level linear models. Methodology 10(1):1-11.
Beretvas, S. N. 2010. Cross-classified and multiple-membership models. In: Handbook of Advanced Multilevel Analysis, ed. J. J. Hox, and J. K. Roberts, 313-334. New York: Routledge.
Brunton-Smith, I., P. Sturgis, and J. Williams. 2012. Is success in obtaining contact and cooperation correlated with the magnitude of interviewer variance? Public Opinion Quarterly 76(2):265-286.
Conrad, F. G., and M. F. Schober. 2000. Clarifying question meaning in a household telephone survey. Public Opinion Quarterly 64(1):1-28.
Dahlhamer, J. M., A. Maitland, H. Ridolfo, A. Allen, and D. Brooks. 2019. Exploring the associations between question characteristics, respondent characteristics, interviewer characteristics and survey data quality. In: Advances in Questionnaire Design, Development, Evaluation and Testing, ed. P. Beatty, D. Collins, L. Kaye, J. L. Padilla, G. Willis, and A. Wilmot, 153-192. New York: Wiley.
Davis, R. E., M. P. Couper, N. K. Janz, C. H. Caldwell, and K. Resnicow. 2010. Interviewer effects in public health surveys. Health Education Research 25(1):14-26.
Flesch, R. 1948. A new readability yardstick. The Journal of Applied Psychology 32(3):221-233.
Groves, R. M., and M. P. Couper. 1998. Nonresponse in Household Interview Surveys. New York: Wiley.
Hox, J. J., E. D. de Leeuw, and I. G. G. Kreft. 1991. The effect of interviewer and respondent characteristics on the quality of survey data: A multilevel model. In: Measurement Errors in Surveys, ed. P. Biemer, R. M. Groves, L. E. Lyberg, N. A. Mathiowetz, and S. Sudman, 439-461. New York: Wiley.
Kish, L. 1962. Studies of interviewer variance for attitudinal variables. Journal of the American Statistical Association 57(297):92-115.
Koons, D. A. 1973. An Experimental Comparison of Telephone and Personal Health Interview Surveys. Vital and Health Statistics, Series 2, No. 54. Hyattsville: National Center for Health Statistics.
Kreuter, F. 2002. Kriminalitätsfurcht: Messung und methodische Probleme [Fear of crime: Measurement and methodological problems]. Berlin: Leske and Budrich.
Maas, C. J. M., and J. J. Hox. 2005. Sufficient sample size for multilevel modeling. Methodology 1(3):86-92.
Mahalanobis, P. C. 1946. Recent experiments in statistical sampling in the Indian Statistical Institute. Journal of the Royal Statistical Society 109:325-378.
Mangione, T. W., F. J. Fowler, Jr., and T. A. Louis. 1992. Question characteristics and interviewer effects. Journal of Official Statistics 8(3):293-307.
National Center for Health Statistics. 2018. 2017 National Health Interview Survey (NHIS) Public Use Data Release: Survey Description. Hyattsville: National Center for Health Statistics.
Olson, K., and A. Peytchev. 2007. Effect of interviewer experience on interviewer pace and interviewer attitudes. Public Opinion Quarterly 71(2):273-286.
Pickery, J., and G. Loosveldt. 2001. An exploration of question characteristics that mediate interviewer effects on item nonresponse. Journal of Official Statistics 17(3):337-350.
Schaeffer, N. C., J. Dykema, and D. W. Maynard. 2010. Interviewers and interviewing. In: Handbook of Survey Research, 2nd Edition, ed. P. V. Marsden, and J. D. Wright, 437-470. Bingley: Emerald Group Publishing Limited.
Schnell, R., and F. Kreuter. 2005. Separating interviewer and sampling-point effects. Journal of Official Statistics 21(3):389-410.
Schober, M. F., and F. G. Conrad. 1997. Does conversational interviewing reduce survey measurement error? Public Opinion Quarterly 61(4):576-602.
Snijders, T. 2005. Power and sample size in multilevel linear models. In: Encyclopedia of Statistics in Behavioral Science, volume 3, ed. B. S. Everitt and D. C. Howell, 1570-1573. Hoboken: John Wiley & Sons.
Snijders, T., and R. Bosker. 1999. Multilevel Analysis. London: Sage.
West, B. T., and A. Blom. 2017. Explaining interviewer effects: A research synthesis. Journal of Survey Statistics and Methodology 5:175-211.