Do Interviewer Behaviors Predict Response Quality?

To what extent are these differences in coded behaviors connected with the improved data quality (less rounding, less straight-lining, more disclosure) in texting observed in this data set? The modes in which respondents disclosed more (text vs. voice, and automated vs. interviewer-administered interviews) and rounded less (text vs. voice) differ systematically in the various interviewer behaviors listed in Tables 13.2 and 13.3, and it is plausible that these differences are part of the bundle of differing features that led to the data quality differences. But texting and voice, of course, differ systematically on multiple fronts: in synchrony, in the medium (written vs. spoken), in the persistence and reviewability of the messages exchanged, in the potential impact of nearby others, and in how easy it is to multitask during the interaction (see Schober, et al. 2015, Table 1). Any or all of those features in combination might cause the observed patterns of precise responding and disclosure.

Using our first analytic strategy, among the behaviors we see as particularly demonstrating interviewers' humanness, fillers, repairs, and laughter only ever occurred in Human Voice interviews. Our mode differences in disclosure showed two independent effects: increased disclosure in text (vs. voice) interviews, and in automated (vs. human) interviews. Fillers, repairs, and laughter are, then, plausible (or at least not ruled out as potential) interviewer behaviors that correlate with reduced disclosure. For rounding, however, we only observed reduced rounding in text (vs. voice), and no effect of automation. Given this pattern, interviewer fillers, repairs, and laughter are unlikely to be connected with the greater precision in text vs. voice.

Following this same logic, the interviewer behaviors that were especially more frequent in Human Voice interviews relative to all the other modes (see Online Appendix 13C) also are plausible candidates for correlating with reduced disclosure - e.g., asking a question with a wording change or paraphrase, restating the response alternatives, presenting a neutral probe, continuing a previous move after a change of speakers, narrowing the response options, and providing nonstandard commentary - but not with the rounding effects. Interviewer behaviors that were particularly frequent only in voice interviews but not in text interviews (e.g., acknowledgments like "got it") can potentially contribute to the findings on rounding, which only found a voice vs. text difference. The findings in Online Appendix 13C can therefore be seen as providing a profile of which interviewer behaviors are potential contributors to the data quality findings.

Using our second analytic strategy, which focuses on the interviewer-administered modes where these "human touch" behaviors occurred, we carried out regression analyses to examine whether laughter, commentary, and disfluencies (fillers, repairs, and pauses judged as notable by our coders) were linked with data quality (disclosure and rounding), as well as with respondent satisfaction with the interview. Focusing first on disclosure, the first two columns of data in Table 13.4 show that when voice interviewers produced more fillers for the 13 questions we used to measure disclosure,[1] [2] respondents were significantly more likely to disclose more sensitive information.

Table 13.4

| Predictor | Disclosure: Voice | Disclosure: Text | Rounding: Voice | Rounding: Text | Satisfaction: Voice | Satisfaction: Text |
|---|---|---|---|---|---|---|
| Intercept | 2.778 (0.348), <.001 | 3.234 (0.181), <.001 | 3.956 (0.212), <.001 | 3.203 (0.138), <.001 | 2.717 (0.101), <.0001 | 2.499 (0.056), <.0001 |
| Commentary | -0.098 (0.147), .507 | 0.045 (0.185), .808 | 0.332 (0.202), .103 | -0.144 (0.163), .378 | -0.062 (0.042), .150 | 0.027 (0.057), .634 |
| Laughter | -0.151 (0.307), .623 | – | -0.437 (0.189), .022 | – | 0.029 (0.089), .748 | – |
| Repairs | -0.072 (0.146), .620 | – | 0.190 (0.143), .186 | – | -0.070 (0.042), .098 | – |
| Fillers | 0.296 (0.127), .022 | – | -0.066 (0.113), .560 | – | 0.013 (0.037), .723 | – |
| Pauses | 0.011 (0.074), .886 | – | 0.046 (0.068), .501 | – | -0.004 (0.021), .857 | – |
| Estimated variance of random interviewer effect | 0.191 | NAᵃ | NAᵃ | NAᵃ | 0.016 | NAᵃ |
| N interviewers | 8 | 8 | 8 | 8 | 8 | 8 |
| N respondents | 148 | 156 | 148 | 156 | 148 | 156 |

Each cell shows the estimate, its standard error in parentheses, and the p-value. Dashes indicate behaviors (laughter, repairs, fillers, pauses) that did not occur in Human Text interviews and so could not be included as predictors in those models (see note [2]).

ᵃ NA: random intercept was omitted because its estimated variance was less than zero when included in the model. Interviewer behaviors are included in the models as continuous variables. (When the behaviors are included in the model as binary predictors corresponding to whether a behavior ever occurred, the pattern of results is similar, with two exceptions as noted in the text.)
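To make the model structure in Table 13.4 concrete, here is a minimal sketch of a regression of this general form: a data-quality outcome predicted from interviewer-behavior counts, with a random intercept for interviewer (dropped when its estimated variance is negative). The column names (disclosure, commentary, laughter, repairs, fillers, pauses, interviewer_id) and the use of Python's statsmodels are illustrative assumptions, not the authors' actual variable names or software.

```python
# Illustrative sketch only: hypothetical column names, not the study's own code.
import pandas as pd
import statsmodels.formula.api as smf

def fit_behavior_model(df: pd.DataFrame, outcome: str):
    """Regress a data-quality outcome on interviewer-behavior counts,
    with a random intercept for interviewer (as in Table 13.4)."""
    formula = f"{outcome} ~ commentary + laughter + repairs + fillers + pauses"
    model = smf.mixedlm(formula, data=df, groups=df["interviewer_id"])
    return model.fit()

# Example usage with a hypothetical per-respondent data frame `voice_df`:
# result = fit_behavior_model(voice_df, "disclosure")
# print(result.summary())  # estimates, SEs, and p-values like those in Table 13.4
```

For the Human Text models, the behavior predictors that never occurred in that mode would simply be left out of the formula, and the random intercept dropped where its estimated variance was negative.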

We see this finding as suggestive rather than definitive, in that further analyses show a less clear story: (a) the estimated effect of fillers on disclosure in voice interviews is no longer statistically significant if we treat the predictor as binary rather than continuous (coefficient = 0.304, p = .364); (b) when all interviewer disfluencies (repairs, fillers, pauses) are combined into one variable (not shown in Table 13.4), this combined variable shows no significant effect on disclosure (coefficient = 0.065, p = .207); and (c) if we log-transform our measure of disclosure to address the right-skew in the data, the estimated effect of fillers on disclosure in voice interviews is no longer statistically significant (p = .165). On the other hand, when fillers are added to the model as a binary variable (low: 0-2 fillers vs. high: 3 or more fillers), the estimated effect of three or more fillers is positive and significant (1.389, p = .016). So we (only very cautiously) interpret these findings as consistent with the possibility that one kind of evidence of interviewer fallibility - producing more ums and uhs - may increase willingness to disclose embarrassing information.
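The re-specifications in (a)-(c), and the low/high filler split, amount to straightforward recodings of the predictor and outcome. A rough sketch of how such recodings might be constructed, again with hypothetical column names (the chapter does not specify, for example, which log transform was used):

```python
# Illustrative recodings for the robustness checks described above;
# column names are assumptions, not the study's actual variables.
import numpy as np
import pandas as pd

def add_alternative_codings(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # (a) fillers as a binary predictor: did any fillers occur at all?
    out["fillers_any"] = (out["fillers"] > 0).astype(int)
    # low (0-2 fillers) vs. high (3 or more fillers), as in the final check
    out["fillers_high"] = (out["fillers"] >= 3).astype(int)
    # (b) one combined disfluency count: repairs + fillers + pauses
    out["disfluencies"] = out[["repairs", "fillers", "pauses"]].sum(axis=1)
    # (c) log-transform the right-skewed disclosure measure
    #     (log1p is one plausible choice; the exact transform is an assumption)
    out["log_disclosure"] = np.log1p(out["disclosure"])
    return out
```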

What about precise responding? As the middle data columns in Table 13.4 show, based on respondents' answers and interviewer behaviors in the eight question-answer sequences in voice interviews that we used to measure rounding, respondents provided more answers that were precise (fewer rounded answers) when interviewers laughed more. We see no evidence that interviewers' disfluencies affected precise responding, either individually or as a combined measure of repairs, fillers, and pauses (0.045, p = .367), nor that commentary affected precise responding. While one might think that an interviewer's laughter, as an indicator of informality, might license respondents to answer more casually and less thoughtfully, that is clearly not the case here; the evidence is instead consistent with an account that an interviewer's laughter may correlate with a feeling of reduced time pressure, perhaps giving respondents more time to retrieve instances and formulate their answers.

And what about respondent satisfaction? Perhaps linguistic evidence of interviewers' humanness makes respondents feel more comfortable, whether or not they disclose more or try harder to produce precise answers. As the final two data columns in Table 13.4 show, after voice interviews in which interviewers produced more repairs (across all 32 question-answer sequences in the entire interview), respondents reported marginally less satisfaction with the interview in the post-interview online debriefing questionnaire (1 = very dissatisfied/somewhat dissatisfied; 2 = somewhat satisfied; 3 = very satisfied). This marginal effect becomes statistically significant if we treat the predictor as binary (presence or absence of repairs) (estimate -0.262, p = .010). Although we don't know why repairs (rather than fillers or a combined disfluency rate) should be particularly predictive, this finding suggests either that interviewer disfluency leads to respondent dissatisfaction - perhaps disfluent interviewers seem unprepared or unprofessional - or (conversely) that speech goes less smoothly in interviews that respondents are enjoying less. Analyses of telephone interview invitations (Conrad, et al. 2013) show that interviewers who are too perfectly fluent can sound "robotic" and off-putting, but also that too much disfluency can reduce agreement to participate. The evidence on satisfaction here is consistent with that finding.

While text interviews didn't provide the same robust opportunities for us to observe effects of interviewer humanizing behaviors on respondent data quality, we did have access to one aspect of the interactional dynamic in text interviews: the speed of the back-and-forth. Some text interviews were close to synchronous, with more rapid-fire turn exchanges, while others featured longer delays between turns. (In most cases we attribute these speed differences to respondent speed; the automated text interviewing system always produced the next turn immediately, and interviewers were generally speedy in responding.) Although we didn't observe any effects of text interview speed on respondents' disclosure or their satisfaction with the interview, we do see a consistent effect on rounding (reported in Schober, et al. 2015): respondents gave significantly fewer precise (more rounded) answers in faster interviews (interviews with shorter inter-turn intervals than the median of 15.75 seconds), and more precise answers in slower interviews. The direction of causality is of course unclear, but the link between interview speed and precise responding is clear: respondents who take longer to respond (perhaps because they are thinking harder about accurate answers or looking up records) give more precise answers.
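As a concrete illustration of the speed measure, the sketch below classifies a text interview as faster or slower than the reported median inter-turn interval of 15.75 seconds. The per-interview statistic (here, the mean gap between consecutive messages) and the timestamp column are assumptions made for illustration; the chapter does not spell out the exact computation.

```python
# Illustrative only: assumes one row per message with a datetime 'timestamp' column.
import pandas as pd

def classify_interview_speed(turns: pd.DataFrame, median_gap_s: float = 15.75) -> str:
    """Label a text interview 'faster' or 'slower' by comparing its average
    gap between consecutive messages to the sample median (15.75 s above)."""
    gaps = turns["timestamp"].sort_values().diff().dt.total_seconds().dropna()
    return "faster" if gaps.mean() < median_gap_s else "slower"
```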

Taken together, these additional analyses of the effects of interviewer behavior and texting speed lead us to a preliminary account. Part of what promotes precise responding in text (vs. voice) is reduced time pressure to respond (evidenced by less rounding in slower text interviews); within voice interviews, a more relaxed, less time-pressured atmosphere also seems to promote precision (evidenced by greater precision when interviewers laughed more). Disclosure increases significantly in the modes that reduce interviewer social presence (text vs. voice, and automated vs. interviewer-administered), but within the interviewer-administered telephone interviews disclosure may be promoted by increased evidence of interviewer fallibility or humanness (more fillers). In contrast, increases in another kind of evidence of interviewer fallibility - repaired speech - corresponded with decreased respondent satisfaction with voice interviews.

  • [1] Focusing the analyses only on interviewer behaviors in these 13 question-answer sequences runs the risk of missing potential effects of these behaviors that accumulate from prior or intervening question-answer sequences, but provides what we see as the cleanest test.
  • [2] Note that four interviewer behaviors - laughter, repairs, fillers, and pauses - did not occur in human text interviews, and so can't be predictors in any of the models of Human Text interviews.
 