With random sampling, every member of a population has an equal chance of being invited. Yet no sample is perfect. Not everyone drawn into a sample can be reached, and not everyone who is reached agrees to participate. What sorts of bias are introduced? How can we know?
Unlike some other deliberative microcosms, the procedure in Deliberative Polling is to conduct the initial interview on first contact and only then issue the invitation. This seemingly obvious and straightforward strategy yields attitudinal as well as demographic data comparing participants with non-participants. Our experience has been that there are usually very few statistically significant differences between participants and non-participants, and that when such differences occur they are usually substantively small. This is partly because, unlike some other projects, we pay incentives for participation and employ every means possible to attract those initially drawn into the sample.
Consider the very first Deliberative Poll in Britain in 1994 on the issue of criminal justice policy. There were 869 completed initial interviews, representing a response rate of 74 per cent. Of these, 301 attended the event, permitting attitudinal and demographic comparisons of the 301 participants with the 568 non-participants. We asked 102 questions, both demographic and attitudinal, in the initial questionnaire, and only fourteen showed statistically significant differences between participants and non-participants. Furthermore, most of the differences, even when statistically significant, were substantively small. While the participants were slightly more knowledgeable than the non-participants (between 7 and 11 per cent more likely to know the right answer on a battery of knowledge questions), we could truly say that we had gathered all of Britain to one room (Luskin et al. 2002).
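The participant/non-participant comparison described above amounts to a series of two-group significance tests, one per questionnaire item. As a minimal sketch of one such test, here is a two-proportion z-test in Python; the answer counts are hypothetical, invented purely for illustration (only the group sizes of 301 and 568 come from the text, and this is not the actual 1994 data):

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """z statistic for the difference between two independent proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)  # pooled proportion under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical counts for one knowledge question (NOT the real 1994 data):
# 180 of 301 participants vs 295 of 568 non-participants answered correctly.
z = two_proportion_z(180, 301, 295, 568)
significant = abs(z) > 1.96  # conventional 5 per cent two-sided threshold
```

Note that with 102 such tests, a handful of "significant" differences would be expected by chance alone even if participants and non-participants were drawn from identical populations, which is why the substantive size of the differences matters as much as their significance.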
To see the difference, consider some other efforts to apply something like random sampling to public consultation. The attraction of random sampling is that it can provide a basis for establishing a microcosm of the entire community. But everything depends on how it is done. An example of a project that appears to employ random sampling but actually fails to provide any credible basis for its evaluation is the attempt by a group called ‘America Speaks’ to substitute ‘random sampling’ for its normal recruitment process of sheer self-selection (combined with a demographic screen selecting only some of the people who volunteer themselves). In the case of its project on health care in Maine, it sent out 75,000 postcards to randomly chosen residents in order to recruit a forum of a few hundred. Recipients were asked to indicate interest in attending a deliberative forum about health care by sending in a response card with their demographic characteristics. Only 2,700 returned the cards, and after some demographic screens were applied, 300 participants came on the day. Setting aside the fact that this ‘sample’ was supplemented by others recruited by stakeholder groups to make up for low numbers of young people and minorities, the design gives no confidence in any claims to representativeness. There is no data comparing the attitudes of the 2,700 who volunteered themselves with the 75,000, and no data comparing the 300 with the 2,700 or the 75,000. It is important to note that, unlike Deliberative Polls, ‘America Speaks’ does not compensate participants for their time and effort. They simply have to be sufficiently motivated about the issue to want to spend a whole day discussing it. Since most people are not motivated to spend much time and effort pondering policy, it seems obvious that the 300 or so who volunteered themselves from an initial list of 75,000 would not offer a credible microcosm of the views of the entire public.4
The lesson here is that the mere invocation of ‘random sampling’ is not enough to ensure representativeness. Everything depends on how it is done, what data is collected at what point, and what incentives or other motivations are employed to try to attract - and enable - those initially drawn into the sample to show up. When the response rate is minuscule and there are no incentives, an initial effort at random sampling can easily transform into virtually pure self-selection.
To a lesser degree, some of the same difficulties applied to the now famous Citizens’ Assembly in British Columbia. A stratified random sample of 23,034 was invited by letter. 1,715 responded saying they were interested. After some demographic criteria were applied, 1,441 of these were invited to come to ‘selection meetings’. 964 did so, and 158 of these were selected randomly. The problem is that we have no way of evaluating how the 1,715 who selected themselves compared to the initial pool of 23,034 (Citizens’ Assembly on Electoral Reform 2004). How much more interested or knowledgeable about politics and public affairs were they, and how much more skewed to one political viewpoint or another? Similarly, we do not know anything about how the representativeness of the microcosm was affected by the other stages of selection. It is a demanding commitment to volunteer to give up nearly a year of one’s life. How did those who put themselves forward for this opportunity compare to those who did not - or, in other words, how do they compare to the rest of the population for whom they are supposed to be a random microcosm?
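To make the compounding attrition concrete, the stage counts reported above can be chained into cumulative retention rates. A minimal sketch (the counts come from the text; the stage labels and variable names are my own):

```python
# Recruitment stages of the BC Citizens' Assembly, as reported in the text.
# Each later stage is a subset of the stage before it.
stages = [
    ("invited by letter", 23034),
    ("responded with interest", 1715),
    ("invited to selection meetings", 1441),
    ("attended a selection meeting", 964),
    ("randomly selected as members", 158),
]

invited = stages[0][1]
# Fraction of the original random sample surviving to each stage.
retention = {label: count / invited for label, count in stages}

# The critical, unmeasured self-selection step: under 8 per cent of those
# invited expressed interest, and under 1 per cent ended up as members.
interest_rate = retention["responded with interest"]
final_rate = retention["randomly selected as members"]
```

The arithmetic underlines the point in the text: the randomness of the final 158-from-964 draw cannot repair whatever biases entered at the first, roughly 7 per cent, self-selected response step, precisely because no data were collected on how responders differed from non-responders.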