# The method of recycled predictions

The results in this chapter are presented in terms of the probabilities computed from equations (2.1) and (2.2), using the method of “recycled predictions” described in Long and Freese (2014, chapter 4) and in a Stata manual.8 Since this method underpins the results presented in this chapter, it is useful, at the very outset, to describe it in some detail. The variables $Y_i$ in equation (2.1) and $Z_i$ in equation (2.2) are defined over persons distinguished by different characteristics: social group, social status, educational attainment, and so on.

Suppose that one of these characteristics is social group and that persons are identified, inter alia, by whether they belong to a “dominant” or a “subordinate” group. The object is to identify the probabilities of having a particular condition that can be entirely ascribed to group membership and, furthermore, to test whether these differ significantly between those in the dominant and subordinate groups. The method of “recycled predictions” enables one to do this.

Suppose that the first variable relates to a person’s group, so that $X_{i1} = 1$ if person $i$ is from the dominant group and $X_{i1} = 2$ if he or she is from a subordinate group. For ease of exposition, assume that the respondents are ordered so that the first $M$ respondents are from the dominant group: $X_{i1} = 1$ for $i = 1, \ldots, M$ and $X_{i1} = 2$ for $i = M + 1, \ldots, N$. Now, using the logit estimates from equation (2.1), one can predict for each person his or her probability of being happy. This probability is denoted $p_i$ ($i = 1, \ldots, N$).

The mean of the $p_i$, defined over all the $N$ persons in the estimation sample, will be the same as the (estimation) sample proportion of persons who said they were happy (i.e., persons for whom $Y_i = 1$). Similarly, the means of the $p_i$ defined over the $M$ dominant, and the $N - M$ subordinate, group persons will be the same as the (estimation) sample proportions of persons from these two groups who said they were happy. In other words, the estimated logit equation passes through the sample means.9
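This property follows from the first-order conditions of the logit likelihood. The sketch below assumes that equation (2.1), which is not reproduced in this section, contains an intercept and the group variable $X_{i1}$ among its regressors; the labels $\beta_0$ and $\beta_1$ for their coefficients are introduced here for illustration only. At the maximum likelihood estimates,

$$
\frac{\partial \ln L}{\partial \beta_0} = \sum_{i=1}^{N} (Y_i - p_i) = 0,
\qquad
\frac{\partial \ln L}{\partial \beta_1} = \sum_{i=1}^{N} X_{i1}\,(Y_i - p_i) = 0 .
$$

The first condition says that the mean of the $p_i$ equals the sample proportion with $Y_i = 1$. Since $X_{i1} = 1$ for $i \le M$ and $X_{i1} = 2$ for $i > M$, subtracting the first condition from the second gives

$$
\sum_{i=M+1}^{N} (Y_i - p_i) = 0
\quad\Longrightarrow\quad
\sum_{i=1}^{M} (Y_i - p_i) = 0 ,
$$

so the same equality holds within each group.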

However, the difference between the two sample means, dominant ($\bar{p}_D$) and subordinate ($\bar{p}_S$), does not reflect the differences, due solely to group membership, between persons from the two groups in their probabilities of being happy. This is because persons from the two groups differ not just in terms of group identity but also with respect to variables like social class and education, among others. Computing the mean probabilities over each subgroup will not neutralise these differences. Hence, the difference between $\bar{p}_D$ and $\bar{p}_S$ cannot be attributed solely to differences in group membership, although, of course, some part of it may be.

The method of “recycled predictions” isolates the group effect on the predicted probability of dominant and subordinate group persons being happy. First, “pretend” that all $N$ persons in the estimation sample are from the dominant group. Holding the values of the other variables constant (either at their observed sample values, as in this chapter, or at their mean values), compute the average probability of being happy under this assumption and denote it $p_D$. Next, “pretend” that all $N$ persons in the estimation sample are from the subordinate group and, again holding the values of the other variables constant, compute the average probability of being happy under this assumption and denote it $p_S$.

Since the values of the non-group variables are unchanged between these two hypothetical scenarios, the only difference between them is that, in the first scenario, the dominant group coefficient is “switched on” (with the subordinate group coefficient “switched off”), while, in the other scenario, the subordinate group coefficient is “switched on” (with the dominant group coefficient “switched off”), for all the $N$ persons in the estimation sample.10 Consequently, the difference between $p_D$ and $p_S$ is entirely due to differences in group membership.
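The two scenarios can be sketched in a few lines of Python. This is a minimal illustration and not the chapter's estimation: the coefficients in `COEF`, the single covariate `educ`, and the six-person `SAMPLE` are all hypothetical values made up for the example, and the logit is assumed to have one "subordinate" indicator with the dominant group as the reference category.

```python
import math

# Hypothetical logit coefficients (NOT estimates from the chapter).
COEF = {"const": -0.5, "subordinate": -0.8, "educ": 0.15}

# Hypothetical sample: (group, years of education); 1 = dominant, 2 = subordinate.
SAMPLE = [(1, 14), (1, 12), (1, 16), (2, 8), (2, 10), (2, 6)]


def predict(group, educ):
    """Predicted probability of being happy for one person."""
    subordinate = 1.0 if group == 2 else 0.0
    index = COEF["const"] + COEF["subordinate"] * subordinate + COEF["educ"] * educ
    return 1.0 / (1.0 + math.exp(-index))  # logistic function


def subgroup_mean(sample, group):
    """Mean predicted probability over persons actually in `group`."""
    probs = [predict(g, educ) for g, educ in sample if g == group]
    return sum(probs) / len(probs)


def recycled_prediction(sample, group):
    """Mean predicted probability when all persons are assigned to `group`,
    with every other covariate left at its observed value."""
    return sum(predict(group, educ) for _, educ in sample) / len(sample)


# Raw gap: mixes the group effect with differences in education.
raw_gap = subgroup_mean(SAMPLE, 1) - subgroup_mean(SAMPLE, 2)

# Recycled predictions: only group membership changes between scenarios.
p_D = recycled_prediction(SAMPLE, 1)
p_S = recycled_prediction(SAMPLE, 2)

print(f"raw gap between subgroup means: {raw_gap:.3f}")
print(f"recycled predictions: p_D = {p_D:.3f}, p_S = {p_S:.3f}, gap = {p_D - p_S:.3f}")
```

In this made-up sample the dominant group also has more education, so the raw gap between subgroup means exceeds the gap between the recycled predictions; the latter reflects the group coefficient alone.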

Similarly, using the multinomial logit estimates from equation (2.2), one can predict for each person his or her probability of low/moderate/high life satisfaction: $q_i^L$, $q_i^M$, and $q_i^H$. Again, using the two hypothetical scenarios (all persons from, respectively, the dominant and subordinate groups), one can construct the average probabilities of low/moderate/high satisfaction under these two scenarios and denote them $q_D^L$, $q_D^M$, and $q_D^H$ for the dominant group scenario and $q_S^L$, $q_S^M$, and $q_S^H$ for the subordinate group scenario. Then the differences in the predicted probabilities of low/moderate/high satisfaction between the dominant and subordinate groups ($q_D^L$ and $q_S^L$, $q_D^M$ and $q_S^M$, and $q_D^H$ and $q_S^H$) can be entirely ascribed to group membership, since the only thing that was changed between each pair of probabilities was group membership.
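The same averaging carries over to the three-outcome case. The sketch below is again only illustrative: the coefficient values in `COEF`, the covariate `educ`, and `SAMPLE` are hypothetical, and "low" is assumed to be the base outcome of the multinomial logit (all of its coefficients set to zero).

```python
import math

# Hypothetical multinomial logit coefficients (NOT estimates from the chapter).
# "low" is taken as the base outcome, so its coefficients are zero.
COEF = {
    "low": {"const": 0.0, "subordinate": 0.0, "educ": 0.0},
    "moderate": {"const": 0.2, "subordinate": -0.3, "educ": 0.05},
    "high": {"const": -1.0, "subordinate": -0.9, "educ": 0.12},
}

# Hypothetical sample: (group, years of education); 1 = dominant, 2 = subordinate.
SAMPLE = [(1, 14), (1, 12), (1, 16), (2, 8), (2, 10), (2, 6)]


def predict(group, educ):
    """Predicted probabilities of low/moderate/high satisfaction for one person."""
    subordinate = 1.0 if group == 2 else 0.0
    scores = {
        outcome: math.exp(b["const"] + b["subordinate"] * subordinate + b["educ"] * educ)
        for outcome, b in COEF.items()
    }
    total = sum(scores.values())
    return {outcome: score / total for outcome, score in scores.items()}


def recycled_predictions(sample, group):
    """Average outcome probabilities when all persons are assigned to `group`,
    with every other covariate left at its observed value."""
    totals = {outcome: 0.0 for outcome in COEF}
    for _, educ in sample:
        for outcome, prob in predict(group, educ).items():
            totals[outcome] += prob
    return {outcome: total / len(sample) for outcome, total in totals.items()}


q_D = recycled_predictions(SAMPLE, 1)  # "all dominant" scenario
q_S = recycled_predictions(SAMPLE, 2)  # "all subordinate" scenario

for outcome in COEF:
    print(f"{outcome}: q_D = {q_D[outcome]:.3f}, q_S = {q_S[outcome]:.3f}")
```

Within each scenario the three average probabilities sum to one, so the three differences between the scenarios sum to zero.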

In essence, therefore, in evaluating the effect of two characteristics A and B on the likelihood of a particular outcome, the method of “recycled predictions” compares two sets of average probabilities: first, under an “all have the characteristic A” scenario and, then, under an “all have the characteristic B” scenario, with the values of the other variables remaining unchanged between the scenarios. The difference between the two probabilities is then entirely due to the attributes represented by A and B (in this case, differences between dominant and subordinate group membership). These probabilities, respectively $p_A$ and $p_B$, are referred to in this chapter as the predicted probabilities (PPs) of an event under A and B. So, for example, in the earlier exposition, $p_D$ and $p_S$ refer to the PPs of persons from the dominant and subordinate groups being happy, while $q_D^L$, $q_D^M$, and $q_D^H$ refer to the PPs of persons from the dominant group, and $q_S^L$, $q_S^M$, and $q_S^H$ to the PPs of persons from the subordinate group, having low/moderate/high levels of satisfaction.
