Empirical Analysis
Data
The data used in the present chapter is the same as that in Chorus and Bierlaire (2013). The data collection effort focused on route choice behaviour among commuters who travel from home to work by car. A total of 550 people were sampled from an internet panel maintained by IntoMart in April 2011. Sampled individuals were at least 18 years old, owned a car, and were employed. The sample was representative of Dutch commuters in terms of gender, age, and education level. Of the 550 sampled individuals, 390 filled out the survey (implying a response rate of 71%).
Respondents to the survey were asked to imagine the hypothetical situation where they were planning a new commute from home to work (either because they had recently moved, or because their employer had recently moved, or because they had started a new job). They were asked to choose between three different routes that differed in terms of the following four attributes, with three levels each: average door-to-door travel time (45, 60, 75 minutes), percentage of travel time spent in traffic jams (10%, 25%, 40%), travel time variability (±5, ±15, ±25 minutes) and total costs (€5.5, €9, €12.5).
Using the Ngene software package, a so-called 'optimal orthogonal in the differences' design of choice sets was created to ensure a statistically efficient data collection. This design resulted in nine choice tasks per respondent and 3510 choice observations in total. Table 2.1 shows one of these tasks.
Table 2.1: An example route choice task.

| | Route A | Route B | Route C |
|---|---|---|---|
| Average travel time (minutes) | 45 | 60 | 75 |
| Percentage of travel time in congestion (%) | 10% | 25% | 40% |
| Travel time variability (minutes) | ±5 | ±15 | ±25 |
| Travel costs (Euros) | €12.5 | €9 | €5.5 |
| YOUR CHOICE | □ | □ | □ |
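The sample and design figures quoted above can be checked with a few lines of arithmetic. The sketch below only enumerates the attribute levels and recomputes the response rate and the number of observations; it does not reproduce the Ngene design itself, and the dictionary keys are illustrative names, not taken from the survey.

```python
from itertools import product

# Attribute levels as listed in the survey description (key names are illustrative).
levels = {
    "travel_time_min": (45, 60, 75),
    "congestion_pct": (10, 25, 40),
    "variability_min": (5, 15, 25),
    "cost_eur": (5.5, 9.0, 12.5),
}

# Every possible single-route profile: 3 levels for each of 4 attributes.
profiles = list(product(*levels.values()))

sampled, completed, tasks_per_respondent = 550, 390, 9
response_rate = completed / sampled
observations = completed * tasks_per_respondent

print(len(profiles))            # number of distinct route profiles
print(f"{response_rate:.1%}")   # share of sampled individuals who responded
print(observations)             # respondents times choice tasks
```

The full factorial of route profiles is much larger than what nine tasks of three routes can cover, which is why an efficient design is used to select the choice sets.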
Results
A large number of different models were estimated for this study, gradually increasing the number of classes within the latent class structure discussed in Section 2.2, with each class making use of a GRRM model, where, in varying classes, a priori structures were either imposed (e.g. pure RUM, pure RRM) or the γ parameters were estimated. All models were coded and estimated in Ox (Doornik, 2001), making use of multiple runs with different starting values to reduce the risk of inferior local maxima.
The models presented here generally include any additional constraints on γ that arose from the modelling work, that is, in cases where δ became either very small or very large (so that γ approaches 0 or 1, respectively). The models are compared in terms of the Bayesian Information Criterion (BIC), which incurs a higher penalty for increases in the number of parameters than traditional likelihood ratio or adjusted ρ² measures.
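As a minimal sketch of the two fit measures, the functions below use the textbook definitions BIC = -2·LL + k·ln(N) and adjusted ρ² = 1 - (LL - k)/LL(0). The function names, the use of N = 3510 observations, and the equal-shares null model over three routes are assumptions made for illustration; the log-likelihoods are those reported for the single-class models in Table 2.2, taken as negative. Under these assumptions the adjusted ρ² values agree with the tabulated ones, whereas the sample size entering the chapter's BIC penalty is not stated, so absolute BIC values need not match and only the ranking is compared.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def adjusted_rho2(log_likelihood, null_log_likelihood, n_params):
    """Adjusted rho-squared relative to a null model."""
    return 1.0 - (log_likelihood - n_params) / null_log_likelihood

N_OBS = 3510                      # assumption: all choice observations
NULL_LL = N_OBS * math.log(1 / 3)  # assumption: equal shares over three routes

# (log-likelihood, number of parameters) for models 1 to 3
models = {1: (-2613.45, 4), 2: (-2604.96, 4), 3: (-2603.26, 6)}

for name, (ll, k) in models.items():
    print(name, round(adjusted_rho2(ll, NULL_LL, k), 4), round(bic(ll, k, N_OBS), 2))
```

Because the BIC penalty grows with ln(N) per parameter while the adjusted ρ² penalty is one log-likelihood unit per parameter, the two measures can rank models 2 and 3 differently from a plain likelihood comparison.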
We begin with Table 2.2, which shows three models with a single class each, that is, S = 1. In models 1 and 2, we a priori impose a RUM and an RRM structure, respectively, that is, constraining γ = 0 in model 1 and γ = 1 in model 2, across attributes. The results show that the pure RRM structure (model 2) is preferred to the pure RUM structure (model 1), with both producing four significant negative coefficient estimates.
In model 3, we initially estimated all four δ parameters freely, but the values for δ(perc. in congestion) and δ(travel cost) reached very large positive values, which indicated that γ = 1 for these two attributes, meaning a pure RRM treatment. For the remaining two parameters, the δ estimates are negative, albeit not significantly different from zero, indicating a value for γ < 0.5. As can be seen from Figure 2.1, this is still closer to full RRM treatment than to full RUM treatment, a finding that is also in line with the small difference in fit between models 2 and 3, with the BIC measure indicating that the improvement in log-likelihood does not justify the additional two parameters in model 3. The findings from the first three models hence point towards a pure RRM treatment for all four parameters. Even though the GRRM model thus 'collapses' to a simpler RRM model in the end, it highlights the advantage of the structure as a diagnostic tool, which avoids having to estimate all possible combinations of RUM and RRM treatment for the different parameters.
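The link between δ and γ is defined in Section 2.2, which is not reproduced here. The sketch below assumes a logistic link, γ = exp(δ)/(1 + exp(δ)), which is consistent with the behaviour described above (δ = 0 gives γ = 0.5, very large δ gives γ = 1) and with the δ and γ pairs reported in the tables. The function name is hypothetical, and the two δ values are entered as negative, following the statement in the text.

```python
import math

def gamma_from_delta(delta):
    """Map an unbounded delta to gamma in [0, 1] (assumed logistic link)."""
    if delta >= 0:
        return 1.0 / (1.0 + math.exp(-delta))
    e = math.exp(delta)          # numerically safe branch for negative delta
    return e / (1.0 + e)

# Model 3: the two freely estimated delta parameters (negative, per the text)
print(round(gamma_from_delta(-0.3622), 2))   # travel time
print(round(gamma_from_delta(-0.4656), 2))   # travel time variability

# Limiting cases used when delta is fixed
print(gamma_from_delta(math.inf), gamma_from_delta(-math.inf))
```

Estimating δ rather than γ keeps the optimisation unconstrained, at the cost that γ values of exactly 0 or 1 correspond to δ drifting towards minus or plus infinity, which is why such parameters end up being fixed.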
Table 2.2: Single-class models.

| | Model 1 est. | Model 1 rob. t-rat. | Model 2 est. | Model 2 rob. t-rat. | Model 3 est. | Model 3 rob. t-rat. |
|---|---|---|---|---|---|---|
| Log-likelihood | -2613.45 | | -2604.96 | | -2603.26 | |
| Parameters | 4 | | 4 | | 6 | |
| Adj. ρ² | 0.3212 | | 0.3234 | | 0.3233 | |
| BIC | 5252.20 | | 5235.22 | | 5244.47 | |
| β(travel time) | -0.0224 | -22.11 | -0.0469 | -20.57 | -0.0330 | -3.91 |
| β(perc. in congestion) | -0.0091 | -15.21 | -0.0181 | -14.44 | -0.0188 | -14.78 |
| β(travel time variability) | -0.0105 | -11.83 | -0.0210 | -11.80 | -0.0147 | -0.68 |
| β(travel cost) | -0.0576 | -16.83 | -0.1128 | -15.59 | -0.1166 | -16.66 |
| δ(travel time) | -inf (fixed a priori) | | +inf (fixed a priori) | | -0.3622 | -0.27 |
| δ(perc. in congestion) | -inf (fixed a priori) | | +inf (fixed a priori) | | +inf (fixed) | |
| δ(travel time variability) | -inf (fixed a priori) | | +inf (fixed a priori) | | -0.4656 | -0.05 |
| δ(travel cost) | -inf (fixed a priori) | | +inf (fixed a priori) | | +inf (fixed) | |
| γ(travel time) | 0 | | 1 | | 0.41 | |
| γ(perc. in congestion) | 0 | | 1 | | 1 | |
| γ(travel time variability) | 0 | | 1 | | 0.39 | |
| γ(travel cost) | 0 | | 1 | | 1 | |
We next move to four separate models using two classes each in Table 2.3. These comprise one model (model 4) with two pure RUM classes, one (model 5) with two pure RRM classes, one (model 6) with one pure RUM class and one pure RRM class, and one (model 7) with two GRRM classes. All four models show major improvements over the single-class models in Table 2.2. They also all show a split into one class with a probability of roughly 1/3 and one with a probability of roughly 2/3, and a very consistent pattern emerges across the four models. In the second class, the relative importance of travel time and travel time variability is reduced substantially, especially the former, while the importance of congestion and, to a lesser extent, travel cost is increased. Across the four models, the impact of congestion also only attains low levels of significance in the first class.
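The class probabilities can be related to the class-membership constants reported in Table 2.3. The sketch below assumes a constants-only multinomial logit allocation, with the constant of the last class fixed to zero, and assumes that the δs1 estimates are negative (which is what a first class with the smaller share requires). The function name is hypothetical.

```python
import math

def class_shares(constants):
    """Softmax over class-membership constants (one constant fixed to 0)."""
    top = max(constants)                       # subtract the maximum for stability
    weights = [math.exp(c - top) for c in constants]
    total = sum(weights)
    return [w / total for w in weights]

# δs1 entered as negative (assumption), δs2 = 0 by normalisation
model_5 = class_shares([-0.77489, 0.0])
model_6 = class_shares([-0.6956, 0.0])

print([f"{p:.2%}" for p in model_5])
print([f"{p:.2%}" for p in model_6])
```

Under these assumptions the computed shares agree with the tabulated π values for models 5 and 6 to within rounding, giving the roughly one third to two thirds split noted above.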
In terms of model structure, we first note that model 5 outperforms model 4, that is, a structure with two pure RRM classes is preferred to a structure with two pure RUM classes. However, model 6, which a priori imposes one pure RUM class and one pure RRM class, obtains an even better BIC measure, suggesting heterogeneity across respondents not just in sensitivities but also in decision rules. Model 7 relaxes the assumption in model 6 about within-class homogeneity of the decision rule for different attributes. The findings for the second class remain the same as under the imposition of a pure RRM treatment for all attributes in model 6. After additional constraints for δ1(% cong.), δ1(tt var) and δ1(cost), the picture in the first class is slightly different from model 6. We still see a pure RUM treatment for congestion and travel time variability. However, for travel time, we see a γ value of around one third, indicating a treatment that is not pure RUM and actually suggests a fairly regret-based treatment, while, for cost, the treatment is purely regret-based. Despite these differences, the log-likelihood for models 6 and 7 is essentially the same, and the one additional parameter (δ1(trav. time)) in model 7 leads to a higher (worse) BIC. The recommendation from the two-class structures would thus be a further constrained version of model 7, using the GRRM model once again as a specification search tool, leading to a mixed RUM-RRM treatment in one class and a pure RRM treatment in the second class.
Table 2.4 contains our final set of four models; models with more classes than those presented in Table 2.4 gave a worse BIC. To allow for a generic format in the presentation of results, Table 2.4 makes use of six classes throughout, where the first two are pure RUM classes, followed by two pure RRM classes, with the fifth and sixth being freely estimated GRRM models (subject to additional constraints). Not every model makes use of every class, as detailed in the table.
Model 8 combines a pure RUM class (class 1) with a pure RRM class (class 3) and a GRRM class (class 5). This model outperforms the best two-class models from Table 2.3. The findings suggest that, in the third class (the GRRM class), all parameters except travel time variability should have a pure RRM treatment, although the estimate for travel time variability is not statistically significant in this class. This third class obtains around one quarter of the overall weight in the model, with a slightly higher probability for the other (a priori) pure RRM class than for the pure RUM class. These findings are overall compatible with those from model 7, albeit that congestion and travel time now have a pure RRM treatment outside the pure RUM class.

Table 2.3: Two-class models.

| | Model 4 est. | Model 4 rob. t-rat. | Model 5 est. | Model 5 rob. t-rat. | Model 6 est. | Model 6 rob. t-rat. | Model 7 est. | Model 7 rob. t-rat. |
|---|---|---|---|---|---|---|---|---|
| Log-likelihood | -2431.59 | | -2416.78 | | -2412.92 | | -2412.83 | |
| Parameters | 9 | | 9 | | 9 | | 10 | |
| Adj. ρ² | 0.3671 | | 0.3709 | | 0.3719 | | 0.3717 | |
| BIC | 4920.11 | | 4890.49 | | 4882.77 | | 4888.91 | |
| β1 (trav. time) | 0.0559 | 10.15 | 0.1582 | 5.44 | 0.0559 | 9.71 | 0.0558 | 9.77 |
| β1 (% cong.) | 0.0025 | 1.44 | 0.0052 | 1.35 | 0.0030 | 1.65 | 0.0034 | 1.91 |
| β1 (tt var) | 0.0261 | 6.33 | 0.0510 | 4.54 | 0.0260 | 6.20 | 0.0259 | 6.22 |
| β1 (cost) | 0.0437 | 4.63 | 0.0864 | 4.36 | 0.0404 | 4.16 | 0.0808 | 4.16 |
| β2 (trav. time) | 0.0146 | 12.81 | 0.0314 | 11.23 | 0.0310 | 12.23 | 0.0309 | 12.19 |
| β2 (% cong.) | 0.0131 | 13.38 | 0.0266 | 12.37 | 0.0275 | 12.74 | 0.0276 | 12.71 |
| β2 (tt var) | 0.0088 | 7.92 | 0.0182 | 8.27 | 0.0180 | 7.88 | 0.0180 | 7.87 |
| β2 (cost) | 0.0725 | 15.72 | 0.1451 | 14.05 | 0.1495 | 13.75 | 0.1496 | 13.76 |
| δ1 (trav. time) | -inf (fixed a priori) | | +inf (fixed a priori) | | -inf (fixed a priori) | | -0.6918 | -4.10 |
| δ1 (% cong.) | -inf (fixed a priori) | | +inf (fixed a priori) | | -inf (fixed a priori) | | -inf (fixed) | |
| δ1 (tt var) | -inf (fixed a priori) | | +inf (fixed a priori) | | -inf (fixed a priori) | | -inf (fixed) | |
| δ1 (cost) | -inf (fixed a priori) | | +inf (fixed a priori) | | -inf (fixed a priori) | | +inf (fixed) | |
| γ1 (trav. time) | 0 | | 1 | | 0 | | 0.33 | |
| γ1 (% cong.) | 0 | | 1 | | 0 | | 0.00 | |
| γ1 (tt var) | 0 | | 1 | | 0 | | 0.00 | |
| γ1 (cost) | 0 | | 1 | | 0 | | 1.00 | |
| δ2 (trav. time) | -inf (fixed a priori) | | +inf (fixed a priori) | | +inf (fixed a priori) | | +inf (fixed) | |
| δ2 (% cong.) | -inf (fixed a priori) | | +inf (fixed a priori) | | +inf (fixed a priori) | | +inf (fixed) | |
| δ2 (tt var) | -inf (fixed a priori) | | +inf (fixed a priori) | | +inf (fixed a priori) | | +inf (fixed) | |
| δ2 (cost) | -inf (fixed a priori) | | +inf (fixed a priori) | | +inf (fixed a priori) | | +inf (fixed) | |
| γ2 (trav. time) | 0 | | 1 | | 1 | | 1.00 | |
| γ2 (% cong.) | 0 | | 1 | | 1 | | 1.00 | |
| γ2 (tt var) | 0 | | 1 | | 1 | | 1.00 | |
| γ2 (cost) | 0 | | 1 | | 1 | | 1.00 | |
| δs1 | -0.6917 | -4.18 | -0.77489 | -3.88 | -0.6956 | -4.14 | -0.6918 | -4.10 |
| δs2 | 0 | | 0 | | 0 | | 0 | |
| π1 | 33.36% | | 31.54% | | 33.28% | | 33.36% | |
| π2 | 66.64% | | 68.46% | | 66.72% | | 66.64% | |

Table 2.4: Three to six class models.

| | Model 8 est. | Model 8 rob. t-rat. | Model 9 est. | Model 9 rob. t-rat. | Model 10 est. | Model 10 rob. t-rat. | Model 11 est. | Model 11 rob. t-rat. |
|---|---|---|---|---|---|---|---|---|
| Log-likelihood | -2347.19 | | -2322.86 | | -2301.42 | | -2284.89 | |
| Parameters | 15 | | 19 | | 25 | | 30 | |
| Adj. ρ² | 0.3874 | | 0.3927 | | 0.3967 | | 0.3997 | |
| BIC | 4789.26 | | 4765.90 | | 4760.98 | | 4759.54 | |
| β1 (trav. time) | 0.0531 | 11.35 | 0.5172 | 0.00 | 0.0223 | 2.47 | 0.0040 | 1.57 |
| β1 (% cong.) | 0.0043 | 2.18 | 0.2440 | 0.00 | 0.0347 | 3.92 | 0.0119 | 2.54 |
| β1 (tt var) | 0.0250 | 7.09 | 0.1769 | 0.35 | 0.0129 | 1.84 | 0.0104 | 3.73 |
| β1 (cost) | 0.0354 | 4.29 | 4.0859 | 0.41 | 0.0630 | 2.16 | 0.0222 | 1.73 |
| β2 (trav. time) | | | 0.0548 | 10.78 | 0.0598 | 7.90 | 0.0363 | 2.38 |
| β2 (% cong.) | | | 0.0041 | 2.06 | 0.0063 | 1.34 | 0.0277 | 4.24 |
| β2 (tt var) | | | 0.0261 | 6.85 | 0.0135 | 2.67 | 0.0085 | 0.62 |
| β2 (cost) | | | 0.0341 | 4.05 | 0.0370 | 1.68 | 0.0442 | 0.78 |
| β3 (trav. time) | 0.0654 | 3.75 | 0.0151 | 4.07 | 0.0982 | 2.07 | 0.6426 | 0.88 |
| β3 (% cong.) | 0.0306 | 8.00 | 0.0271 | 8.66 | 0.0135 | 0.39 | 0.0653 | 0.61 |
| β3 (tt var) | 0.0158 | 4.54 | 0.0211 | 6.08 | 0.0989 | 4.70 | 0.0359 | 1.36 |
| β3 (cost) | 0.3049 | 4.79 | 0.0810 | 6.20 | 0.1308 | 1.46 | 1.0933 | 0.62 |
| β4 (trav. time) | | | 0.1007 | 4.91 | 0.0783 | 2.77 | 0.1529 | 4.37 |
| β4 (% cong.) | | | 0.0319 | 7.35 | 0.0257 | 5.26 | 0.0288 | 1.93 |
| β4 (tt var) | | | 0.0167 | 4.27 | 0.0150 | 3.72 | 0.0226 | 3.19 |
| β4 (cost) | | | 0.3873 | 6.04 | 0.3980 | 4.32 | 0.5812 | 6.04 |
| β5 (trav. time) | 0.0115 | 1.91 | | | 0.0084 | 1.71 | 0.0520 | 1.98 |
| β5 (% cong.) | 0.0270 | 6.71 | | | 0.0157 | 2.50 | 0.0036 | 0.80 |
| β5 (tt var) | 0.0134 | 0.26 | | | 0.0108 | 0.17 | 0.0400 | 4.24 |
| β5 (cost) | 0.0652 | 2.83 | | | 0.0692 | 1.91 | 0.0953 | 3.03 |
| β6 (trav. time) | | | | | | | 0.0243 | 2.69 |
| β6 (% cong.) | | | | | | | 0.0282 | 1.68 |
| β6 (tt var) | | | | | | | 0.0128 | 1.70 |
| β6 (cost) | | | | | | | 0.2781 | 3.78 |
| γ1 (generic) | 0 (fixed to pure RUM) | | 0 (fixed to pure RUM) | | 0 (fixed to pure RUM) | | 0 (fixed to pure RUM) | |
| γ2 (generic) | | | 0 (fixed to pure RUM) | | 0 (fixed to pure RUM) | | 0 (fixed to pure RUM) | |
| γ3 (generic) | 1 (fixed to pure RRM) | | 1 (fixed to pure RRM) | | 1 (fixed to pure RRM) | | 1 (fixed to pure RRM) | |
| γ4 (generic) | | | 1 (fixed to pure RRM) | | 1 (fixed to pure RRM) | | 1 (fixed to pure RRM) | |
| δ5 (trav. time) | +inf (fixed) | | | | +inf (fixed) | | -1.9107 | -0.49 |
| δ5 (% cong.) | +inf (fixed) | | | | +inf (fixed) | | -inf (fixed) | |
| δ5 (tt var) | -1.2144 | -0.05 | | | -2.1617 | -0.03 | -inf (fixed) | |
| δ5 (cost) | +inf (fixed) | | | | +inf (fixed) | | +inf (fixed) | |
| γ5 (trav. time) | 1.00 | | | | 1.00 | | 0.13 | |
| γ5 (% cong.) | 1.00 | | | | 1.00 | | 0.00 | |
| γ5 (tt var) | 0.23 | | | | 0.10 | | 0.00 | |
| γ5 (cost) | 1.00 | | | | 1.00 | | 1.00 | |
| δ6 (trav. time) | | | | | | | +inf (fixed) | |
| δ6 (% cong.) | | | | | | | +inf (fixed) | |
| δ6 (tt var) | | | | | | | +inf (fixed) | |
| δ6 (cost) | | | | | | | +inf (fixed) | |
| γ6 (trav. time) | | | | | | | 1.00 | |
| γ6 (% cong.) | | | | | | | 1.00 | |
| γ6 (tt var) | | | | | | | 1.00 | |
| γ6 (cost) | | | | | | | 1.00 | |
| δS1 | 0.3460 | 0.97 | -2.8410 | -6.95 | -0.2268 | -0.40 | 0.2737 | 0.35 |
| δS2 | | | 0.0226 | 0.11 | 0.3144 | 0.64 | 0.1060 | 0.13 |
| δS3 | 0.4757 | 0.99 | -0.0383 | -0.16 | -0.4835 | -0.78 | 0.1725 | 0.30 |
| δS4 | | | 0 | | 0.4541 | 0.81 | 0.4091 | 0.67 |
| δS5 | 0 | | | | 0 | | 0.1591 | 0.27 |
| δS6 | | | | | | | 0 | |
| π1 | 35.14% | | 1.92% | | 14.88% | | 18.03% | |
| π2 | | | 33.61% | | 25.56% | | 15.25% | |
| π3 | 40.00% | | 31.62% | | 11.51% | | 16.29% | |
| π4 | | | 32.86% | | 29.39% | | 20.64% | |
| π5 | 24.86% | | | | 18.66% | | 16.08% | |
| π6 | | | | | | | 13.71% | |
The remaining three models all make use of two RUM classes and two RRM classes, where models 10 and 11 add one and two additional GRRM classes, respectively. While model 9 provides an improvement over previous structures, this is solely a result of allowing for taste heterogeneity within the RRM segment of the model (i.e. classes 3 and 4), as the additional RUM class (class 1) obtains a near-zero weight (1.92%) with no significant parameter estimates. The fact that further improvements in fit are obtained in models 10 and 11 highlights the value of the GRRM approach in capturing attribute-specific treatments of decision rules. Firstly, by allowing for these additional classes, we are able to capture some of the within-decision-rule heterogeneity for the RUM classes, where we now see a more even split in probability across RUM classes, along with some significant effects in both classes. The majority of the weight remains with an RRM treatment of attributes, whether in the pure RRM classes or the RRM treatment within GRRM classes. Additionally, however, we see that more pure RRM classes are needed (given that one additional GRRM class in model 11 becomes pure RRM) before we can recover a class with a stronger mixture between RUM and RRM, namely class 5 in model 11.
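To make the statement about the distribution of weight concrete, the sketch below sums the class probabilities reported for model 11 in Table 2.4 by decision rule. The grouping labels are an illustrative reading of the table: classes 1 and 2 are pure RUM, classes 3 and 4 are pure RRM, class 6 is counted as RRM because all of its γ values equal 1, and class 5 is labelled as mixed.

```python
from collections import defaultdict

# Class probabilities (%) for model 11, as reported in Table 2.4
shares_model_11 = {1: 18.03, 2: 15.25, 3: 16.29, 4: 20.64, 5: 16.08, 6: 13.71}

# Decision-rule label per class (illustrative grouping, see lead-in)
treatment = {1: "RUM", 2: "RUM", 3: "RRM", 4: "RRM", 5: "mixed", 6: "RRM"}

totals = defaultdict(float)
for cls, share in shares_model_11.items():
    totals[treatment[cls]] += share

for rule, share in totals.items():
    print(f"{rule}: {share:.2f}%")
```

Under this grouping, roughly half of the weight sits in classes with a pure RRM treatment, about a third in the pure RUM classes, and the remainder in the mixed class.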
Summary and Conclusions
There is growing interest in the notion that different decision rules may perform differently in explaining the choices observed in data used for travel behaviour analysis. Going further, there is now growing evidence that different decision-makers may well be making their choices based on different rules. Finally, there are also results that suggest that the optimal decision rule may in fact vary across attributes within a given dataset. The present chapter has brought these different notions together by putting forward a latent class approach which allows not only for different decision rules across classes, but also for differences in the decision rules used across attributes within a given class. This would clearly lead to a very large number of different possible combinations of rules across classes, and, within the context of two popular paradigms, RUM and RRM, we have proposed to address this through the use of a GRRM model within individual classes. This allows the optimal specification in terms of the split between RUM and RRM within a given class to be revealed by the data during estimation, rather than needing to be imposed by the analyst. Initial findings on a standard stated choice dataset are promising and show how a rich pattern of taste heterogeneity and of decision rule heterogeneity across respondents and attributes can be revealed. The analysis has also highlighted that, while in many cases the GRRM models collapse to either RUM or RRM for individual coefficients, this provides a very useful diagnostic approach that avoids the need to estimate each possible combination of RUM and RRM separately.