Discrete choice models are widely used to describe how people choose between alternatives. Choice sets are often very large, and estimating the model parameters may then require excessive computational power. Sometimes the problem is even worse: merely reading the data is a prohibitive task. The model used in this thesis is a conditional logit model on which the independence from irrelevant alternatives (IIA) property is imposed. For this model, reducing the number of choice alternatives by taking a random draw from the full choice set and including the alternative actually chosen, known as simple random sampling (SRS), yields estimates that are still consistent. This was proven by McFadden (1978). Sampling of alternatives has the additional advantage of simplifying data collection and analysis. There is an extensive body of literature on discrete choice models with large choice sets, and several sampling strategies other than SRS are available to deal with the computational problems. These include stratified sampling and importance sampling. In the present context, these two strategies also provide consistent parameter estimates, provided the likelihood function is modified. However, to the best of our knowledge, only SRS is widely used for discrete choice models, and hardly any research has been done on the empirical performance of model estimation under other sampling strategies.

In this thesis, we show, theoretically and empirically, the consistency of the maximum likelihood estimates of a conditional logit model based on the three strategies mentioned above. Moreover, we investigate the impact of the size of the sampled choice sets on the empirical accuracy of the estimated parameters. The methods are tested empirically using real-life data on job location choices of Dutch commuters. All three strategies yield empirically consistent estimates.
It turns out that stratified sampling works better than SRS, and that a sampling strategy becomes more efficient if the sampling process gives more weight to alternatives that are thought more likely to be chosen. The work in this thesis suggests that the performance of some variants of the importance sampling strategy should be investigated. Moreover, although in theory the size of the sampled choice sets should not affect the accuracy of the parameter estimates, it does so in the empirical application: the estimates are more accurate when choice sets with a larger number of alternatives are used. Earlier studies suggest using sampled choice sets no smaller than 12.5% of the full choice set. In the present context, we still obtain reliable estimates using sampled choice sets that contain fewer than 3% of the alternatives in the full choice set.
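
To make the idea of a modified likelihood concrete, the following is a minimal sketch and not the implementation used in the thesis. It assumes McFadden's (1978) corrected conditional logit probability, in which the chosen alternative i within a sampled set D has probability proportional to exp(V_i + ln π(D | i)), where π(D | i) is the probability of drawing D given that i was chosen. The function names `srs_choice_set` and `sampled_loglik`, the simulated attributes, and the coefficient values are all hypothetical and serve only as illustration. Under SRS the correction term is the same for every alternative in D, so it cancels; under stratified or importance sampling it generally does not.

```python
import numpy as np

def srs_choice_set(n_alternatives, chosen, size, rng):
    """Hypothetical helper: sample a choice set by simple random sampling.

    Draws `size - 1` non-chosen alternatives uniformly without replacement
    and places the chosen alternative at position 0.
    """
    others = np.delete(np.arange(n_alternatives), chosen)
    drawn = rng.choice(others, size=size - 1, replace=False)
    return np.concatenate(([chosen], drawn))

def sampled_loglik(utilities, log_correction=None):
    """Log-probability of the alternative at position 0 within the sampled set.

    utilities      : systematic utilities V_j for the alternatives in D
    log_correction : optional ln pi(D | j) for each j in D (assumed given)
    """
    v = np.asarray(utilities, dtype=float)
    if log_correction is not None:
        v = v + np.asarray(log_correction, dtype=float)
    v = v - v.max()  # stabilise the exponentials
    return v[0] - np.log(np.exp(v).sum())

# Illustrative data: 1000 alternatives with two simulated attributes.
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 2))   # made-up attributes, not thesis data
beta = np.array([0.5, -1.0])     # made-up coefficients
V = x @ beta

D = srs_choice_set(1000, chosen=42, size=30, rng=rng)
ll = sampled_loglik(V[D])        # SRS: no correction term needed
```

In an estimation routine, a contribution like `ll` would be summed over all observed choices and maximised over `beta`. For a non-uniform sampling scheme, the appropriate `log_correction` values would have to be derived from that scheme's sampling probabilities.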

Brinkhuis, J.
hdl.handle.net/2105/13706
Econometrie
Erasmus School of Economics

Ji, X. (Xichen). (2013, July 24). Sampling of alternatives for discrete choice models. Econometrie. Retrieved from http://hdl.handle.net/2105/13706