CS256
Problem Set 2
2/9/17

Model Development

My a priori expectations for the effects of the alternative-specific variables on flight choice are described in Table \ref{tab:tab1}.

\begin{table}
\caption{\label{tab:tab1}A priori assumptions about the effect of alternative-specific variables on flight choice. Note that age, gender, and income could likely interact with all variables, but this table identifies my assumptions about the variables with which they would have the strongest interaction.}
\begin{tabular}{p{0.22\textwidth}p{0.4\textwidth}p{0.3\textwidth}}
\hline
Variable & Influence on \(P(i)\) & Potential Interactions \\
\hline
Travel Time & \(-\), large & Age (\(-\)), Ticket Class (\(+\) for non-economy) \\
Aircraft & small preference for non-propeller plane & \\
Arrival-Time Difference & \(-\), more \(-\) for being late than early & Trip Purpose (magnitude increase for work trips) \\
On-time Performance & \(+\), small & \\
Fare & \(-\), large (if paid for directly) & Payment (\(-\) for self-pay), Income (\(+\) for high-income) \\
Airline & \(+\), large if airline matches basic and/or elite status of individual. Maybe additional preference for certain airlines regardless of status (e.g. because of reputation) & \\
Departure Time & \(-\) for late night, early morning (11 PM-8 AM?); \(+\) for other times & Trip Purpose (\(+\) for work trip), Age (\(-\)) \\
Arrival Time & \(-\) for late night, early morning (11 PM-8 AM?); \(+\) for other times & Trip Purpose (\(+\) for work trip), Age (\(-\)) \\
\hline
\end{tabular}
\end{table}

The justification for the assigned signs is clear in most cases. Perhaps the least obvious assumption is the influence of trip purpose on the arrival- and departure-time effects. Here the reasoning is that overnight (red-eye) flights can be beneficial for work trips, so the penalty for late-night or early-morning times should be smaller for such trips.

A model that fully represents these a-priori assumptions, with some additional arbitrary thresholds employed to reduce the number of parameters for multi-interaction terms, is shown in Equation \ref{eq:model1}.

\begin{align}
P(aX)= & \hat{\beta}_{1}travTime\times ageGT55\times nonEconomy+\beta_{2}aXpropeller+\hat{\beta}_{3}aXlateness\times workTrip\notag \\
 & +\hat{\beta}_{4}aXearliness\times workTrip+\hat{\beta}_{5}aXperformance+\hat{\beta}_{6}aXfare\times nonSelfPay\times incomeGT100\notag \\
 & +\hat{\beta}_{7}aXbasicMatch+\hat{\beta}_{8}aXeliteMatch+\hat{\beta}_{9}aXdepartEL\times workTrip\times ageGT55\notag \\
 & +\hat{\beta}_{10}aXarriveEL\times workTrip\times ageGT55\label{eq:model1}
\end{align}

where all variables are as defined in the assignment and those not in the assignment are defined as follows:

  • \(aX[depart,arrive]EL:=(aX[depart,arrive]<480\text{ or }aX[depart,arrive]>1380)\), an indicator of whether the departure (or arrival) occurs between 11 PM and 8 AM

  • \(aXpropeller:=(aXaircraft==4)\)

  • \(workTrip:=(purpose\in\{1,2,5\})\)

  • \(ageGT55:=(age>5)\)

  • \(nonEconomy:=(classTicket>2)\)

  • \(aXlateness:=(aXtimediff>0)*aXtimediff\)

  • \(aXearliness:=-(aXtimediff<0)*aXtimediff\)

  • \(aXabsTimediff:=|aXtimediff|\)

  • \(incomeGT100:=(income>7)\)

  • \(nonSelfPay:=(payment==1)\)

  • \(aX[basic,elite]Match\) is an indicator of whether the individual is a frequent flier (basic or elite) of the airline in an alternative
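
As a minimal sketch, the derived variables defined above could be computed as follows. This is an illustration, not the code used to produce the estimates: the function name `derive_features` and the dictionary-based record layout are assumptions, while the thresholds and category codes are copied from the definitions above. The frequent-flier match indicators are omitted because their construction depends on how membership is coded in the data.

```python
def derive_features(person, alt):
    """Build the derived indicators for one traveler and one alternative.

    `person` and `alt` are plain dicts keyed by the assignment's variable
    names (without the aX prefix for `alt`); this layout is hypothetical.
    """
    def early_or_late(minutes):
        # Clock time in minutes after midnight: before 8 AM or after 11 PM.
        return int(minutes < 480 or minutes > 1380)

    timediff = alt["timediff"]
    return {
        "departEL": early_or_late(alt["depart"]),
        "arriveEL": early_or_late(alt["arrive"]),
        "propeller": int(alt["aircraft"] == 4),
        "workTrip": int(person["purpose"] in {1, 2, 5}),
        "ageGT55": int(person["age"] > 5),
        "nonEconomy": int(person["classTicket"] > 2),
        "lateness": max(timediff, 0),     # (timediff > 0) * timediff
        "earliness": max(-timediff, 0),   # -(timediff < 0) * timediff
        "absTimediff": abs(timediff),
        "incomeGT100": int(person["income"] > 7),
        "nonSelfPay": int(person["payment"] == 1),
    }


# Made-up record: a 7 AM departure arriving 30 minutes early.
person = {"purpose": 2, "age": 6, "classTicket": 1, "income": 8, "payment": 1}
alt = {"depart": 420, "arrive": 600, "aircraft": 4, "timediff": -30}
features = derive_features(person, alt)
```

Splitting the signed time difference into `lateness` and `earliness` keeps both quantities non-negative, so each can receive its own coefficient.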

In the interest of readability, categorical variables and interaction terms are not explicitly expanded in the above equation; they were expanded, of course, in the model that was run. Additionally, in the interest of parameter reduction, multi-interaction terms specified in the model above are not truly multi-interactions; instead they are the first variable interacted with the second, then separately interacted with the third, etc. For example,

\begin{equation}
\hat{\beta}_{1}travTime\times ageGT55\times nonEconomy:=\beta_{1,1}travTime+\beta_{1,2}[travTime][ageGT55]+\beta_{1,3}[travTime][nonEconomy]\nonumber
\end{equation}

The model is specified this way to avoid having to estimate, say, ``the differential impact that age has on the impact that ticket class has on the impact of travel time on airline choice''. Instead, I simply separately estimate ``the impact of age on the impact of travel time'' and ``the impact of ticket class on the impact of travel time'' on airline choice.
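
The shorthand can be made concrete with a small sketch. The helper `expand_shorthand` is hypothetical and only illustrates how one shorthand term maps to model columns: a main effect plus one pairwise interaction per modifier, with no higher-order product. The indicator names follow the initial model and the row values are made up.

```python
def expand_shorthand(base, modifiers, row):
    """Return the columns implied by `base x mod1 x mod2 ...` in the shorthand.

    Produces the main effect plus one pairwise interaction per modifier;
    three-way and higher products are deliberately left out.
    """
    columns = {base: row[base]}
    for mod in modifiers:
        columns[f"{base}:{mod}"] = row[base] * row[mod]
    return columns


# Made-up observation: a 3.5-hour flight, traveler over 55, economy ticket.
row = {"travTime": 3.5, "ageGT55": 1, "nonEconomy": 0}
cols = expand_shorthand("travTime", ["ageGT55", "nonEconomy"], row)
print(cols)
```

For a term with two modifiers this yields three travel-time parameters; a full three-way specification would add a fourth for the product of all three variables.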

As might be expected given this fairly parameter-intensive initial model, many effects were not well identified. Reassuringly, no parameter estimate was both significant and opposite in sign to the initial hypothesis. However, many of the estimated relationships were not significant (at the \(p<0.05\) or \(p<0.1\) levels), including the majority of the interaction parameters. Additionally, the parameters associated with ``lateness'' and ``earliness'' were estimated to be surprisingly similar. After iteratively removing non-significant interactions and slightly changing the feature set (e.g. to combine ``lateness'' and ``earliness'' into ``abs(timeDiff)''), I arrived at the following final model specification (using the same notation as described for the initial model):

\begin{align}
P(aX)= & \hat{\beta}_{1}travTime\times female\times nonEconomy+\beta_{2}aXpropeller+\hat{\beta}_{3}aXlateness^{2}+\hat{\beta}_{4}aXearliness^{2}\notag \\
 & +\hat{\beta}_{5}aXperformance+\hat{\beta}_{6}aXfare\times lowIncome\times highIncome\times female\times paymentFamily\notag \\
 & +\hat{\beta}_{7}aXbasicMatch+\hat{\beta}_{8}aXeliteMatch\label{eq:modelFinal}
\end{align}

where

  • \(female:=(gender==2)\)

  • \(lowIncome:=(income<\$150k)\)

  • \(highIncome:=(income>\$250k)\)

  • \(paymentFamily:=(payment==4)\)


The results of this final specification are displayed in Table \ref{tab:final}:

\begin{table}
\caption{\label{tab:final}Multinomial Logit Model Regression Results from Final Specification. All variables included were specified for the utility functions of both choice options.}
\begin{tabular}{lr}
\hline
No. Observations: & 7,024 \\
Df Model: & 15 \\
Pseudo R-squ.: & 0.195 \\
Pseudo R-bar-squ.: & 0.192 \\
Log-Likelihood: & -3,920.076 \\
\hline
\end{tabular}

\begin{tabular}{p{0.42\textwidth}rrrrrr}
\hline
 & coef & std err & \(z\) & \(P>|z|\) & \multicolumn{2}{c}{[95.0\% Conf. Int.]} \\
\hline
Travel Time (economy class), hrs & -0.4977 & 0.034 & -14.650 & 0.000 & -0.564 & -0.431 \\
Travel Time \(\times\) Non-Economy Class, hrs & 0.1012 & 0.061 & 1.655 & 0.098 & -0.019 & 0.221 \\
Travel Time \(\times\) Female, hrs & -0.1438 & 0.043 & -3.332 & 0.001 & -0.228 & -0.059 \\
Propeller Plane & -0.2862 & 0.065 & -4.399 & 0.000 & -0.414 & -0.159 \\
Lateness\({}^{2}\), hrs\({}^{2}\) & -0.0237 & 0.013 & -1.793 & 0.073 & -0.050 & 0.002 \\
Earliness\({}^{2}\), hrs\({}^{2}\) & -0.1014 & 0.049 & -2.066 & 0.039 & -0.198 & -0.005 \\
On-time performance, \% & 0.0159 & 0.002 & 9.124 & 0.000 & 0.012 & 0.019 \\
Fare (med income: \$150-250k, male, payment not by family/friend), \$100 & -0.2678 & 0.057 & -4.717 & 0.000 & -0.379 & -0.157 \\
Fare \(\times\) low income, \$100 & -0.1657 & 0.060 & -2.772 & 0.006 & -0.283 & -0.049 \\
Fare \(\times\) high income, \$100 & 0.2422 & 0.077 & 3.166 & 0.002 & 0.092 & 0.392 \\
Fare \(\times\) (payment by family/friend), \$100 & -0.2191 & 0.105 & -2.081 & 0.037 & -0.425 & -0.013 \\
Fare \(\times\) female, \$100 & -0.2091 & 0.042 & -5.006 & 0.000 & -0.291 & -0.127 \\
Basic member of airline FFP (non-work-trip) & 0.4109 & 0.070 & 5.881 & 0.000 & 0.274 & 0.548 \\
Elite member of airline FFP (non-work-trip) & 0.6560 & 0.114 & 5.777 & 0.000 & 0.433 & 0.879 \\
FFP member \(\times\) work-trip & 0.1837 & 0.105 & 1.745 & 0.081 & -0.023 & 0.390 \\
\hline
\end{tabular}
\end{table}
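
To illustrate how these coefficients translate into choice probabilities, the sketch below evaluates a two-alternative logit for a base-group traveler (male, medium income, economy class, fare not paid by family or friends, no frequent-flier membership) choosing between two non-propeller flights with no arrival-time difference. It assumes the standard logit form, in which the probability of an alternative depends only on the difference in systematic utilities. The two flights and their attribute values are invented for illustration; only the three coefficients are taken from the table.

```python
import math

# Base-group coefficients from the table (units: hours, percent, $100).
BETA = {"travTime": -0.4977, "performance": 0.0159, "fare": -0.2678}


def utility(alt):
    """Systematic utility: sum of coefficient times attribute value."""
    return sum(BETA[name] * alt[name] for name in BETA)


def choice_probs(alt_a, alt_b):
    """Binary logit probabilities for two alternatives."""
    v_a, v_b = utility(alt_a), utility(alt_b)
    p_a = 1.0 / (1.0 + math.exp(v_b - v_a))
    return p_a, 1.0 - p_a


# Hypothetical flights: A is one hour faster but $100 more expensive.
flight_a = {"travTime": 4.0, "performance": 80.0, "fare": 4.0}
flight_b = {"travTime": 5.0, "performance": 80.0, "fare": 3.0}
p_a, p_b = choice_probs(flight_a, flight_b)
print(round(p_a, 3), round(p_b, 3))
```

Flight A receives the higher probability because, for this group, the estimated disutility of an extra hour of travel (0.4977) exceeds that of an extra \$100 in fare (0.2678).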