{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "multinomial_model.ipynb",
      "provenance": [],
      "collapsed_sections": [],
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3 (Spyder)",
      "language": "python3",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.7"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oeFDTihBDTnb"
      },
      "source": [
        "# Multinomial Logit"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "f2K2FpPHbdzT"
      },
      "source": [
        "This is a step-by-step guide on how to estimate Multinomial Logit models using the `xlogit` package. You can interactively execute the code in this guide by opening it in Google Colab using the following link:\n",
        "\n",
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/arteagac/xlogit/blob/master/examples/multinomial_model.ipynb)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "is9MSL-AkK9G"
      },
      "source": [
        "## Install `xlogit` package"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HEpWCRkFRm5t"
      },
      "source": [
        "Install `xlogit` using `pip` as shown below."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "36ZQw8iIkDib",
        "outputId": "4c3ef40d-20a7-4fc3-d10c-b282ab498c3a"
      },
      "source": [
        "!pip install xlogit"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Collecting xlogit\n",
            "  Downloading https://files.pythonhosted.org/packages/60/5f/9bc576d180c366af77bc04e268536e9e34be23c52a520918aa0cb56b438e/xlogit-0.1.3-py3-none-any.whl\n",
            "Requirement already satisfied: numpy>=1.13.1 in /usr/local/lib/python3.7/dist-packages (from xlogit) (1.19.5)\n",
            "Requirement already satisfied: scipy>=1.0.0 in /usr/local/lib/python3.7/dist-packages (from xlogit) (1.4.1)\n",
            "Installing collected packages: xlogit\n",
            "Successfully installed xlogit-0.1.3\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4gVLebey57-t"
      },
      "source": [
        "## Route Choice Dataset"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6TZFv__H3NmV"
      },
      "source": [
        "This dataset contains the choices of 151 commuters among three home-to-work route alternatives: arterial, rural, and freeway roads. The dataset was taken from Example 13.1 of the book \"Statistical and econometric methods for transportation data analysis\" [(Washington et al., 2011)](https://engineering.purdue.edu/~flm/StatEconBook.htm)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pMd_kflT6s_U"
      },
      "source": [
        "### Read data"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "RzpPDpOue-iB"
      },
      "source": [
        "We start by importing the data using pandas and renaming the columns of interest (choice, distance, male, and vehicle model). In addition, we create a column with the name of the alternatives, a column that uniquely identifies every observation in the dataset, and a `vehage` column computed as `86 - vehmodel`. Note that this dataset is in long format, with one row per alternative for each observation."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 424
        },
        "id": "BiYw1Rbo6NIO",
        "outputId": "2e32651a-e256-4cc5-bab9-747bd93be82a"
      },
      "source": [
        "import pandas as pd\n",
        "import numpy as np\n",
        "df = pd.read_csv(\"https://engineering.purdue.edu/~flm/StatEcon-Files/Ex13-1.txt\",\n",
        "                 sep=\"\\t\", header=None).add_prefix(\"x\")  # Name columns x0, x1, ...\n",
        "df.rename(columns={'x0': 'choice', 'x6': 'dist', 'x10': 'male', 'x14': 'vehmodel'},\n",
        "          inplace=True)  # Rename columns of interest\n",
        "df['alt'] = np.tile(['arterial', 'rural', 'freeway'], len(df)//3)  # Add column with alternatives\n",
        "df['ids'] = np.repeat(np.arange(len(df)//3), 3)  # Add column with unique ids\n",
        "df['vehage'] = 86 - df['vehmodel']\n",
        "df"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>choice</th>\n",
              "      <th>x1</th>\n",
              "      <th>x2</th>\n",
              "      <th>x3</th>\n",
              "      <th>x4</th>\n",
              "      <th>x5</th>\n",
              "      <th>dist</th>\n",
              "      <th>x7</th>\n",
              "      <th>x8</th>\n",
              "      <th>x9</th>\n",
              "      <th>male</th>\n",
              "      <th>x11</th>\n",
              "      <th>x12</th>\n",
              "      <th>x13</th>\n",
              "      <th>vehmodel</th>\n",
              "      <th>x15</th>\n",
              "      <th>x16</th>\n",
              "      <th>alt</th>\n",
              "      <th>ids</th>\n",
              "      <th>vehage</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>460</td>\n",
              "      <td>14</td>\n",
              "      <td>48</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>86</td>\n",
              "      <td>0</td>\n",
              "      <td>28</td>\n",
              "      <td>arterial</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>440</td>\n",
              "      <td>7</td>\n",
              "      <td>44</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>86</td>\n",
              "      <td>0</td>\n",
              "      <td>28</td>\n",
              "      <td>rural</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>130</td>\n",
              "      <td>7</td>\n",
              "      <td>61</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>86</td>\n",
              "      <td>0</td>\n",
              "      <td>28</td>\n",
              "      <td>freeway</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>595</td>\n",
              "      <td>13</td>\n",
              "      <td>59</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>85</td>\n",
              "      <td>0</td>\n",
              "      <td>27</td>\n",
              "      <td>arterial</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>515</td>\n",
              "      <td>13</td>\n",
              "      <td>70</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>85</td>\n",
              "      <td>0</td>\n",
              "      <td>27</td>\n",
              "      <td>rural</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>448</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>325</td>\n",
              "      <td>10</td>\n",
              "      <td>70</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>84</td>\n",
              "      <td>1</td>\n",
              "      <td>24</td>\n",
              "      <td>rural</td>\n",
              "      <td>149</td>\n",
              "      <td>2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>449</th>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>200</td>\n",
              "      <td>5</td>\n",
              "      <td>74</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>84</td>\n",
              "      <td>1</td>\n",
              "      <td>24</td>\n",
              "      <td>freeway</td>\n",
              "      <td>149</td>\n",
              "      <td>2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>450</th>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>900</td>\n",
              "      <td>14</td>\n",
              "      <td>51</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>83</td>\n",
              "      <td>1</td>\n",
              "      <td>18</td>\n",
              "      <td>arterial</td>\n",
              "      <td>150</td>\n",
              "      <td>3</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>451</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>550</td>\n",
              "      <td>7</td>\n",
              "      <td>47</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>83</td>\n",
              "      <td>1</td>\n",
              "      <td>18</td>\n",
              "      <td>rural</td>\n",
              "      <td>150</td>\n",
              "      <td>3</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>452</th>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>220</td>\n",
              "      <td>7</td>\n",
              "      <td>64</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>83</td>\n",
              "      <td>1</td>\n",
              "      <td>18</td>\n",
              "      <td>freeway</td>\n",
              "      <td>150</td>\n",
              "      <td>3</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>453 rows × 20 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "     choice  x1  x2  x3   x4  x5  ...  vehmodel  x15  x16       alt  ids  vehage\n",
              "0         1   1   0   0  460  14  ...        86    0   28  arterial    0       0\n",
              "1         0   0   1   0  440   7  ...        86    0   28     rural    0       0\n",
              "2         0   0   0   1  130   7  ...        86    0   28   freeway    0       0\n",
              "3         1   1   0   0  595  13  ...        85    0   27  arterial    1       1\n",
              "4         0   0   1   0  515  13  ...        85    0   27     rural    1       1\n",
              "..      ...  ..  ..  ..  ...  ..  ...       ...  ...  ...       ...  ...     ...\n",
              "448       0   0   1   0  325  10  ...        84    1   24     rural  149       2\n",
              "449       1   0   0   1  200   5  ...        84    1   24   freeway  149       2\n",
              "450       0   1   0   0  900  14  ...        83    1   18  arterial  150       3\n",
              "451       0   0   1   0  550   7  ...        83    1   18     rural  150       3\n",
              "452       1   0   0   1  220   7  ...        83    1   18   freeway  150       3\n",
              "\n",
              "[453 rows x 20 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 23
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "u0OvCmYp6fKi"
      },
      "source": [
        "### Create model specification"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vm_yhbPn4GBo"
      },
      "source": [
        "The following code creates the model specification by adding variables to the dataframe to accommodate the specification needs. The newly added variables are simply the product of existing variables with dummy variables for the different alternatives. Note that the specification in the code below corresponds to the following utility maximization formulation:"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_ZJwXpOw5jfa"
      },
      "source": [
        "\\begin{equation}\n",
        "\\begin{array}{lcl}\n",
        "V_{arterial} & = & \\beta_3 DIST_{arterial} \\\\\n",
        "V_{rural} & = & \\beta_1 ASC_{rural} + \\beta_4 DIST_{rural} + \\beta_6 VEHAGE_{rural} \\\\\n",
        "V_{freeway} & = & \\beta_2 ASC_{freeway} + \\beta_5 DIST_{freeway} + \\beta_7 VEHAGE_{freeway} + \\beta_8 MALE_{freeway}\n",
        "\\end{array}\n",
        "\\end{equation}"
      ]
    },
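    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a reminder of what these utilities imply, and assuming the standard Multinomial Logit form (the symbols $P_{nj}$ and $V_{nj}$ are notation introduced here for illustration), the probability that commuter $n$ chooses route $j$ is:\n",
        "\n",
        "\\begin{equation}\n",
        "P_{nj} = \\frac{e^{V_{nj}}}{e^{V_{n,arterial}} + e^{V_{n,rural}} + e^{V_{n,freeway}}}\n",
        "\\end{equation}\n",
        "\n",
        "Only differences in utility affect these probabilities, which is why the arterial alternative carries no alternative-specific constant and serves as the reference for $\\beta_1$ and $\\beta_2$."
      ]
    },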
    {
      "cell_type": "code",
      "metadata": {
        "id": "eC5Shb7S6eLE"
      },
      "source": [
        "# Alternative specific constants\n",
        "df['asc_rural'] = np.ones(len(df)) * (df['alt'] == 'rural')\n",
        "df['asc_freeway'] = np.ones(len(df)) * (df['alt'] == 'freeway')\n",
        "\n",
        "# Distance\n",
        "df['dist_arterial'] = df['dist'] * (df['alt'] == 'arterial')\n",
        "df['dist_rural'] = df['dist'] * (df['alt'] == 'rural')\n",
        "df['dist_freeway'] = df['dist'] * (df['alt'] == 'freeway')\n",
        "\n",
        "# Vehicle age\n",
        "df['vehage_rural'] = df['vehage'] * (df['alt'] == 'rural')\n",
        "df['vehage_freeway'] = df['vehage'] * (df['alt'] == 'freeway')\n",
        "\n",
        "# Male driver\n",
        "df['male_freeway'] = df['male'] * (df['alt'] == 'freeway')"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AAINE_sp6irt"
      },
      "source": [
        "### Estimate model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "w8zT7oWs_nz9"
      },
      "source": [
        "After creating the model specification, we can use `xlogit` to estimate the model as follows:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ek-eceEl5-QQ",
        "outputId": "788668e8-1c68-4904-8a62-782c8121def6"
      },
      "source": [
        "from xlogit import MultinomialLogit\n",
        "# Names must match the columns created in the model specification above\n",
        "varnames = ['asc_rural', 'asc_freeway', 'dist_arterial', 'dist_rural',\n",
        "            'dist_freeway', 'vehage_rural', 'vehage_freeway', 'male_freeway']\n",
        "model = MultinomialLogit()\n",
        "model.fit(X=df[varnames], y=df['choice'], varnames=varnames,\n",
        "          ids=df['ids'], alts=df['alt'])\n",
        "model.summary()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Estimation time= 0.0 seconds\n",
            "---------------------------------------------------------------------------\n",
            "Coefficient              Estimate      Std.Err.         z-val         P>|z|\n",
            "---------------------------------------------------------------------------\n",
            "asc_rural               2.8131275     1.1504189     2.4453072        0.0416 *  \n",
            "asc_freeway            -2.6868817     2.2817329    -1.1775619         0.398    \n",
            "dist_arterial          -0.1229102     0.0240053    -5.1201358      4.14e-06 ***\n",
            "dist_rural             -0.1773579     0.0279818    -6.3383224       1.3e-08 ***\n",
            "dist_freeway           -0.0956391     0.0369681    -2.5870738        0.0295 *  \n",
            "vehage_rural            0.1236721     0.0535597     2.3090535         0.057 .  \n",
            "vehage_freeway          0.2268642     0.0755401     3.0032267       0.00969 ** \n",
            "male_freeway            0.5990000     0.6202114     0.9657996         0.499    \n",
            "---------------------------------------------------------------------------\n",
            "Significance:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n",
            "\n",
            "Log-Likelihood= -94.440\n",
            "AIC= 204.881\n",
            "BIC= 229.019\n"
          ],
          "name": "stdout"
        }
      ]
    },
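    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The information criteria reported in the summary can be checked by hand. Assuming the usual definitions, with $k = 8$ estimated coefficients, $N = 151$ commuters, and the printed log-likelihood $LL = -94.440$ (the symbols $k$, $N$, and $LL$ are notation introduced here), the values work out as:\n",
        "\n",
        "\\begin{equation}\n",
        "\\begin{array}{lcl}\n",
        "AIC & = & 2k - 2LL = 16 + 188.880 \\approx 204.88 \\\\\n",
        "BIC & = & k\\ln(N) - 2LL \\approx 8(5.0173) + 188.880 \\approx 229.02\n",
        "\\end{array}\n",
        "\\end{equation}\n",
        "\n",
        "Both agree with the reported values up to rounding of the printed log-likelihood. The BIC penalty is the larger of the two because $\\ln(151) > 2$."
      ]
    },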
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Zf6sYPvjfoeP"
      },
      "source": [
        "## Swissmetro Dataset"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BOWB3Lffg5Qc"
      },
      "source": [
        "In this example, we will estimate a Multinomial Logit model where each alternative is defined with a different utility specification. The swissmetro dataset is an SP/RP survey dataset widely used in Biogeme and Pylogit examples. The dataset is available at http://transp-or.epfl.ch/data/swissmetro.dat and [Bierlaire et al. (2001)](https://transp-or.epfl.ch/documents/proceedings/BierAxhaAbay01.pdf) provides a detailed discussion of the data as well as its context and collection process. Note that the dataset is available in wide format; therefore, we need to convert it to long format for `xlogit`."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "n4No84MAeFOM"
      },
      "source": [
        "### Read data"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mUemwG5YjaGg"
      },
      "source": [
        "The dataset is imported to the Python environment using `pandas`. Then, two types of samples are filtered out: those with a trip purpose other than commute or business, and those with an unknown choice. The original dataset contains 10,729 records, but after filtering, 6,768 records remain for the following analysis. Finally, a new column that uniquely identifies each sample is added to the dataframe, and the `CHOICE` column, which originally contains a numerical coding of the choices, is mapped to a description that is consistent with the alternatives in the column names."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "4jqERhnWhGCc",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 444
        },
        "outputId": "f1ed82ab-6ea7-4e96-b2dc-bc9284ef5cd1"
      },
      "source": [
        "import pandas as pd\n",
        "import numpy as np\n",
        "\n",
        "df_wide = pd.read_table(\"http://transp-or.epfl.ch/data/swissmetro.dat\", sep='\\t')\n",
        "\n",
        "# Keep only observations for commute and business purposes that contain known choices\n",
        "# (.copy() avoids pandas' SettingWithCopyWarning when columns are modified below)\n",
        "df_wide = df_wide[df_wide['PURPOSE'].isin([1, 3]) & (df_wide['CHOICE'] != 0)].copy()\n",
        "\n",
        "df_wide['custom_id'] = np.arange(len(df_wide))  # Add unique identifier\n",
        "df_wide['CHOICE'] = df_wide['CHOICE'].map({1: 'TRAIN', 2: 'SM', 3: 'CAR'})\n",
        "df_wide"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>GROUP</th>\n",
              "      <th>SURVEY</th>\n",
              "      <th>SP</th>\n",
              "      <th>ID</th>\n",
              "      <th>PURPOSE</th>\n",
              "      <th>FIRST</th>\n",
              "      <th>TICKET</th>\n",
              "      <th>WHO</th>\n",
              "      <th>LUGGAGE</th>\n",
              "      <th>AGE</th>\n",
              "      <th>MALE</th>\n",
              "      <th>INCOME</th>\n",
              "      <th>GA</th>\n",
              "      <th>ORIGIN</th>\n",
              "      <th>DEST</th>\n",
              "      <th>TRAIN_AV</th>\n",
              "      <th>CAR_AV</th>\n",
              "      <th>SM_AV</th>\n",
              "      <th>TRAIN_TT</th>\n",
              "      <th>TRAIN_CO</th>\n",
              "      <th>TRAIN_HE</th>\n",
              "      <th>SM_TT</th>\n",
              "      <th>SM_CO</th>\n",
              "      <th>SM_HE</th>\n",
              "      <th>SM_SEATS</th>\n",
              "      <th>CAR_TT</th>\n",
              "      <th>CAR_CO</th>\n",
              "      <th>CHOICE</th>\n",
              "      <th>custom_id</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>112</td>\n",
              "      <td>48</td>\n",
              "      <td>120</td>\n",
              "      <td>63</td>\n",
              "      <td>52</td>\n",
              "      <td>20</td>\n",
              "      <td>0</td>\n",
              "      <td>117</td>\n",
              "      <td>65</td>\n",
              "      <td>SM</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>103</td>\n",
              "      <td>48</td>\n",
              "      <td>30</td>\n",
              "      <td>60</td>\n",
              "      <td>49</td>\n",
              "      <td>10</td>\n",
              "      <td>0</td>\n",
              "      <td>117</td>\n",
              "      <td>84</td>\n",
              "      <td>SM</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>130</td>\n",
              "      <td>48</td>\n",
              "      <td>60</td>\n",
              "      <td>67</td>\n",
              "      <td>58</td>\n",
              "      <td>30</td>\n",
              "      <td>0</td>\n",
              "      <td>117</td>\n",
              "      <td>52</td>\n",
              "      <td>SM</td>\n",
              "      <td>2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>103</td>\n",
              "      <td>40</td>\n",
              "      <td>30</td>\n",
              "      <td>63</td>\n",
              "      <td>52</td>\n",
              "      <td>20</td>\n",
              "      <td>0</td>\n",
              "      <td>72</td>\n",
              "      <td>52</td>\n",
              "      <td>SM</td>\n",
              "      <td>3</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>130</td>\n",
              "      <td>36</td>\n",
              "      <td>60</td>\n",
              "      <td>63</td>\n",
              "      <td>42</td>\n",
              "      <td>20</td>\n",
              "      <td>0</td>\n",
              "      <td>90</td>\n",
              "      <td>84</td>\n",
              "      <td>SM</td>\n",
              "      <td>4</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8446</th>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>108</td>\n",
              "      <td>13</td>\n",
              "      <td>30</td>\n",
              "      <td>50</td>\n",
              "      <td>17</td>\n",
              "      <td>30</td>\n",
              "      <td>0</td>\n",
              "      <td>130</td>\n",
              "      <td>64</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>6763</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8447</th>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>108</td>\n",
              "      <td>12</td>\n",
              "      <td>30</td>\n",
              "      <td>53</td>\n",
              "      <td>16</td>\n",
              "      <td>10</td>\n",
              "      <td>0</td>\n",
              "      <td>80</td>\n",
              "      <td>80</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>6764</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8448</th>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>108</td>\n",
              "      <td>16</td>\n",
              "      <td>60</td>\n",
              "      <td>50</td>\n",
              "      <td>16</td>\n",
              "      <td>20</td>\n",
              "      <td>0</td>\n",
              "      <td>80</td>\n",
              "      <td>64</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>6765</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8449</th>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>128</td>\n",
              "      <td>16</td>\n",
              "      <td>30</td>\n",
              "      <td>53</td>\n",
              "      <td>17</td>\n",
              "      <td>30</td>\n",
              "      <td>0</td>\n",
              "      <td>80</td>\n",
              "      <td>104</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>6766</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8450</th>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>108</td>\n",
              "      <td>13</td>\n",
              "      <td>60</td>\n",
              "      <td>53</td>\n",
              "      <td>21</td>\n",
              "      <td>30</td>\n",
              "      <td>0</td>\n",
              "      <td>100</td>\n",
              "      <td>80</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>6767</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>6768 rows × 29 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "      GROUP  SURVEY  SP   ID  ...  CAR_TT  CAR_CO  CHOICE  custom_id\n",
              "0         2       0   1    1  ...     117      65      SM          0\n",
              "1         2       0   1    1  ...     117      84      SM          1\n",
              "2         2       0   1    1  ...     117      52      SM          2\n",
              "3         2       0   1    1  ...      72      52      SM          3\n",
              "4         2       0   1    1  ...      90      84      SM          4\n",
              "...     ...     ...  ..  ...  ...     ...     ...     ...        ...\n",
              "8446      3       1   1  939  ...     130      64   TRAIN       6763\n",
              "8447      3       1   1  939  ...      80      80   TRAIN       6764\n",
              "8448      3       1   1  939  ...      80      64   TRAIN       6765\n",
              "8449      3       1   1  939  ...      80     104   TRAIN       6766\n",
              "8450      3       1   1  939  ...     100      80   TRAIN       6767\n",
              "\n",
              "[6768 rows x 29 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 2
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-GRMhgM2eIPz"
      },
      "source": [
        "### Reshape data"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "r9OxW-yNhcal"
      },
      "source": [
        "Because `xlogit` requires the dataset in long format, we reshape it using the `wide_to_long` utility provided by `xlogit`. This function takes as input the column that uniquely identifies each sample (`id_col`), the name of the new column that stores the alternatives (`alt_name`), the list of alternatives (`alt_list`), the columns that vary across alternatives (`varying`), the separator used in the wide column names (`sep`), and whether the alternative names are a prefix of the column names (`alt_is_prefix`). For variables that are not available for some alternatives (e.g. `SEATS` and `HE`), `wide_to_long` fills the missing entries with `NaN`. Depending on your specification needs, you can keep the `NaN` values or replace them with zeros. In this case, we replace them with zeros using the `empty_val` parameter. Additional details about the `wide_to_long` function can be found in the [xlogit documentation](https://xlogit.readthedocs.io/en/latest/api/utils.html)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 424
        },
        "id": "1KM-BvFvhWed",
        "outputId": "06502368-e8d6-4de3-be02-50d2db983472"
      },
      "source": [
        "from xlogit.utils import wide_to_long\n",
        "\n",
        "df = wide_to_long(df_wide, id_col='custom_id', alt_name='alt', sep='_',\n",
        "                  alt_list=['TRAIN', 'SM', 'CAR'], empty_val=0,\n",
        "                  varying=['TT', 'CO', 'HE', 'AV', 'SEATS'], alt_is_prefix=True)\n",
        "df"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>custom_id</th>\n",
              "      <th>alt</th>\n",
              "      <th>TT</th>\n",
              "      <th>CO</th>\n",
              "      <th>HE</th>\n",
              "      <th>AV</th>\n",
              "      <th>SEATS</th>\n",
              "      <th>GROUP</th>\n",
              "      <th>SURVEY</th>\n",
              "      <th>SP</th>\n",
              "      <th>ID</th>\n",
              "      <th>PURPOSE</th>\n",
              "      <th>FIRST</th>\n",
              "      <th>TICKET</th>\n",
              "      <th>WHO</th>\n",
              "      <th>LUGGAGE</th>\n",
              "      <th>AGE</th>\n",
              "      <th>MALE</th>\n",
              "      <th>INCOME</th>\n",
              "      <th>GA</th>\n",
              "      <th>ORIGIN</th>\n",
              "      <th>DEST</th>\n",
              "      <th>CHOICE</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>CAR</td>\n",
              "      <td>117</td>\n",
              "      <td>65</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>0</td>\n",
              "      <td>SM</td>\n",
              "      <td>63</td>\n",
              "      <td>52</td>\n",
              "      <td>20</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>0</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>112</td>\n",
              "      <td>48</td>\n",
              "      <td>120</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>1</td>\n",
              "      <td>CAR</td>\n",
              "      <td>117</td>\n",
              "      <td>84</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "      <td>60</td>\n",
              "      <td>49</td>\n",
              "      <td>10</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20299</th>\n",
              "      <td>6766</td>\n",
              "      <td>SM</td>\n",
              "      <td>53</td>\n",
              "      <td>17</td>\n",
              "      <td>30</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>TRAIN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20300</th>\n",
              "      <td>6766</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>128</td>\n",
              "      <td>16</td>\n",
              "      <td>30</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>TRAIN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20301</th>\n",
              "      <td>6767</td>\n",
              "      <td>CAR</td>\n",
              "      <td>100</td>\n",
              "      <td>80</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>TRAIN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20302</th>\n",
              "      <td>6767</td>\n",
              "      <td>SM</td>\n",
              "      <td>53</td>\n",
              "      <td>21</td>\n",
              "      <td>30</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>TRAIN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20303</th>\n",
              "      <td>6767</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>108</td>\n",
              "      <td>13</td>\n",
              "      <td>60</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>TRAIN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>20304 rows × 23 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "       custom_id    alt   TT  CO   HE  ...  INCOME  GA  ORIGIN  DEST  CHOICE\n",
              "0              0    CAR  117  65    0  ...       2   0       2     1      SM\n",
              "1              0     SM   63  52   20  ...       2   0       2     1      SM\n",
              "2              0  TRAIN  112  48  120  ...       2   0       2     1      SM\n",
              "3              1    CAR  117  84    0  ...       2   0       2     1      SM\n",
              "4              1     SM   60  49   10  ...       2   0       2     1      SM\n",
              "...          ...    ...  ...  ..  ...  ...     ...  ..     ...   ...     ...\n",
              "20299       6766     SM   53  17   30  ...       2   0       1     2   TRAIN\n",
              "20300       6766  TRAIN  128  16   30  ...       2   0       1     2   TRAIN\n",
              "20301       6767    CAR  100  80    0  ...       2   0       1     2   TRAIN\n",
              "20302       6767     SM   53  21   30  ...       2   0       1     2   TRAIN\n",
              "20303       6767  TRAIN  108  13   60  ...       2   0       1     2   TRAIN\n",
              "\n",
              "[20304 rows x 23 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 3
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-lInWLqXk0m8"
      },
      "source": [
        "We then scale some variables following the [examples in Biogeme and PyLogit](https://github.com/timothyb0912/pylogit/blob/master/examples/notebooks/Main%20PyLogit%20Example.ipynb). The travel time and headway variables are converted from minutes to hours, and the cost is divided by 100 and set to zero for the `TRAIN` and `SM` alternatives of travellers with a free ticket. The new variables `single_luggage`, `multiple_luggage`, `regular_class` and `train_survey` are created to accommodate the model specification, as shown below:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 444
        },
        "id": "7HirQJUhkTH6",
        "outputId": "03945233-8a3f-407a-f634-93b2c80abed3"
      },
      "source": [
        "# Convert travel time and headway from minutes to hours\n",
        "df['time'] = df['TT'] / 60.0\n",
        "df['headway'] = df['HE'] / 60.0\n",
        "\n",
        "# Scale the cost by a factor of 100\n",
        "df['cost'] = df['CO'] / 100.0\n",
        "\n",
        "# Set the TRAIN and SM cost to zero for travellers with an annual pass\n",
        "# (GA == 1) or a ticket paid by the employer (WHO == 2)\n",
        "annual_pass = (df['alt'].isin(['TRAIN', 'SM'])) & ((df[\"GA\"] == 1) | (df[\"WHO\"] == 2))\n",
        "df[\"cost\"] = df[\"cost\"] * (~annual_pass)\n",
        "\n",
        "# Travellers carrying a single piece of luggage\n",
        "df[\"single_luggage\"] = (df[\"LUGGAGE\"] == 1).astype(int)\n",
        "\n",
        "# Travellers carrying more than one piece of luggage\n",
        "df[\"multiple_luggage\"] = (df[\"LUGGAGE\"] == 3).astype(int)\n",
        "\n",
        "# Travellers travelling in a class other than first class\n",
        "df[\"regular_class\"] = 1 - df[\"FIRST\"]\n",
        "\n",
        "# Travellers who responded to the survey while on a train (SURVEY == 0)\n",
        "df[\"train_survey\"] = 1 - df[\"SURVEY\"]\n",
        "df"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>custom_id</th>\n",
              "      <th>alt</th>\n",
              "      <th>TT</th>\n",
              "      <th>CO</th>\n",
              "      <th>HE</th>\n",
              "      <th>AV</th>\n",
              "      <th>SEATS</th>\n",
              "      <th>GROUP</th>\n",
              "      <th>SURVEY</th>\n",
              "      <th>SP</th>\n",
              "      <th>ID</th>\n",
              "      <th>PURPOSE</th>\n",
              "      <th>FIRST</th>\n",
              "      <th>TICKET</th>\n",
              "      <th>WHO</th>\n",
              "      <th>LUGGAGE</th>\n",
              "      <th>AGE</th>\n",
              "      <th>MALE</th>\n",
              "      <th>INCOME</th>\n",
              "      <th>GA</th>\n",
              "      <th>ORIGIN</th>\n",
              "      <th>DEST</th>\n",
              "      <th>CHOICE</th>\n",
              "      <th>time</th>\n",
              "      <th>headway</th>\n",
              "      <th>cost</th>\n",
              "      <th>single_luggage</th>\n",
              "      <th>multiple_luggage</th>\n",
              "      <th>regular_class</th>\n",
              "      <th>train_survey</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>CAR</td>\n",
              "      <td>117</td>\n",
              "      <td>65</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "      <td>1.950000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.65</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>0</td>\n",
              "      <td>SM</td>\n",
              "      <td>63</td>\n",
              "      <td>52</td>\n",
              "      <td>20</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "      <td>1.050000</td>\n",
              "      <td>0.333333</td>\n",
              "      <td>0.52</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>0</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>112</td>\n",
              "      <td>48</td>\n",
              "      <td>120</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "      <td>1.866667</td>\n",
              "      <td>2.000000</td>\n",
              "      <td>0.48</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>1</td>\n",
              "      <td>CAR</td>\n",
              "      <td>117</td>\n",
              "      <td>84</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "      <td>1.950000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.84</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "      <td>60</td>\n",
              "      <td>49</td>\n",
              "      <td>10</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>SM</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.166667</td>\n",
              "      <td>0.49</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20299</th>\n",
              "      <td>6766</td>\n",
              "      <td>SM</td>\n",
              "      <td>53</td>\n",
              "      <td>17</td>\n",
              "      <td>30</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>0.883333</td>\n",
              "      <td>0.500000</td>\n",
              "      <td>0.17</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20300</th>\n",
              "      <td>6766</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>128</td>\n",
              "      <td>16</td>\n",
              "      <td>30</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>2.133333</td>\n",
              "      <td>0.500000</td>\n",
              "      <td>0.16</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20301</th>\n",
              "      <td>6767</td>\n",
              "      <td>CAR</td>\n",
              "      <td>100</td>\n",
              "      <td>80</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>1.666667</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.80</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20302</th>\n",
              "      <td>6767</td>\n",
              "      <td>SM</td>\n",
              "      <td>53</td>\n",
              "      <td>21</td>\n",
              "      <td>30</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>0.883333</td>\n",
              "      <td>0.500000</td>\n",
              "      <td>0.21</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20303</th>\n",
              "      <td>6767</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>108</td>\n",
              "      <td>13</td>\n",
              "      <td>60</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>939</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>7</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>TRAIN</td>\n",
              "      <td>1.800000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.13</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>20304 rows × 30 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "       custom_id    alt   TT  ...  multiple_luggage  regular_class  train_survey\n",
              "0              0    CAR  117  ...                 0              1             1\n",
              "1              0     SM   63  ...                 0              1             1\n",
              "2              0  TRAIN  112  ...                 0              1             1\n",
              "3              1    CAR  117  ...                 0              1             1\n",
              "4              1     SM   60  ...                 0              1             1\n",
              "...          ...    ...  ...  ...               ...            ...           ...\n",
              "20299       6766     SM   53  ...                 0              0             0\n",
              "20300       6766  TRAIN  128  ...                 0              0             0\n",
              "20301       6767    CAR  100  ...                 0              0             0\n",
              "20302       6767     SM   53  ...                 0              0             0\n",
              "20303       6767  TRAIN  108  ...                 0              0             0\n",
              "\n",
              "[20304 rows x 30 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 4
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "L26tLi_C2v8-"
      },
      "source": [
        "### Create model specification"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dubKfhWYpam7"
      },
      "source": [
        "By operating the dataframe columns, highly-flexible utility specifications can be modeled in `xlogit`. As shown below, alternative specific constants or coefficients can be included in the specification by strategically creating new columns and setting their values depending on the alternative. This flexibility allows even the specification of one or multiple coefficients per alternative or group of alternatives."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "pA-zQcLcpcAF"
      },
      "source": [
        "# Create model specification\n",
        "# Alternative Specific Constants\n",
        "df['asc_train'] = np.ones(len(df))*(df['alt'] == 'TRAIN')\n",
        "df['asc_sm'] = np.ones(len(df))*(df['alt'] == 'SM')\n",
        "\n",
        "# Travel cost (One coefficient per alternative)\n",
        "df['cost_train'] = df['cost']*(df['alt'] == 'TRAIN')\n",
        "df['cost_sm'] = df['cost']*(df['alt'] == 'SM')\n",
        "df['cost_car'] = df['cost']*(df['alt'] == 'CAR')\n",
        "\n",
        "# Travel time (One coefficient for train and sm and other for car)\n",
        "df['time_train_sm'] = df['time']*((df['alt'] == 'TRAIN') | (df['alt'] == 'SM'))\n",
        "df['time_car'] = df['time']*(df['alt'] == 'CAR')\n",
        "\n",
        "# Headway (One coefficient per alternative, except for car)\n",
        "df['headway_train'] = df['headway']*(df['alt'] == 'TRAIN')\n",
        "df['headway_sm'] = df['headway']*(df['alt'] == 'SM')\n",
        "\n",
        "# Seat config (Coefficient only for swissmetro)\n",
        "df['seatconf_sm'] = df['SEATS']*(df['alt'] == 'SM')\n",
        "\n",
        "# Train Survey (Coefficient only for swissmetro)\n",
        "df['survey_train_sm'] = df['train_survey']* ((df['alt'] == 'TRAIN') | (df['alt'] == 'SM'))\n",
        "\n",
        "# Regular class (Coefficient only for swissmetro)\n",
        "df['regular_class_sm'] = df['regular_class']*(df['alt'] == 'TRAIN')\n",
        "\n",
        "# Luggage (Coefficient only for car)\n",
        "df['single_lug_car'] = df['single_luggage']*(df['alt'] == 'CAR')\n",
        "df['multip_lug_car'] = df['multiple_luggage']*(df['alt'] == 'CAR')"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "POyPLthmeX9Z"
      },
      "source": [
        "### Estimate model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pHpj44RxoShn"
      },
      "source": [
        "The swissmetro dataset contains unbalanced choice situations across individuals (i.e., some individuals do not have observations for all alternatives). The `avail` option enables estimation for such datasets. `avail` takes the values that indicate the availability of each alternative across individuals.\n",
        "Once the model specification is complete, the model is estimated as follows:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "qXvKwbEloaDm",
        "outputId": "d2ab4c53-4d50-4408-f935-133a8376c68e"
      },
      "source": [
        "from xlogit import MultinomialLogit\n",
        "\n",
        "varnames=['asc_train', 'asc_sm', 'time_train_sm', 'time_car', 'cost_train',\n",
        "          'cost_sm', 'cost_car', 'headway_train', 'headway_sm', 'seatconf_sm',\n",
        "          'survey_train_sm', 'regular_class_sm', 'single_lug_car',\n",
        "          'multip_lug_car']\n",
        "model = MultinomialLogit()\n",
        "model.fit(X=df[varnames],\n",
        "          y=df['CHOICE'],\n",
        "          varnames=varnames,\n",
        "          alts=df['alt'],\n",
        "          ids=df['custom_id'],\n",
        "          avail=df['AV'])\n",
        "model.summary()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Estimation time= 0.1 seconds\n",
            "---------------------------------------------------------------------------\n",
            "Coefficient              Estimate      Std.Err.         z-val         P>|z|\n",
            "---------------------------------------------------------------------------\n",
            "asc_train              -1.2929512     0.1237556   -10.4476139      2.43e-24 ***\n",
            "asc_sm                 -0.5026152     0.1032927    -4.8659312      5.87e-06 ***\n",
            "time_train_sm          -0.6990098     0.0396510   -17.6290608      8.13e-67 ***\n",
            "time_car               -0.7229887     0.0442625   -16.3340968      1.18e-57 ***\n",
            "cost_train             -0.5618773     0.0807075    -6.9618968      2.59e-11 ***\n",
            "cost_sm                -0.2816843     0.0417373    -6.7489902       1.1e-10 ***\n",
            "cost_car               -0.5139009     0.0970406    -5.2957304      6.66e-07 ***\n",
            "headway_train          -0.3143519     0.0505955    -6.2130370      3.49e-09 ***\n",
            "headway_sm             -0.3773753     0.1652542    -2.2836046        0.0589 .  \n",
            "seatconf_sm            -0.7824379     0.0758912   -10.3100010      9.91e-24 ***\n",
            "survey_train_sm         2.5424946     0.0921336    27.5957408      1.5e-157 ***\n",
            "regular_class_sm        0.5650259     0.0652226     8.6630441      4.93e-17 ***\n",
            "single_lug_car          0.4227658     0.0611684     6.9115077      3.66e-11 ***\n",
            "multip_lug_car          1.4141058     0.2373032     5.9590672      1.62e-08 ***\n",
            "---------------------------------------------------------------------------\n",
            "Significance:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n",
            "\n",
            "Log-Likelihood= -5159.258\n",
            "AIC= 10346.517\n",
            "BIC= 10441.996\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AEJIluSUqL9r"
      },
      "source": [
        "The estimates are identical to those provided by PyLogit (Brathwaite and Walker 2018) as shown in [this link](https://github.com/timothyb0912/pylogit/blob/master/examples/notebooks/Main%20PyLogit%20Example.ipynb) "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4Rur4sCUfMbi"
      },
      "source": [
        "## Fishing Dataset"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "REmsMhK8e41H"
      },
      "source": [
        "The following example illustrates the estimation of a Multinomial Logit model for choices of 1,182 individuals for sport fishing modes using `xlogit`. The goal is to analyze the market shares of four alternatives (i.e., beach, pier, boat, and charter) based on their cost and fish catch rate. [Cameron (2005)](http://cameron.econ.ucdavis.edu/mmabook/mma.html) provides additional details about this dataset. The following code illustrates how to use `xlogit` to estimate the model parameters."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CUDXAA26kOfK"
      },
      "source": [
        "### Read data"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JquXmr1xQo-C"
      },
      "source": [
        "The data to be analyzed can be imported to Python using any preferred method. In this example, the data in CSV format was imported using the popular `pandas` Python package. However, it is worth highlighting that `xlogit` does not depend on the `pandas` package, as `xlogit` can take any array-like structure as input. This represents an additional advantage because `xlogit` can be used with any preferred dataframe library, and not only with `pandas`."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 424
        },
        "id": "B5JFuzuIkIig",
        "outputId": "01b98073-5441-44fd-a656-6b5870582554"
      },
      "source": [
        "import pandas as pd\n",
        "df = pd.read_csv(\"https://raw.github.com/arteagac/xlogit/master/examples/data/fishing_long.csv\")\n",
        "df"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>alt</th>\n",
              "      <th>choice</th>\n",
              "      <th>income</th>\n",
              "      <th>price</th>\n",
              "      <th>catch</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>1</td>\n",
              "      <td>beach</td>\n",
              "      <td>0</td>\n",
              "      <td>7083.33170</td>\n",
              "      <td>157.930</td>\n",
              "      <td>0.0678</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>boat</td>\n",
              "      <td>0</td>\n",
              "      <td>7083.33170</td>\n",
              "      <td>157.930</td>\n",
              "      <td>0.2601</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>1</td>\n",
              "      <td>charter</td>\n",
              "      <td>1</td>\n",
              "      <td>7083.33170</td>\n",
              "      <td>182.930</td>\n",
              "      <td>0.5391</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>1</td>\n",
              "      <td>pier</td>\n",
              "      <td>0</td>\n",
              "      <td>7083.33170</td>\n",
              "      <td>157.930</td>\n",
              "      <td>0.0503</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>2</td>\n",
              "      <td>beach</td>\n",
              "      <td>0</td>\n",
              "      <td>1249.99980</td>\n",
              "      <td>15.114</td>\n",
              "      <td>0.1049</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4723</th>\n",
              "      <td>1181</td>\n",
              "      <td>pier</td>\n",
              "      <td>0</td>\n",
              "      <td>416.66668</td>\n",
              "      <td>36.636</td>\n",
              "      <td>0.4522</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4724</th>\n",
              "      <td>1182</td>\n",
              "      <td>beach</td>\n",
              "      <td>0</td>\n",
              "      <td>6250.00130</td>\n",
              "      <td>339.890</td>\n",
              "      <td>0.2537</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4725</th>\n",
              "      <td>1182</td>\n",
              "      <td>boat</td>\n",
              "      <td>1</td>\n",
              "      <td>6250.00130</td>\n",
              "      <td>235.436</td>\n",
              "      <td>0.6817</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4726</th>\n",
              "      <td>1182</td>\n",
              "      <td>charter</td>\n",
              "      <td>0</td>\n",
              "      <td>6250.00130</td>\n",
              "      <td>260.436</td>\n",
              "      <td>2.3014</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4727</th>\n",
              "      <td>1182</td>\n",
              "      <td>pier</td>\n",
              "      <td>0</td>\n",
              "      <td>6250.00130</td>\n",
              "      <td>339.890</td>\n",
              "      <td>0.1498</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>4728 rows × 6 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "        id      alt  choice      income    price   catch\n",
              "0        1    beach       0  7083.33170  157.930  0.0678\n",
              "1        1     boat       0  7083.33170  157.930  0.2601\n",
              "2        1  charter       1  7083.33170  182.930  0.5391\n",
              "3        1     pier       0  7083.33170  157.930  0.0503\n",
              "4        2    beach       0  1249.99980   15.114  0.1049\n",
              "...    ...      ...     ...         ...      ...     ...\n",
              "4723  1181     pier       0   416.66668   36.636  0.4522\n",
              "4724  1182    beach       0  6250.00130  339.890  0.2537\n",
              "4725  1182     boat       1  6250.00130  235.436  0.6817\n",
              "4726  1182  charter       0  6250.00130  260.436  2.3014\n",
              "4727  1182     pier       0  6250.00130  339.890  0.1498\n",
              "\n",
              "[4728 rows x 6 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 7
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HOgCue_r_69x"
      },
      "source": [
        "### Estimate the model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Me6W8tAjQte6"
      },
      "source": [
        "Once the data is in the `Python` environment, `xlogit` can be used to fit the model, as shown below. The `MultinomialLogit` class is imported from `xlogit`, and its constructor is used to initialize a new model. The `fit` method estimates the model using the input data and estimation criteria provided as arguments to the method's call. The arguments of the `fit` methods are described in [`xlogit`'s documentation](https://https://xlogit.readthedocs.io/en/latest/api/).\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "JnYczrXDksg5",
        "outputId": "ef8a7f6e-9467-4dbc-e431-2c90d4127a16"
      },
      "source": [
        "from xlogit import MultinomialLogit\n",
        "\n",
        "varnames = ['income','price', 'catch']\n",
        "model = MultinomialLogit()\n",
        "model.fit(X=df[varnames],\n",
        "          y=df['choice'],\n",
        "          varnames=varnames,\n",
        "          isvars=['income'],\n",
        "          ids=df['id'],\n",
        "          alts=df['alt'],\n",
        "          fit_intercept=True)\n",
        "model.summary()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Estimation time= 0.0 seconds\n",
            "---------------------------------------------------------------------------\n",
            "Coefficient              Estimate      Std.Err.         z-val         P>|z|\n",
            "---------------------------------------------------------------------------\n",
            "_intercept.boat         0.5273413     0.2017396     2.6139700        0.0264 *  \n",
            "_intercept.charter      1.6943827     0.2096186     8.0831712       1.2e-14 ***\n",
            "_intercept.pier         0.7779899     0.2051062     3.7931076      0.000622 ***\n",
            "income.boat             0.0000894     0.0000473     1.8906643         0.134    \n",
            "income.charter         -0.0000333     0.0000480    -0.6936462         0.627    \n",
            "income.pier            -0.0001276     0.0000467    -2.7335425        0.0192 *  \n",
            "price                  -0.0251161     0.0015240   -16.4800784      5.89e-54 ***\n",
            "catch                   0.3578374     0.0985344     3.6315992       0.00113 ** \n",
            "---------------------------------------------------------------------------\n",
            "Significance:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n",
            "\n",
            "Log-Likelihood= -1215.138\n",
            "AIC= 2446.275\n",
            "BIC= 2486.875\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "22Ly9IcJQPMQ"
      },
      "source": [
        "## Heating Dataset"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HVA-R5U0T9m-"
      },
      "source": [
        "For this example, we use the Heating dataset from R's mlogit package (Croissant 2020), which contains choice of heating systems in California house. The dataset contains 90 observations, with 8 explanatory variables and more information can be found in https://cran.r-project.org/web/packages/mlogit/vignettes/e1mlogit.html."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zTJ27gpReSAd"
      },
      "source": [
        "### Read data"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "wmIl7b0iSse5",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 424
        },
        "outputId": "a4558c22-484a-4c45-e3d1-4eeb4fad0655"
      },
      "source": [
        "import pandas as pd\n",
        "import numpy as np\n",
        "\n",
        "df_wide = pd.read_csv(\"https://raw.github.com/arteagac/xlogit/master/examples/data/heating_wide.csv\")\n",
        "df_wide"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>idcase</th>\n",
              "      <th>depvar</th>\n",
              "      <th>ic.gc</th>\n",
              "      <th>ic.gr</th>\n",
              "      <th>ic.ec</th>\n",
              "      <th>ic.er</th>\n",
              "      <th>ic.hp</th>\n",
              "      <th>oc.gc</th>\n",
              "      <th>oc.gr</th>\n",
              "      <th>oc.ec</th>\n",
              "      <th>oc.er</th>\n",
              "      <th>oc.hp</th>\n",
              "      <th>income</th>\n",
              "      <th>agehed</th>\n",
              "      <th>rooms</th>\n",
              "      <th>region</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>1</td>\n",
              "      <td>gc</td>\n",
              "      <td>866.00</td>\n",
              "      <td>962.64</td>\n",
              "      <td>859.90</td>\n",
              "      <td>995.76</td>\n",
              "      <td>1135.50</td>\n",
              "      <td>199.69</td>\n",
              "      <td>151.72</td>\n",
              "      <td>553.34</td>\n",
              "      <td>505.60</td>\n",
              "      <td>237.88</td>\n",
              "      <td>7</td>\n",
              "      <td>25</td>\n",
              "      <td>6</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>2</td>\n",
              "      <td>gc</td>\n",
              "      <td>727.93</td>\n",
              "      <td>758.89</td>\n",
              "      <td>796.82</td>\n",
              "      <td>894.69</td>\n",
              "      <td>968.90</td>\n",
              "      <td>168.66</td>\n",
              "      <td>168.66</td>\n",
              "      <td>520.24</td>\n",
              "      <td>486.49</td>\n",
              "      <td>199.19</td>\n",
              "      <td>5</td>\n",
              "      <td>60</td>\n",
              "      <td>5</td>\n",
              "      <td>scostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>3</td>\n",
              "      <td>gc</td>\n",
              "      <td>599.48</td>\n",
              "      <td>783.05</td>\n",
              "      <td>719.86</td>\n",
              "      <td>900.11</td>\n",
              "      <td>1048.30</td>\n",
              "      <td>165.58</td>\n",
              "      <td>137.80</td>\n",
              "      <td>439.06</td>\n",
              "      <td>404.74</td>\n",
              "      <td>171.47</td>\n",
              "      <td>4</td>\n",
              "      <td>65</td>\n",
              "      <td>2</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>4</td>\n",
              "      <td>er</td>\n",
              "      <td>835.17</td>\n",
              "      <td>793.06</td>\n",
              "      <td>761.25</td>\n",
              "      <td>831.04</td>\n",
              "      <td>1048.70</td>\n",
              "      <td>180.88</td>\n",
              "      <td>147.14</td>\n",
              "      <td>483.00</td>\n",
              "      <td>425.22</td>\n",
              "      <td>222.95</td>\n",
              "      <td>2</td>\n",
              "      <td>50</td>\n",
              "      <td>4</td>\n",
              "      <td>scostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>5</td>\n",
              "      <td>er</td>\n",
              "      <td>755.59</td>\n",
              "      <td>846.29</td>\n",
              "      <td>858.86</td>\n",
              "      <td>985.64</td>\n",
              "      <td>883.05</td>\n",
              "      <td>174.91</td>\n",
              "      <td>138.90</td>\n",
              "      <td>404.41</td>\n",
              "      <td>389.52</td>\n",
              "      <td>178.49</td>\n",
              "      <td>2</td>\n",
              "      <td>25</td>\n",
              "      <td>6</td>\n",
              "      <td>valley</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>895</th>\n",
              "      <td>896</td>\n",
              "      <td>gc</td>\n",
              "      <td>766.39</td>\n",
              "      <td>877.71</td>\n",
              "      <td>751.59</td>\n",
              "      <td>869.78</td>\n",
              "      <td>942.70</td>\n",
              "      <td>142.61</td>\n",
              "      <td>136.21</td>\n",
              "      <td>474.48</td>\n",
              "      <td>420.65</td>\n",
              "      <td>203.00</td>\n",
              "      <td>6</td>\n",
              "      <td>20</td>\n",
              "      <td>4</td>\n",
              "      <td>mountn</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>896</th>\n",
              "      <td>897</td>\n",
              "      <td>gc</td>\n",
              "      <td>1128.50</td>\n",
              "      <td>1167.80</td>\n",
              "      <td>1047.60</td>\n",
              "      <td>1292.60</td>\n",
              "      <td>1297.10</td>\n",
              "      <td>207.40</td>\n",
              "      <td>213.77</td>\n",
              "      <td>705.36</td>\n",
              "      <td>551.61</td>\n",
              "      <td>243.76</td>\n",
              "      <td>7</td>\n",
              "      <td>45</td>\n",
              "      <td>7</td>\n",
              "      <td>scostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>897</th>\n",
              "      <td>898</td>\n",
              "      <td>gc</td>\n",
              "      <td>787.10</td>\n",
              "      <td>1055.20</td>\n",
              "      <td>842.79</td>\n",
              "      <td>1041.30</td>\n",
              "      <td>1064.80</td>\n",
              "      <td>175.05</td>\n",
              "      <td>141.63</td>\n",
              "      <td>478.86</td>\n",
              "      <td>448.61</td>\n",
              "      <td>254.51</td>\n",
              "      <td>5</td>\n",
              "      <td>60</td>\n",
              "      <td>7</td>\n",
              "      <td>scostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>898</th>\n",
              "      <td>899</td>\n",
              "      <td>gc</td>\n",
              "      <td>860.56</td>\n",
              "      <td>1081.30</td>\n",
              "      <td>799.76</td>\n",
              "      <td>1123.20</td>\n",
              "      <td>1218.20</td>\n",
              "      <td>211.04</td>\n",
              "      <td>151.31</td>\n",
              "      <td>495.20</td>\n",
              "      <td>401.56</td>\n",
              "      <td>246.48</td>\n",
              "      <td>5</td>\n",
              "      <td>50</td>\n",
              "      <td>6</td>\n",
              "      <td>scostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>899</th>\n",
              "      <td>900</td>\n",
              "      <td>gc</td>\n",
              "      <td>893.94</td>\n",
              "      <td>1119.90</td>\n",
              "      <td>967.88</td>\n",
              "      <td>1091.70</td>\n",
              "      <td>1387.50</td>\n",
              "      <td>175.80</td>\n",
              "      <td>180.11</td>\n",
              "      <td>518.68</td>\n",
              "      <td>458.53</td>\n",
              "      <td>245.13</td>\n",
              "      <td>2</td>\n",
              "      <td>65</td>\n",
              "      <td>4</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>900 rows × 16 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "     idcase depvar    ic.gc    ic.gr  ...  income  agehed  rooms  region\n",
              "0         1     gc   866.00   962.64  ...       7      25      6  ncostl\n",
              "1         2     gc   727.93   758.89  ...       5      60      5  scostl\n",
              "2         3     gc   599.48   783.05  ...       4      65      2  ncostl\n",
              "3         4     er   835.17   793.06  ...       2      50      4  scostl\n",
              "4         5     er   755.59   846.29  ...       2      25      6  valley\n",
              "..      ...    ...      ...      ...  ...     ...     ...    ...     ...\n",
              "895     896     gc   766.39   877.71  ...       6      20      4  mountn\n",
              "896     897     gc  1128.50  1167.80  ...       7      45      7  scostl\n",
              "897     898     gc   787.10  1055.20  ...       5      60      7  scostl\n",
              "898     899     gc   860.56  1081.30  ...       5      50      6  scostl\n",
              "899     900     gc   893.94  1119.90  ...       2      65      4  ncostl\n",
              "\n",
              "[900 rows x 16 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 9
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bma2TRUleTyy"
      },
      "source": [
        "### Reshape data"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "co24t943VCqW"
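      },
      "source": [
        "The dataset is distributed in wide format, with one row per choice situation (`idcase`) and one column per alternative-specific attribute (for example `ic.gc` and `oc.gc`). Since `xlogit` requires long format, with one row per choice situation and alternative, we convert it using the `wide_to_long` utility as shown below:"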
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 424
        },
        "id": "Dd4hkXnQS7u-",
        "outputId": "8c96665d-c357-4299-9e24-453674a2c105"
      },
      "source": [
        "from xlogit.utils import wide_to_long\n",
        "\n",
        "# Stack the alternative-specific columns (ic.*, oc.*) into one row per (idcase, alt)\n",
        "df = wide_to_long(df_wide, id_col='idcase', alt_name='alt', varying=['ic', 'oc'],\n",
        "                  alt_list=['ec', 'er', 'gc', 'gr', 'hp'], sep='.', empty_val=0)\n",
        "df"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>idcase</th>\n",
              "      <th>alt</th>\n",
              "      <th>ic</th>\n",
              "      <th>oc</th>\n",
              "      <th>depvar</th>\n",
              "      <th>income</th>\n",
              "      <th>agehed</th>\n",
              "      <th>rooms</th>\n",
              "      <th>region</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>1</td>\n",
              "      <td>ec</td>\n",
              "      <td>859.90</td>\n",
              "      <td>553.34</td>\n",
              "      <td>gc</td>\n",
              "      <td>7</td>\n",
              "      <td>25</td>\n",
              "      <td>6</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>er</td>\n",
              "      <td>995.76</td>\n",
              "      <td>505.60</td>\n",
              "      <td>gc</td>\n",
              "      <td>7</td>\n",
              "      <td>25</td>\n",
              "      <td>6</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>1</td>\n",
              "      <td>gc</td>\n",
              "      <td>866.00</td>\n",
              "      <td>199.69</td>\n",
              "      <td>gc</td>\n",
              "      <td>7</td>\n",
              "      <td>25</td>\n",
              "      <td>6</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>1</td>\n",
              "      <td>gr</td>\n",
              "      <td>962.64</td>\n",
              "      <td>151.72</td>\n",
              "      <td>gc</td>\n",
              "      <td>7</td>\n",
              "      <td>25</td>\n",
              "      <td>6</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>1</td>\n",
              "      <td>hp</td>\n",
              "      <td>1135.50</td>\n",
              "      <td>237.88</td>\n",
              "      <td>gc</td>\n",
              "      <td>7</td>\n",
              "      <td>25</td>\n",
              "      <td>6</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4495</th>\n",
              "      <td>900</td>\n",
              "      <td>ec</td>\n",
              "      <td>967.88</td>\n",
              "      <td>518.68</td>\n",
              "      <td>gc</td>\n",
              "      <td>2</td>\n",
              "      <td>65</td>\n",
              "      <td>4</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4496</th>\n",
              "      <td>900</td>\n",
              "      <td>er</td>\n",
              "      <td>1091.70</td>\n",
              "      <td>458.53</td>\n",
              "      <td>gc</td>\n",
              "      <td>2</td>\n",
              "      <td>65</td>\n",
              "      <td>4</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4497</th>\n",
              "      <td>900</td>\n",
              "      <td>gc</td>\n",
              "      <td>893.94</td>\n",
              "      <td>175.80</td>\n",
              "      <td>gc</td>\n",
              "      <td>2</td>\n",
              "      <td>65</td>\n",
              "      <td>4</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4498</th>\n",
              "      <td>900</td>\n",
              "      <td>gr</td>\n",
              "      <td>1119.90</td>\n",
              "      <td>180.11</td>\n",
              "      <td>gc</td>\n",
              "      <td>2</td>\n",
              "      <td>65</td>\n",
              "      <td>4</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4499</th>\n",
              "      <td>900</td>\n",
              "      <td>hp</td>\n",
              "      <td>1387.50</td>\n",
              "      <td>245.13</td>\n",
              "      <td>gc</td>\n",
              "      <td>2</td>\n",
              "      <td>65</td>\n",
              "      <td>4</td>\n",
              "      <td>ncostl</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>4500 rows × 9 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "      idcase alt       ic      oc depvar  income  agehed  rooms  region\n",
              "0          1  ec   859.90  553.34     gc       7      25      6  ncostl\n",
              "1          1  er   995.76  505.60     gc       7      25      6  ncostl\n",
              "2          1  gc   866.00  199.69     gc       7      25      6  ncostl\n",
              "3          1  gr   962.64  151.72     gc       7      25      6  ncostl\n",
              "4          1  hp  1135.50  237.88     gc       7      25      6  ncostl\n",
              "...      ...  ..      ...     ...    ...     ...     ...    ...     ...\n",
              "4495     900  ec   967.88  518.68     gc       2      65      4  ncostl\n",
              "4496     900  er  1091.70  458.53     gc       2      65      4  ncostl\n",
              "4497     900  gc   893.94  175.80     gc       2      65      4  ncostl\n",
              "4498     900  gr  1119.90  180.11     gc       2      65      4  ncostl\n",
              "4499     900  hp  1387.50  245.13     gc       2      65      4  ncostl\n",
              "\n",
              "[4500 rows x 9 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 10
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qV9gAmnHed2f"
      },
      "source": [
        "### Estimate the model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8ozNDq8CVmwy"
      },
      "source": [
        "We now import `MultinomialLogit` from `xlogit` and fit a model that uses the installation cost (`ic`) and operating cost (`oc`) as explanatory variables:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "JJJL5oeq1rRh",
        "outputId": "7e03a602-6df0-4c30-c5bb-abd72c3fded7"
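      },
      "source": [
        "from xlogit import MultinomialLogit\n",
        "\n",
        "varnames = ['ic', 'oc']\n",
        "model = MultinomialLogit()\n",
        "# y holds the chosen alternative; alts and ids label each row's alternative and choice situation\n",
        "model.fit(X=df[varnames],\n",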
        "          y=df['depvar'],\n",
        "          varnames=varnames,\n",
        "          alts=df['alt'],\n",
        "          ids=df['idcase'])\n",
        "model.summary()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Estimation time= 0.0 seconds\n",
            "---------------------------------------------------------------------------\n",
            "Coefficient              Estimate      Std.Err.         z-val         P>|z|\n",
            "---------------------------------------------------------------------------\n",
            "ic                     -0.0062318     0.0003516   -17.7222802      2.16e-59 ***\n",
            "oc                     -0.0045800     0.0003208   -14.2767999      9.14e-41 ***\n",
            "---------------------------------------------------------------------------\n",
            "Significance:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n",
            "\n",
            "Log-Likelihood= -1095.237\n",
            "AIC= 2194.474\n",
            "BIC= 2204.079\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "I1kynb9l4KUN"
      },
      "source": [
        "These estimates match those reported for the same model in the R `mlogit` package vignette: https://cran.r-project.org/web/packages/mlogit/vignettes/e1mlogit.html"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Aw_ONyTnigBI"
      },
      "source": [
        "## References"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KRn6BBFVihVO"
      },
      "source": [
        "- Bierlaire, M. (2018). PandasBiogeme: a short introduction. EPFL (Transport and Mobility Laboratory, ENAC).\n",
        "\n",
        "- Brathwaite, T., & Walker, J. L. (2018). Asymmetric, closed-form, finite-parameter models of multinomial choice. Journal of Choice Modelling, 29, 78–112.\n",
        "\n",
        "- Cameron, A. C., & Trivedi, P. K. (2005). Microeconometrics: Methods and applications. Cambridge University Press.\n",
        "\n",
        "- Croissant, Y. (2020). Estimation of Random Utility Models in R: The mlogit Package. Journal of Statistical Software, 95(1), 1–41.\n",
        "\n",
        "- Washington, S., Karlaftis, M., Mannering, F., & Anastasopoulos, P. (2020). Statistical and econometric methods for transportation data analysis. Chapman and Hall/CRC."
      ]
    }
  ]
}