MixedLogit

Implements all the logic for mixed logit models.

class xlogit.mixed_logit.MixedLogit

Class for estimation of Mixed Logit Models.

coeff_

Estimated coefficients

Type

numpy array, shape (n_variables + n_randvars, )

coeff_names

Names of the estimated coefficients

Type

numpy array, shape (n_variables + n_randvars, )

stderr

Standard errors of the estimated coefficients

Type

numpy array, shape (n_variables + n_randvars, )

zvalues

Z-values for t-distribution of the estimated coefficients

Type

numpy array, shape (n_variables + n_randvars, )

pvalues

P-values of the estimated coefficients

Type

numpy array, shape (n_variables + n_randvars, )

loglikelihood

Log-likelihood at the end of the estimation

Type

float

convergence

Whether convergence was reached during estimation

Type

bool

total_iter

Total number of iterations executed during estimation

Type

int

estim_time_sec

Estimation time in seconds

Type

float

sample_size

Number of samples used for estimation

Type

int

aic

Akaike information criterion of the estimated model

Type

float

bic

Bayesian information criterion of the estimated model

Type

float
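
As a hedged illustration of how the summary attributes above relate to one another, the sketch below recomputes z-values, p-values, AIC, and BIC from made-up coefficients and standard errors. It uses the textbook definitions (AIC = 2k - 2LL, BIC = k ln(n) - 2LL) and a normal approximation for the p-values, which the t-distribution approaches for large samples. The helper names and the numbers are assumptions for illustration, not xlogit's internal code.

```python
import math


def normal_cdf(x):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def summary_stats(coeff, stderr, loglikelihood, sample_size):
    """Recompute z-values, p-values, AIC and BIC from fitted quantities."""
    # z-value: coefficient divided by its standard error
    zvalues = [c / s for c, s in zip(coeff, stderr)]
    # Two-sided p-values under a normal approximation (assumption)
    pvalues = [2.0 * (1.0 - normal_cdf(abs(z))) for z in zvalues]
    k = len(coeff)  # number of estimated coefficients
    aic = 2 * k - 2 * loglikelihood
    bic = k * math.log(sample_size) - 2 * loglikelihood
    return zvalues, pvalues, aic, bic


# Made-up values standing in for coeff_, stderr, loglikelihood, sample_size
coeff = [-1.2, 0.8, 0.5]
stderr = [0.3, 0.4, 0.25]
z, p, aic, bic = summary_stats(coeff, stderr,
                               loglikelihood=-1300.0, sample_size=4000)
print(z, p, aic, bic)
```

Lower AIC and BIC values indicate a better trade-off between fit and number of coefficients when comparing models estimated on the same data.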

fit(X, y, varnames, alts, ids, randvars, isvars=None, weights=None, avail=None, panels=None, base_alt=None, fit_intercept=False, init_coeff=None, maxiter=2000, random_state=None, n_draws=1000, halton=True, verbose=1, batch_size=None, halton_opts=None, tol_opts=None, robust=False, num_hess=False, scale_factor=None, optim_method='BFGS', mnl_init=True, addit=None, skip_std_errs=False)

Fit Mixed Logit models.

Parameters
  • X (array-like, shape (n_samples*n_alts, n_variables)) – Input data for explanatory variables in long format

  • y (array-like, shape (n_samples*n_alts,)) – Chosen alternatives or one-hot encoded representation of the choices

  • varnames (list-like, shape (n_variables,)) – Names of explanatory variables that must match the number and order of columns in X

  • alts (array-like, shape (n_samples*n_alts,)) – Alternative values in long format

  • ids (array-like, shape (n_samples*n_alts,)) – Identifiers for the samples in long format.

  • randvars (dict) – Names (keys) and mixing distributions (values) of variables that have random parameters as coefficients. Possible mixing distributions are: 'n': normal, 'ln': lognormal, 'u': uniform, 't': triangular, 'tn': truncated normal

  • isvars (list-like) – Names of individual-specific variables in varnames

  • weights (array-like, shape (n_samples,), default=None) – Sample weights in long format.

  • avail (array-like, shape (n_samples*n_alts,), default=None) – Availability of alternatives for the choice situations. One when available or zero otherwise.

  • panels (array-like, shape (n_samples*n_alts,), default=None) – Identifiers in long format to create panels in combination with ids

  • base_alt (int, float or str, default=None) – Base alternative

  • fit_intercept (bool, default=False) – Whether to include an intercept in the model.

  • init_coeff (numpy array, shape (n_variables,), default=None) – Initial coefficients for estimation.

  • maxiter (int, default=2000) – Maximum number of iterations

  • random_state (int, default=None) – Random seed for numpy random generator

  • n_draws (int, default=1000) – Number of random draws to approximate the mixing distributions of the random coefficients

  • halton (bool, default=True) – Whether the estimation uses Halton draws.

  • halton_opts (dict, default=None) –

    Options for generation of Halton draws. The dictionary accepts the following options (keys):

    shuffle : bool, default=False

    Whether the Halton draws should be shuffled

    drop : int, default=100

    Number of initial Halton draws to discard to minimize correlations between Halton sequences

    primes : list

    List of primes to be used as base for generation of Halton sequences.

  • tol_opts (dict, default=None) –

    Options for tolerance of optimization routine. The dictionary accepts the following options (keys):

    ftol : float, default=1e-10

    Tolerance for objective function (log-likelihood)

    gtol : float, default=1e-5

    Tolerance for gradient function.

  • verbose (int, default=1) – Verbosity of messages to show during estimation. 0: No messages, 1: Some messages, 2: All messages

  • batch_size (int, default=None) – Size of batches used to avoid GPU memory overflow.

  • scale_factor (array-like, shape (n_samples*n_alts, ), default=None) – Scaling variable used for non-linear models. For WTP models, this is usually the negative of the price variable.

  • addit (array-like, shape (n_samples*n_alts, ), default=None) – Additive term to model coefficients kept fixed during estimation.

  • optim_method (str, default='BFGS') – Optimization method to use for model estimation. It can be BFGS or L-BFGS-B. For non-linear (WTP-like) models, L-BFGS-B is used by default.

  • robust (bool, default=False) – Whether robust standard errors should be computed

  • num_hess (bool, default=False) – Whether numerical hessian should be used for estimation of standard errors

  • skip_std_errs (bool, default=False) – Whether estimation of standard errors should be skipped

  • mnl_init (bool, default=True) – Whether to initialize coefficients using estimates from a multinomial logit
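
To make the halton_opts keys more concrete, here is a minimal sketch of a one-dimensional Halton sequence with a `drop` of initial points and an optional shuffle. The function `halton_sequence`, its `seed` argument, and the choice to start indexing at 1 are assumptions for illustration; this is not xlogit's internal generator.

```python
import random


def halton_sequence(n_draws, base, drop=100, shuffle=False, seed=None):
    """Return n_draws Halton points for a prime base, skipping the first `drop`."""
    draws = []
    for index in range(drop + 1, drop + n_draws + 1):
        # Radical inverse: mirror the base-`base` digits of index
        # around the radix point.
        value, fraction, i = 0.0, 1.0 / base, index
        while i > 0:
            value += (i % base) * fraction
            i //= base
            fraction /= base
        draws.append(value)
    if shuffle:
        # Shuffling reorders the draws without changing their values
        random.Random(seed).shuffle(draws)
    return draws


# One sequence per random coefficient, each with a different prime base
first_points_base2 = halton_sequence(3, base=2, drop=0)
draws_base3 = halton_sequence(1000, base=3, drop=100)
print(first_points_base2)
```

Using a different prime per random coefficient and discarding the initial points are the two levers that `primes` and `drop` expose.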

Returns

Return type

None.
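
The long-format shapes listed in the parameters (n_samples*n_alts rows) can be easier to see with a small example. The sketch below builds X, y, alts, and ids from two made-up choice situations; the record layout and variable names are hypothetical. The commented call at the end only uses parameters documented above and is not executed here.

```python
# Hypothetical wide-format records: one entry per choice situation
wide = [
    {"id": 1, "choice": "bus",
     "price": {"car": 5.0, "bus": 2.0, "train": 3.0},
     "time": {"car": 20.0, "bus": 35.0, "train": 25.0}},
    {"id": 2, "choice": "car",
     "price": {"car": 4.0, "bus": 2.5, "train": 3.5},
     "time": {"car": 15.0, "bus": 40.0, "train": 30.0}},
]
alternatives = ["car", "bus", "train"]
varnames = ["price", "time"]  # must match the column order of X

# Long format: one row per (choice situation, alternative) pair
X, y, alts, ids = [], [], [], []
for row in wide:
    for alt in alternatives:
        X.append([row[name][alt] for name in varnames])
        y.append(1 if row["choice"] == alt else 0)  # one-hot encoded choice
        alts.append(alt)
        ids.append(row["id"])

print(len(X), "rows =", len(wide), "samples x", len(alternatives), "alternatives")

# With xlogit installed, arrays like these would be passed along these lines
# (not run here; real data would need far more than two samples):
# from xlogit import MixedLogit
# model = MixedLogit()
# model.fit(X=X, y=y, varnames=varnames, alts=alts, ids=ids,
#           randvars={"time": "n"}, n_draws=1000)
# model.summary()
```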

predict(X, varnames, alts, ids, isvars=None, weights=None, avail=None, panels=None, random_state=None, n_draws=1000, halton=True, verbose=1, batch_size=None, return_proba=False, return_freq=False, halton_opts=None, scale_factor=None, addit=None)

Predict chosen alternatives.

Parameters
  • X (array-like, shape (n_samples*n_alts, n_variables)) – Input data for explanatory variables in long format

  • varnames (list, shape (n_variables,)) – Names of explanatory variables that must match the number and order of columns in X

  • alts (array-like, shape (n_samples*n_alts,)) – Alternative values in long format

  • ids (array-like, shape (n_samples*n_alts,)) – Identifiers for the samples in long format.

  • isvars (list) – Names of individual-specific variables in varnames

  • weights (array-like, shape (n_samples,), default=None) – Sample weights in long format.

  • avail (array-like, shape (n_samples*n_alts,), default=None) – Availability of alternatives for the samples. One when available or zero otherwise.

  • panels (array-like, shape (n_samples*n_alts,), default=None) – Identifiers in long format to create panels in combination with ids

  • random_state (int, default=None) – Random seed for numpy random generator

  • n_draws (int, default=1000) – Number of random draws to approximate the mixing distributions of the random coefficients

  • halton (bool, default=True) – Whether the prediction uses Halton draws.

  • halton_opts (dict, default=None) –

    Options for generation of Halton draws. The dictionary accepts the following options (keys):

    shuffle : bool, default=False

    Whether the Halton draws should be shuffled

    drop : int, default=100

    Number of initial Halton draws to discard to minimize correlations between Halton sequences

    primes : list

    List of primes to be used as base for generation of Halton sequences.

  • verbose (int, default=1) – Verbosity of messages to show during estimation. 0: No messages, 1: Some messages, 2: All messages

  • batch_size (int, default=None) – Size of batches used to avoid GPU memory overflow.

  • scale_factor (array-like, shape (n_samples*n_alts, ), default=None) – Scaling variable used for non-linear WTP-like models. This is usually the negative of the price variable.

  • addit (array-like, shape (n_samples*n_alts, ), default=None) – Additive term to model coefficients kept fixed during estimation.

  • return_proba (bool, default=False) – If True, also return the choice probabilities

  • return_freq (bool, default=False) – If True, also return the frequency of the chosen alternatives

Returns

  • choices (array-like, shape (n_samples, )) – Chosen alternative for every sample in the dataset.

  • proba (array-like, shape (n_samples, n_alts), optional) – Choice probabilities for each sample in the dataset. The alternatives are ordered (in the columns) as they appear in self.alternatives. Only provided if return_proba is True.

  • freq (dict, optional) – Choice frequency for each alternative. Only provided if return_freq is True.
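
The relationship between the three return values can be sketched as follows: given a probability matrix like proba, the chosen alternative for each sample is the one with the highest probability, and freq summarizes how often each alternative is chosen. The helper `choices_and_freq`, the made-up probabilities, and the use of relative shares for freq are assumptions for illustration, not xlogit's implementation.

```python
from collections import Counter


def choices_and_freq(proba, alternatives):
    """Pick the most probable alternative per sample and tabulate shares."""
    choices = []
    for row in proba:
        # Column index of the highest probability in this sample
        best = max(range(len(row)), key=row.__getitem__)
        choices.append(alternatives[best])
    counts = Counter(choices)
    # Share of samples assigned to each alternative (assumed definition)
    freq = {alt: counts[alt] / len(choices) for alt in alternatives}
    return choices, freq


# Made-up probabilities, shape (n_samples, n_alts); columns follow `alternatives`
alternatives = ["car", "bus", "train"]
proba = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.5, 0.4, 0.1],
]
choices, freq = choices_and_freq(proba, alternatives)
print(choices, freq)
```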

summary()

Show estimation results in console.

static check_if_gpu_available()

Check if GPU processing is available by running a quick estimation.

Returns

True if GPU processing is available, False otherwise.

Return type

bool

xlogit.mixed_logit.batches_idx(batch_size, n_samples)
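
Only the signature is given for this function. As a hedged illustration of the batching idea behind the batch_size parameter, the hypothetical helper below splits n_samples into consecutive index ranges of at most batch_size samples. Its name and return format are assumptions; the library's actual behavior may differ.

```python
def batch_ranges(batch_size, n_samples):
    """Return (start, end) index pairs covering range(n_samples) in batches."""
    return [(start, min(start + batch_size, n_samples))
            for start in range(0, n_samples, batch_size)]


# Ten samples processed in batches of at most four
ranges = batch_ranges(4, 10)
print(ranges)
```

Processing one range at a time keeps only a slice of the draws in memory, which is the motivation the batch_size description gives for GPU estimation.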