MixedLogit¶

Implements all the logic for mixed logit models.

class xlogit.mixed_logit.MixedLogit¶

Class for estimation of Mixed Logit Models.

coeff_¶

Estimated coefficients

Type: numpy array, shape (n_variables + n_randvars, )

coeff_names¶

Names of the estimated coefficients

Type: numpy array, shape (n_variables + n_randvars, )

stderr¶

Standard errors of the estimated coefficients

Type: numpy array, shape (n_variables + n_randvars, )

zvalues¶

Z-values (estimated coefficient divided by its standard error) of the estimated coefficients

Type: numpy array, shape (n_variables + n_randvars, )

pvalues¶

P-values of the estimated coefficients

Type: numpy array, shape (n_variables + n_randvars, )

loglikelihood¶

Log-likelihood at the end of the estimation

Type: float

convergence¶

Whether convergence was reached during estimation

Type: bool

total_iter¶

Total number of iterations executed during estimation

Type: int

estim_time_sec¶

Estimation time in seconds

Type: float

sample_size¶

Number of samples used for estimation

Type: int

aic¶

Akaike information criterion of the estimated model

Type: float

bic¶

Bayesian information criterion of the estimated model

Type: float
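
The derived attributes above (zvalues, pvalues, aic, bic) follow from the coefficients, their standard errors, and the log-likelihood. The sketch below recomputes them from hypothetical numbers using the textbook definitions. It is an illustration only: the p-values here use a standard normal approximation, whereas the library may use a t-distribution, and the helper names are not part of xlogit.

```python
import math

def z_and_p_values(coeffs, stderrs):
    """Return (z-values, two-sided p-values) under a normal approximation."""
    zvalues = [b / se for b, se in zip(coeffs, stderrs)]
    # Two-sided tail of a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    pvalues = [math.erfc(abs(z) / math.sqrt(2)) for z in zvalues]
    return zvalues, pvalues

def aic_bic(loglikelihood, n_coeffs, sample_size):
    """Textbook AIC and BIC from a maximized log-likelihood."""
    aic = 2 * n_coeffs - 2 * loglikelihood
    bic = math.log(sample_size) * n_coeffs - 2 * loglikelihood
    return aic, bic

# Hypothetical estimation results (not taken from a real model)
coeffs = [1.2, -0.5]
stderrs = [0.4, 0.25]
zvalues, pvalues = z_and_p_values(coeffs, stderrs)
aic, bic = aic_bic(loglikelihood=-100.0, n_coeffs=2, sample_size=50)
```

Both criteria penalize the number of estimated coefficients, so lower values indicate a better trade-off between fit and complexity.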

fit(X, y, varnames, alts, ids, randvars, isvars=None, weights=None, avail=None, panels=None, base_alt=None, fit_intercept=False, init_coeff=None, maxiter=2000, random_state=None, n_draws=1000, halton=True, verbose=1, batch_size=None, halton_opts=None, tol_opts=None, robust=False, num_hess=False, scale_factor=None, optim_method='BFGS', mnl_init=True, addit=None, skip_std_errs=False)¶

Fit Mixed Logit models.

Parameters

X (array-like, shape (n_samples*n_alts, n_variables)) – Input data for explanatory variables in long format
y (array-like, shape (n_samples*n_alts,)) – Chosen alternatives or one-hot encoded representation of the choices
varnames (list-like, shape (n_variables,)) – Names of explanatory variables that must match the number and order of columns in X
alts (array-like, shape (n_samples*n_alts,)) – Alternative values in long format
ids (array-like, shape (n_samples*n_alts,)) – Identifiers for the samples in long format.
randvars (dict) – Names (keys) and mixing distributions (values) of variables that have random parameters as coefficients. Possible mixing distributions are: 'n': normal, 'ln': lognormal, 'u': uniform, 't': triangular, 'tn': truncated normal
isvars (list-like) – Names of individual-specific variables in varnames
weights (array-like, shape (n_samples,), default=None) – Sample weights in long format.
avail (array-like, shape (n_samples*n_alts,), default=None) – Availability of alternatives for the choice situations. One when available or zero otherwise.
panels (array-like, shape (n_samples*n_alts,), default=None) – Identifiers in long format to create panels in combination with ids
base_alt (int, float or str, default=None) – Base alternative
fit_intercept (bool, default=False) – Whether to include an intercept in the model.
init_coeff (numpy array, shape (n_variables,), default=None) – Initial coefficients for estimation.
maxiter (int, default=2000) – Maximum number of iterations
random_state (int, default=None) – Random seed for numpy random generator
n_draws (int, default=1000) – Number of random draws to approximate the mixing distributions of the random coefficients
halton (bool, default=True) – Whether the estimation uses Halton draws.
halton_opts (dict, default=None) – Options for generation of Halton draws. The dictionary accepts the following options (keys):

    shuffle (bool, default=False) – Whether the Halton draws should be shuffled

    drop (int, default=100) – Number of initial Halton draws to discard to minimize correlations between Halton sequences

    primes (list) – List of primes to be used as base for generation of Halton sequences.
tol_opts (dict, default=None) – Options for tolerance of optimization routine. The dictionary accepts the following options (keys):

    ftol (float, default=1e-10) – Tolerance for objective function (log-likelihood)

    gtol (float, default=1e-5) – Tolerance for gradient function.
verbose (int, default=1) – Verbosity of messages to show during estimation. 0: No messages, 1: Some messages, 2: All messages
batch_size (int, default=None) – Size of batches used to avoid GPU memory overflow.
scale_factor (array-like, shape (n_samples*n_alts, ), default=None) – Scaling variable used for non-linear models. For WTP models, this is usually the negative of the price variable.
addit (array-like, shape (n_samples*n_alts, ), default=None) – Additive term to model coefficients kept fixed during estimation.
optim_method (str, default='BFGS') – Optimization method to use for model estimation. It can be BFGS or L-BFGS-B. For non-linear (WTP-like) models, L-BFGS-B is used by default.
robust (bool, default=False) – Whether robust standard errors should be computed
num_hess (bool, default=False) – Whether numerical hessian should be used for estimation of standard errors
skip_std_errs (bool, default=False) – Whether estimation of standard errors should be skipped
mnl_init (bool, default=True) – Whether to initialize coefficients using estimates from a multinomial logit
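
The mixing distributions accepted by randvars can be pictured as transformations of a uniform draw. The sketch below is a minimal illustration under an assumed parameterization (a standardized draw scaled by a spread parameter and shifted by a mean), which the source does not confirm. The function name random_coefficient and its arguments are hypothetical, and the truncated normal ('tn') is left out because the source does not state its truncation rule.

```python
import math
from statistics import NormalDist

def random_coefficient(u, dist, mean, sd):
    """Map a uniform draw u in (0, 1) to one draw of a random coefficient.

    Assumption: beta = mean + sd * standardized_draw (exponentiated for 'ln').
    """
    if dist in ("n", "ln"):
        z = NormalDist().inv_cdf(u)          # standard normal draw
        beta = mean + sd * z
        return math.exp(beta) if dist == "ln" else beta
    if dist == "u":
        return mean + sd * (2.0 * u - 1.0)   # uniform on [mean - sd, mean + sd]
    if dist == "t":
        # Inverse CDF of the symmetric triangular distribution on [-1, 1]
        if u <= 0.5:
            t = math.sqrt(2.0 * u) - 1.0
        else:
            t = 1.0 - math.sqrt(2.0 * (1.0 - u))
        return mean + sd * t
    raise ValueError(f"unsupported distribution in this sketch: {dist!r}")

# One draw per distribution from the same uniform value
draws = {d: random_coefficient(0.75, d, mean=0.0, sd=1.0) for d in ("n", "ln", "u", "t")}
```

A lognormal coefficient is always positive, which is why it is a common choice when the sign of an effect is known in advance.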

Returns

None.
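
As a rough illustration of the long format that fit expects, the sketch below flattens hypothetical wide records into one row per (sample, alternative) pair and then passes the result to fit. The records, variable names ("price", "time"), and helper functions are invented for the example. The fit call uses only keyword arguments from the signature above, wraps the lists in NumPy arrays on the assumption that this is the safest array-like input, and is kept inside a function so the data preparation runs without xlogit installed.

```python
def to_long_format(records, alternatives):
    """Flatten records into long-format X, y (one-hot), alts, and ids."""
    X, y, alts, ids = [], [], [], []
    for sample_id, record in enumerate(records):
        for alt in alternatives:
            X.append(record["attrs"][alt])                # one row per (sample, alternative)
            y.append(1 if record["choice"] == alt else 0)  # one-hot encoded choice
            alts.append(alt)
            ids.append(sample_id)
    return X, y, alts, ids

def fit_mixed_logit(X, y, alts, ids):
    """Fit a model with a normally distributed 'time' coefficient (needs xlogit, numpy)."""
    import numpy as np
    from xlogit.mixed_logit import MixedLogit

    model = MixedLogit()
    model.fit(X=np.asarray(X), y=np.asarray(y), varnames=["price", "time"],
              alts=np.asarray(alts), ids=np.asarray(ids),
              randvars={"time": "n"}, n_draws=200, random_state=0)
    model.summary()
    return model

# Two hypothetical choice situations with attributes [price, time] per alternative
records = [
    {"attrs": {"car": [4.0, 20.0], "bus": [2.0, 35.0]}, "choice": "car"},
    {"attrs": {"car": [5.0, 25.0], "bus": [2.5, 30.0]}, "choice": "bus"},
]
X, y, alts, ids = to_long_format(records, ["car", "bus"])
# model = fit_mixed_logit(X, y, alts, ids)  # a real dataset needs far more samples
```

Every long-format array has n_samples*n_alts entries, so the four rows here correspond to two samples with two alternatives each.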

predict(X, varnames, alts, ids, isvars=None, weights=None, avail=None, panels=None, random_state=None, n_draws=1000, halton=True, verbose=1, batch_size=None, return_proba=False, return_freq=False, halton_opts=None, scale_factor=None, addit=None)¶

Predict chosen alternatives.

Parameters

X (array-like, shape (n_samples*n_alts, n_variables)) – Input data for explanatory variables in long format
varnames (list, shape (n_variables,)) – Names of explanatory variables that must match the number and order of columns in X
alts (array-like, shape (n_samples*n_alts,)) – Alternative values in long format
ids (array-like, shape (n_samples*n_alts,)) – Identifiers for the samples in long format.
isvars (list) – Names of individual-specific variables in varnames
weights (array-like, shape (n_samples,), default=None) – Sample weights in long format.
avail (array-like, shape (n_samples*n_alts,), default=None) – Availability of alternatives for the samples. One when available or zero otherwise.
panels (array-like, shape (n_samples*n_alts,), default=None) – Identifiers in long format to create panels in combination with ids
random_state (int, default=None) – Random seed for numpy random generator
n_draws (int, default=1000) – Number of random draws to approximate the mixing distributions of the random coefficients
halton (bool, default=True) – Whether the prediction uses Halton draws.
halton_opts (dict, default=None) – Options for generation of Halton draws. The dictionary accepts the following options (keys):

    shuffle (bool, default=False) – Whether the Halton draws should be shuffled

    drop (int, default=100) – Number of initial Halton draws to discard to minimize correlations between Halton sequences

    primes (list) – List of primes to be used as base for generation of Halton sequences.
verbose (int, default=1) – Verbosity of messages to show during prediction. 0: No messages, 1: Some messages, 2: All messages
batch_size (int, default=None) – Size of batches used to avoid GPU memory overflow.
scale_factor (array-like, shape (n_samples*n_alts, ), default=None) – Scaling variable used for non-linear WTP-like models. This is usually the negative of the price variable.
addit (array-like, shape (n_samples*n_alts, ), default=None) – Additive term to model coefficients kept fixed during estimation.
return_proba (bool, default=False) – If True, also return the choice probabilities
return_freq (bool, default=False) – If True, also return the frequency of the chosen alternatives
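
The halton_opts keys (shuffle, drop, primes) map onto the usual construction of Halton sequences. The sketch below is a generic pure-Python version of that construction, not the library's implementation. The function names and the seed argument are assumptions made for the example.

```python
import random

def halton_sequence(n, base, drop=0):
    """Return n Halton points in (0, 1) for a prime base, skipping the first `drop`."""
    points = []
    for index in range(drop + 1, drop + n + 1):
        value, fraction, i = 0.0, 1.0 / base, index
        while i > 0:
            # Reflect the base-`base` digits of the index around the radix point
            value += (i % base) * fraction
            i //= base
            fraction /= base
        points.append(value)
    return points

def halton_draws(n_draws, primes, drop=100, shuffle=False, seed=None):
    """Build one Halton sequence per prime, optionally shuffling each one."""
    columns = [halton_sequence(n_draws, p, drop) for p in primes]
    if shuffle:
        rng = random.Random(seed)
        for column in columns:
            rng.shuffle(column)
    return columns

first_points = halton_sequence(3, base=2)         # [0.5, 0.25, 0.75]
draws = halton_draws(5, primes=[2, 3], drop=100)  # one column per random coefficient
```

Using a different prime for each random coefficient keeps the sequences from repeating the same pattern, and dropping the initial points reduces the correlation between sequences built on nearby primes.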

Returns

choices (array-like, shape (n_samples, )) – Chosen alternative for every sample in the dataset.
proba (array-like, shape (n_samples, n_alts), optional) – Choice probabilities for each sample in the dataset. The alternatives are ordered (in the columns) as they appear in self.alternatives. Only provided if return_proba is True.
freq (dict, optional) – Choice frequency for each alternative. Only provided if return_freq is True.
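
The three return values are related: choices and freq can both be derived from the probability matrix. The sketch below shows one plausible derivation, assuming that the predicted choice is the alternative with the highest probability and that freq is the share of samples predicted for each alternative. The source does not spell out either rule, and the function name and numbers are hypothetical.

```python
from collections import Counter

def choices_and_freq(proba, alternatives):
    """Derive predicted choices and their frequencies from choice probabilities.

    proba: list of rows, one per sample, with one probability per alternative,
    ordered as in `alternatives`.
    """
    choices = []
    for row in proba:
        best = max(range(len(row)), key=row.__getitem__)  # index of the largest probability
        choices.append(alternatives[best])
    counts = Counter(choices)
    freq = {alt: counts[alt] / len(choices) for alt in alternatives}
    return choices, freq

# Hypothetical probabilities for four samples and two alternatives
proba = [[0.7, 0.3], [0.4, 0.6], [0.9, 0.1], [0.8, 0.2]]
choices, freq = choices_and_freq(proba, ["car", "bus"])
```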

summary()¶

Show estimation results in console.

static check_if_gpu_available()¶

Check if GPU processing is available by running a quick estimation.

Returns: True if GPU processing is available, False otherwise.
Return type: bool
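
A small wrapper can make this check safe to call in environments where xlogit itself may be missing. The sketch below is one way to do that. The function gpu_status is a hypothetical helper, and treating a missing package as "no GPU" is a design choice of the example rather than library behavior.

```python
def gpu_status():
    """Return True if xlogit reports GPU processing as available, else False."""
    try:
        from xlogit.mixed_logit import MixedLogit
    except ImportError:
        # xlogit is not installed, so GPU estimation cannot be used either
        return False
    return bool(MixedLogit.check_if_gpu_available())

use_gpu = gpu_status()
print("GPU processing available:", use_gpu)
```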

xlogit.mixed_logit.batches_idx(batch_size, n_samples)¶
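
The source gives no description for this function. As a purely hypothetical illustration of batching, the sketch below splits n_samples items into consecutive index ranges of at most batch_size items. The name batch_ranges is invented for the example and is not claimed to match the behavior or return value of batches_idx.

```python
def batch_ranges(batch_size, n_samples):
    """Return (start, stop) index pairs that cover range(n_samples) in batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [(start, min(start + batch_size, n_samples))
            for start in range(0, n_samples, batch_size)]

ranges = batch_ranges(4, 10)   # the last batch is shorter than batch_size
```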