MixedLogit
Implements all the logic for mixed logit models.
- class xlogit.mixed_logit.MixedLogit
Class for estimation of Mixed Logit Models.
- coeff_
Estimated coefficients
- Type
numpy array, shape (n_variables + n_randvars, )
- coeff_names
Names of the estimated coefficients
- Type
numpy array, shape (n_variables + n_randvars, )
- stderr
Standard errors of the estimated coefficients
- Type
numpy array, shape (n_variables + n_randvars, )
- zvalues
Z-values of the estimated coefficients (test statistics for the t-distribution)
- Type
numpy array, shape (n_variables + n_randvars, )
- pvalues
P-values of the estimated coefficients
- Type
numpy array, shape (n_variables + n_randvars, )
- loglikelihood
Log-likelihood at the end of the estimation
- Type
float
- convergence
Whether convergence was reached during estimation
- Type
bool
- total_iter
Total number of iterations executed during estimation
- Type
int
- estim_time_sec
Estimation time in seconds
- Type
float
- sample_size
Number of samples used for estimation
- Type
int
- aic
Akaike information criterion of the estimated model
- Type
float
- bic
Bayesian information criterion of the estimated model
- Type
float
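The relationship between loglikelihood, aic, and bic can be sketched as follows. This assumes the textbook definitions of both criteria; the source does not state the exact formulas the library uses, and the function name and input values are hypothetical.

```python
import math


def information_criteria(loglikelihood, n_params, sample_size):
    """Return (AIC, BIC) under the textbook definitions (an assumption)."""
    aic = 2 * n_params - 2 * loglikelihood
    bic = math.log(sample_size) * n_params - 2 * loglikelihood
    return aic, bic


# Hypothetical values, chosen only for illustration.
aic, bic = information_criteria(loglikelihood=-1300.0, n_params=5, sample_size=1000)
print(aic, round(bic, 2))
```

Lower values indicate a better trade-off between fit and number of parameters; BIC penalizes extra parameters more heavily as the sample size grows.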
- fit(X, y, varnames, alts, ids, randvars, isvars=None, weights=None, avail=None, panels=None, base_alt=None, fit_intercept=False, init_coeff=None, maxiter=2000, random_state=None, n_draws=1000, halton=True, verbose=1, batch_size=None, halton_opts=None, tol_opts=None, robust=False, num_hess=False, scale_factor=None, optim_method='BFGS', mnl_init=True, addit=None, skip_std_errs=False)
Fit Mixed Logit models.
- Parameters
X (array-like, shape (n_samples*n_alts, n_variables)) – Input data for explanatory variables in long format
y (array-like, shape (n_samples*n_alts,)) – Chosen alternatives or one-hot encoded representation of the choices
varnames (list-like, shape (n_variables,)) – Names of explanatory variables that must match the number and order of columns in X
alts (array-like, shape (n_samples*n_alts,)) – Alternative values in long format
ids (array-like, shape (n_samples*n_alts,)) – Identifiers for the samples in long format.
randvars (dict) – Names (keys) and mixing distributions (values) of variables that have random parameters as coefficients. Possible mixing distributions are:
'n': normal, 'ln': lognormal, 'u': uniform, 't': triangular, 'tn': truncated normal
isvars (list-like) – Names of individual-specific variables in varnames
weights (array-like, shape (n_samples,), default=None) – Sample weights in long format.
avail (array-like, shape (n_samples*n_alts,), default=None) – Availability of alternatives for the choice situations. One when available or zero otherwise.
panels (array-like, shape (n_samples*n_alts,), default=None) – Identifiers in long format to create panels in combination with ids
base_alt (int, float or str, default=None) – Base alternative
fit_intercept (bool, default=False) – Whether to include an intercept in the model.
init_coeff (numpy array, shape (n_variables,), default=None) – Initial coefficients for estimation.
maxiter (int, default=2000) – Maximum number of iterations
random_state (int, default=None) – Random seed for numpy random generator
n_draws (int, default=1000) – Number of random draws to approximate the mixing distributions of the random coefficients
halton (bool, default=True) – Whether the estimation uses Halton draws.
halton_opts (dict, default=None) –
Options for generation of Halton draws. The dictionary accepts the following options (keys):
- shuffle (bool, default=False)
Whether the Halton draws should be shuffled
- drop (int, default=100)
Number of initial Halton draws to discard to minimize correlations between Halton sequences
- primes (list)
List of primes to be used as base for generation of Halton sequences.
tol_opts (dict, default=None) –
Options for tolerance of optimization routine. The dictionary accepts the following options (keys):
- ftol (float, default=1e-10)
Tolerance for objective function (log-likelihood)
- gtol (float, default=1e-5)
Tolerance for gradient function.
verbose (int, default=1) – Verbosity of messages to show during estimation. 0: No messages, 1: Some messages, 2: All messages
batch_size (int, default=None) – Size of batches used to avoid GPU memory overflow.
scale_factor (array-like, shape (n_samples*n_alts, ), default=None) – Scaling variable used for non-linear models. For WTP models, this is usually the negative of the price variable.
addit (array-like, shape (n_samples*n_alts, ), default=None) – Additive term to model coefficients kept fixed during estimation.
optim_method (str, default='BFGS') – Optimization method to use for model estimation. It can be BFGS or L-BFGS-B. For non-linear (WTP-like) models, L-BFGS-B is used by default.
robust (bool, default=False) – Whether robust standard errors should be computed
num_hess (bool, default=False) – Whether numerical hessian should be used for estimation of standard errors
skip_std_errs (bool, default=False) – Whether estimation of standard errors should be skipped
mnl_init (bool, default=True) – Whether to initialize coefficients using estimates from a multinomial logit
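To illustrate what the halton_opts keys (shuffle, drop, primes) control, here is a minimal pure-Python sketch of Halton sequence generation. It is not the library's implementation: the function names, the seed argument, and the inverse-CDF step for turning uniform draws into normal draws are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def radical_inverse(index, base):
    """Reflect the base-`base` digits of `index` about the radix point."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * fraction
        fraction /= base
    return value


def halton_draws(n_draws, prime, drop=100, shuffle=False, seed=None):
    """Return n_draws Halton points in (0, 1) for one prime base."""
    # Skip index 0 (which maps to 0.0) and the first `drop` points.
    start = drop + 1
    draws = [radical_inverse(i, prime) for i in range(start, start + n_draws)]
    if shuffle:
        # Hypothetical seeding; reorders the draws without changing their values.
        random.Random(seed).shuffle(draws)
    return draws


# One sequence per random coefficient, each with its own prime base.
uniform_time = halton_draws(5, prime=2)
uniform_cost = halton_draws(5, prime=3)

# Assumed inverse-CDF transform to standard normal draws ('n' distribution).
normal_time = [NormalDist().inv_cdf(u) for u in uniform_time]

print(halton_draws(3, prime=2, drop=0))  # [0.5, 0.25, 0.75]
```

Using a different prime for each random coefficient and discarding the first points are the usual ways to reduce correlation between sequences, which is the purpose the drop option describes.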
- Returns
- Return type
None.
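A minimal sketch of how the fit arguments line up with data in long format, where each sample contributes one row per alternative. The toy data, the variable names "cost" and "time", and both helper functions are hypothetical. The fit call uses only parameters listed above and is wrapped in a function that is not executed here, since it requires xlogit and numpy and the toy data is too small for a meaningful estimation.

```python
import random


def make_long_format(n_samples=4, alts=("car", "bus", "train"), seed=0):
    """Build toy choice data in long format: one row per (sample, alternative)."""
    rng = random.Random(seed)
    X, y, alt_col, ids = [], [], [], []
    for sample_id in range(n_samples):
        chosen = rng.choice(alts)
        for alt in alts:
            X.append([rng.uniform(1, 10), rng.uniform(5, 60)])  # cost, time
            y.append(1 if alt == chosen else 0)  # one-hot encoded choice
            alt_col.append(alt)
            ids.append(sample_id)
    return X, y, alt_col, ids


def fit_toy_model(X, y, alt_col, ids):
    """Estimate a mixed logit with a normally distributed 'time' coefficient.

    Not called in this sketch: it needs xlogit and numpy installed.
    """
    import numpy as np
    from xlogit.mixed_logit import MixedLogit

    model = MixedLogit()
    model.fit(
        X=np.asarray(X),
        y=np.asarray(y),
        varnames=["cost", "time"],  # must match the column order of X
        alts=np.asarray(alt_col),
        ids=np.asarray(ids),
        randvars={"time": "n"},  # 'n': normal mixing distribution
        n_draws=200,
        halton_opts={"shuffle": False, "drop": 100},
        random_state=0,
    )
    model.summary()
    return model


X, y, alt_col, ids = make_long_format()
print(len(X), sum(y))  # 12 rows in total, one chosen alternative per sample
```

Every array passed to fit has n_samples*n_alts rows, and varnames has one entry per column of X.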
- predict(X, varnames, alts, ids, isvars=None, weights=None, avail=None, panels=None, random_state=None, n_draws=1000, halton=True, verbose=1, batch_size=None, return_proba=False, return_freq=False, halton_opts=None, scale_factor=None, addit=None)
Predict chosen alternatives.
- Parameters
X (array-like, shape (n_samples*n_alts, n_variables)) – Input data for explanatory variables in long format
varnames (list, shape (n_variables,)) – Names of explanatory variables that must match the number and order of columns in X
alts (array-like, shape (n_samples*n_alts,)) – Alternative values in long format
ids (array-like, shape (n_samples*n_alts,)) – Identifiers for the samples in long format.
isvars (list) – Names of individual-specific variables in varnames
weights (array-like, shape (n_samples,), default=None) – Sample weights in long format.
avail (array-like, shape (n_samples*n_alts,), default=None) – Availability of alternatives for the samples. One when available or zero otherwise.
panels (array-like, shape (n_samples*n_alts,), default=None) – Identifiers in long format to create panels in combination with ids
random_state (int, default=None) – Random seed for numpy random generator
n_draws (int, default=1000) – Number of random draws to approximate the mixing distributions of the random coefficients
halton (bool, default=True) – Whether the prediction uses Halton draws.
halton_opts (dict, default=None) –
Options for generation of Halton draws. The dictionary accepts the following options (keys):
- shuffle (bool, default=False)
Whether the Halton draws should be shuffled
- drop (int, default=100)
Number of initial Halton draws to discard to minimize correlations between Halton sequences
- primes (list)
List of primes to be used as base for generation of Halton sequences.
verbose (int, default=1) – Verbosity of messages to show during estimation. 0: No messages, 1: Some messages, 2: All messages
batch_size (int, default=None) – Size of batches used to avoid GPU memory overflow.
scale_factor (array-like, shape (n_samples*n_alts, ), default=None) – Scaling variable used for non-linear WTP-like models. This is usually the negative of the price variable.
addit (array-like, shape (n_samples*n_alts, ), default=None) – Additive term to model coefficients kept fixed during estimation.
return_proba (bool, default=False) – If True, also return the choice probabilities
return_freq (bool, default=False) – If True, also return the frequency of the chosen alternatives
- Returns
choices (array-like, shape (n_samples, )) – Chosen alternative for every sample in the dataset.
proba (array-like, shape (n_samples, n_alts), optional) – Choice probabilities for each sample in the dataset. The alternatives are ordered (in the columns) as they appear in self.alternatives. Only provided if return_proba is True.
freq (dict, optional) – Choice frequency for each alternative. Only provided if return_freq is True.
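The three return values can be related to each other with a small sketch. It assumes that the predicted choice for a sample is the alternative with the highest probability and that the frequency is the share of samples predicted for each alternative; the source does not confirm either rule, and the function name and probabilities are hypothetical.

```python
def choices_and_freq(proba, alternatives):
    """Derive choices and choice shares from an (n_samples, n_alts) table.

    Assumes columns of `proba` follow the order of `alternatives`.
    """
    choices = []
    for row in proba:
        best_col = max(range(len(row)), key=lambda j: row[j])
        choices.append(alternatives[best_col])
    # Share of samples predicted to choose each alternative.
    freq = {alt: choices.count(alt) / len(choices) for alt in alternatives}
    return choices, freq


alternatives = ["car", "bus", "train"]
proba = [
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.5, 0.4, 0.1],
]
choices, freq = choices_and_freq(proba, alternatives)
print(choices)  # ['bus', 'car', 'train', 'car']
print(freq)     # {'car': 0.5, 'bus': 0.25, 'train': 0.25}
```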
- summary()
Show estimation results in console.
- static check_if_gpu_available()
Check if GPU processing is available by running a quick estimation.
- Returns
True if GPU processing is available, False otherwise.
- Return type
bool
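One possible use of this check is to decide whether to set batch_size before estimation. The sketch below is an assumption about usage rather than a recommended pattern: the helper name and the batch size of 500 are hypothetical, and a missing xlogit installation is treated as "no GPU".

```python
def gpu_available():
    """Return True only if xlogit is installed and reports a usable GPU."""
    try:
        from xlogit.mixed_logit import MixedLogit
    except ImportError:
        return False  # xlogit not installed: treat as no GPU
    return bool(MixedLogit.check_if_gpu_available())


use_gpu = gpu_available()
# Hypothetical batch size; only relevant when running on GPU.
batch_size = 500 if use_gpu else None
print(use_gpu, batch_size)
```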
- xlogit.mixed_logit.batches_idx(batch_size, n_samples)