The current model development takes a different approach. It is a mixed logit model with random prices for drugs and procedures. The dependent variables are physicians’ qualitative choices among various combinations of choice sets. The independent variables combine product and patient characteristics, in order to analyze the market adjustment of supply and demand. In the first choice experiments for the baseline mixed logit, drug prices are product attributes generated with random-number sequences, and patients’ characteristics are case-specific variables. This type of model comes mainly from business research.
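The structure described above can be illustrated with a minimal Python sketch of simulated mixed logit choice probabilities. It is not the code used in this paper: the function names, the normal distribution assumed for the random price coefficient, and all numerical values are hypothetical and serve only to show how a random price coefficient and case-specific effects combine into choice probabilities.

```python
import math
import random


def logit_probabilities(utilities):
    """Standard logit probabilities for one vector of utilities."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulated_choice_probabilities(prices, case_effects, mean_coef, sd_coef,
                                   draws=500, seed=1):
    """Average logit probabilities over draws of a random price coefficient.

    Assumption (illustrative only): utility_j = beta * price_j + case_effect_j,
    with beta drawn from a normal distribution N(mean_coef, sd_coef).
    """
    rng = random.Random(seed)
    averaged = [0.0] * len(prices)
    for _ in range(draws):
        beta = rng.gauss(mean_coef, sd_coef)
        utilities = [beta * p + c for p, c in zip(prices, case_effects)]
        for j, prob in enumerate(logit_probabilities(utilities)):
            averaged[j] += prob / draws
    return averaged


# Hypothetical example with three alternatives.
probs = simulated_choice_probabilities(
    prices=[1.0, 3.2, 4.1],
    case_effects=[0.0, 1.5, 2.0],
    mean_coef=-0.5,
    sd_coef=0.2,
)
print([round(p, 3) for p in probs])
```

Averaging over draws is what distinguishes the mixed logit from a standard logit: each draw gives an ordinary logit probability, and the simulated probability is their mean.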
The baseline model was implemented in computer code written by Dr J Lustig; the runs simulate consumers’ choice sets over a limited number of product characteristics. Originally, the code was written for a scenario with 1000 consumers, three product characteristics and combinations of ten choice sets. Implementing this test requires comparing a choice set of three alternatives with a reduced choice set in which one of the three alternatives is removed. This paper describes the modifications made to the baseline model for its application to the selected medical market case: type II diabetes without complications. This first experiment includes only one choice set, a drug choice set, whereas the original code allows up to ten choice sets. The case study on physicians’ choices first needs to be described, to ensure that the program modifications made to implement the new test cover the various issues added to the baseline model:
The mixed logit model run with Stata is not a model of consumers’ choices: the decision maker is the physician, who decides for or with the patient among different treatments, according to patients’ clinical and socioeconomic characteristics (so both product and patient attributes are involved).
Physicians’ choices are made among treatment alternatives for type II diabetes; this also requires decisions on the classification of drugs, ingredients and procedures for each alternative (usually dependent on clinical guidelines).
This case scenario covers drug choices only, and the choice set includes three alternatives, including the “no drug” alternative; the grouping is presented in the next section with descriptive statistics of the sample.
Drug prices and the interaction variable, drug price for Medicare (the main federal public plan, versus commercial plans), are random variables.
Three patient characteristics (age, obesity and sex) are case-specific variables for this simulation.
Therefore, the model for this medical market case includes two random variables on products and three variables on patients. The baseline model from Dr Lustig includes only one random variable, and its case-specific variables are linked to the choices of the decision maker, the consumer. The random variables for this medical application are in log form, unlike those of the baseline model coded by Dr Lustig, which uses only normal distributions. The generation of a random variable for drug prices is also a critical step in the model development. The first working paper discusses the sources of drug price data and the selection of the main common forms. This selection will have to be improved in order to use more complex price indexes in the model. In this paper, a random number generator for prices is tested:
The first formula for the generation of random prices, provided by Prof J Hausman, was the following:
2*(0.05*price)*uniform + price - (0.05*price)    (a)
The usual formulas for generating random numbers rely on a uniform generator, with variates originally distributed over the interval (0,1). Formula (a) yields prices between 0.95*price and 1.05*price. In the formula for random drug prices, the uniform draw is rescaled to the interval (0,2), a 5% artificial random variation is selected as a first experiment, and the formula is then modified into a log form of the random variable (1). The formula used to generate prices for this mixed logit model is therefore the following:
log(2*runiform() + price + 1)    (b)
The random number formulas were implemented in Stata 15C. Formula (a), without the log form and using only the “runiform” function to generate random numbers over the selected interval, was first tested with the “lnormal” distribution option of the Stata “asmixlogit” command. However, formula (b), with the log form and the “runiform” function inside it, led to more statistically significant parameters than formula (a) used with the “lnormal” option. Formula (b) also avoids negative numbers in the random variates and handles a price equal to zero in the dataset, which helps with the “no drug” alternative (1). Additional series of random number functions may be needed in future steps of the model development, especially to investigate correlation issues between alternatives.
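As an illustration, formulas (a) and (b) can be transcribed into a short Python sketch. The paper's runs were done in Stata; here Python's `random.random()` stands in for Stata's `runiform()`, and the function names, the seed and the example prices are hypothetical.

```python
import math
import random


def price_formula_a(price, rng, variation=0.05):
    """Formula (a): uniform draw within +/- 5% of the listed price."""
    return 2 * (variation * price) * rng.random() + price - (variation * price)


def price_formula_b(price, rng):
    """Formula (b): log(2*u + price + 1), with u uniform on [0, 1)."""
    return math.log(2 * rng.random() + price + 1)


rng = random.Random(12345)  # arbitrary seed, for reproducibility only
for price in (0.0, 10.0, 250.0):  # 0.0 mimics a "no drug" alternative
    a = price_formula_a(price, rng)
    b = price_formula_b(price, rng)
    print(f"price={price:8.2f}  (a)={a:10.4f}  (b)={b:8.4f}")
```

Because the argument of the logarithm in (b) is never below 1, the generated value is never negative, and a listed price of zero still produces a usable draw between 0 and log(3). Under formula (a), a zero price always stays at exactly zero.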
Several random variables have been tested for prices, since this economic model aims to adjust demand models with product prices, usually using data from the supply side. At this stage of the research, the two random prices represent general price parameters for a selection of drugs for type II diabetic patients without complications; however, the second random price variable also aims to capture the differences in prices paid across categories of insurance plans (these differences result to a large extent from differences in discount practices, in addition to variations in supply chains and dispensing modes).
Economists usually need a log form in demand models of care in order to accommodate very skewed (long-tailed) distributions, partly due to the age distribution of the various patient groups. Distributions of epidemiological data, used for instance in models of disease progression, are also not normal (the Weibull distribution, for instance, is commonly used).
The statistical power of this sample is sufficient to allow a comparison between public and private insurance, mainly between the federal public plan, Medicare, and the private plan categories (the classification into insurance categories from the CDC was part of a detailed analysis in previous runs to estimate the effects of various cost-sharing categories).
Therefore, at this stage of development, the model includes two random price variables:
one random drug price for each alternative (source: Redbook)
one random interaction variable: drug price x Medicare (the control group in this case is the private insurance category; future models may add other categories such as Medicare Advantage, Medicaid, dual eligibles, etc.).
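The two price variables listed above could be assembled, one row per alternative, as in the following Python sketch. It is illustrative only: the dictionary keys, the function name and the example prices are hypothetical, and applying the log form of formula (b) to the price before multiplying by the Medicare indicator is an assumption rather than the paper's documented data step.

```python
import math
import random


def build_price_rows(case_id, medicare, list_prices, rng):
    """Return one row per alternative for a single patient case.

    medicare: 1 if the patient is covered by Medicare, 0 for the private
    insurance category (the control group).
    list_prices: hypothetical listed price for each alternative.
    """
    rows = []
    for alternative, price in enumerate(list_prices, start=1):
        # Random price in log form, following formula (b).
        log_price = math.log(2 * rng.random() + price + 1)
        rows.append({
            "case_id": case_id,
            "alternative": alternative,
            "log_price": log_price,
            # The interaction is zero for the private (control) category.
            "log_price_x_medicare": log_price * medicare,
        })
    return rows


rng = random.Random(2024)  # arbitrary seed
# Three alternatives; a price of 0.0 mimics the "no drug" alternative.
for row in build_price_rows(case_id=1, medicare=1,
                            list_prices=[0.0, 12.0, 40.0], rng=rng):
    print(row)
```

With this layout, the coefficient on the interaction column measures how the price effect for Medicare patients departs from that of the private insurance category.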
Mixed logit models (e.g. run with Stata software) are widely used in the USA, especially since the health law passed under the Obama administration created information exchanges on insurance plans. They are mainly used for comparative analysis of health insurance plans, with case-specific variables for the main types of health plans and random variables for plan characteristics such as deductibles and copayment types (e.g. tier copays). Such analysis helps to represent the net price paid by patients, or proxies for payment arrangements; it is often used by economists to analyze the demand side of a medical market. In this paper, however, the random price variable in the economic model serves price adjustment on specific medical markets; the categorical variables for insurance in the CDC survey are used to control for major differences in drug prices across the main types of insurance plans, especially public versus private. This can help to examine whether there is a major difference in the coverage of patients under commercial plans and under Medicare, for age groups before 65 (especially between 55 and 65) and after 65. As this kind of modeling appears reliable in the statistical runs presented in this paper, it may be developed in further research with additional types of insurance profiles and with more complex price indices for each plan category. The main relevant price data providers have also been approached: a negotiation has started with IQVIA for special legacy of Pharmerit datasets (however, for the sampled case on diabetes, this dataset under-represents Medicare enrollees). Medicaid databases per state and IBM/Redbook were also consulted as possible sources. Additional sources used in the main international price studies have also been reviewed and may be used for comparative analysis [17–20].
Therefore, this medical application, which implements the generalization of the independence of irrelevant alternatives test and its recent specification, is proposed for a mixed logit model that includes random variables for drug prices, control variables for price differences across the main types of insurance plans, and patient variables representing demographics and risk factors. At this stage, the model includes only three patient variables: age, sex and obesity. Age is a continuous variable for adults over age 39 (as in the predictive disease models run previously); the cutoff of 39 comes from the diabetes clinical guidelines at the time of the study. Obesity is a categorical variable in the CDC survey instrument. For the random variables, two types of sequences have been used for the simulations, Halton and Hammersley sequences; Hammersley sequence points may also be an alternative to random numbers, and they seem to significantly improve efficiency for some simulations (e.g. Monte Carlo simulation, as cited in the Stata documentation). However, only the Halton sequence could be included at this point: a Stata user command was needed to pass the parameters of the mixed logit models to the new specification tests, which are written in Matlab code, in order to apply this code to the diabetic dataset. Stata runs with this user command were only available with a Halton sequence by default.
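For readers unfamiliar with the two sequences, the following Python sketch shows one common textbook construction of Halton and Hammersley points. It is not the Stata or Matlab implementation used for the runs; the function names and the choice of prime bases are assumptions made for illustration.

```python
PRIMES = (2, 3, 5, 7, 11, 13)  # one prime base per dimension


def radical_inverse(index, base):
    """Van der Corput radical inverse of a non-negative integer."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, dim):
    """First n_points Halton points in `dim` dimensions (index starts at 1)."""
    return [[radical_inverse(i, PRIMES[d]) for d in range(dim)]
            for i in range(1, n_points + 1)]


def hammersley(n_points, dim):
    """Hammersley set: first coordinate i/n, the others radical inverses."""
    return [[i / n_points] + [radical_inverse(i, PRIMES[d])
                              for d in range(dim - 1)]
            for i in range(n_points)]


print(halton(4, 2))
print(hammersley(4, 2))
```

The Hammersley set needs the number of points to be fixed in advance, since its first coordinate is i/n, whereas a Halton sequence can be extended point by point. In this construction the first Hammersley point is the origin, so it would need to be shifted or skipped before being transformed into draws from an unbounded distribution.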