2.1. Study Design
This study was performed in Tehran province, which includes the capital of Iran, to compare the mean individual outpatient and inpatient utilization rates between physicians (informed consumers) and non-physicians (uninformed consumers) after controlling for conditions with a probable impact on healthcare need and utilization.
2.2. Study Population
The study population consisted of randomly chosen physicians and non-physicians in Tehran province. Subjects who did not answer the phone call or were unwilling to cooperate were excluded. Among non-physicians, subjects who had at least one physician in their close family (parents, children, siblings, or spouse) or who were medical interns were also excluded, because their health information, and as a result their healthcare utilization, is probably different from that of other non-physicians.
2.3. Data Collection
The variables were as follows:
1) Demographic: age, gender, occupation, education, marital status, living region, and income (low, lower middle, upper middle, and high based on the current currency of the country);
2) Insurance coverage: basic insurance in Iran consists of social insurance schemes that cover more than 90% of the Iranian population;
3) Access to healthcare: the minimum time required to reach the nearest healthcare center, and the number of people living permanently in the household;
4) Health status: self-perceived health status (great, good, not bad, or bad) and presence of a chronic condition (any of the following conditions, diagnosed by a physician and lasting at least 3 months: metabolic disease, cardiovascular disease, any kind of disability, hematologic disease, psychiatric disease, chronic infection, rheumatologic disorder, or other specified chronic disease).
We expected age, sex, income, marital status, insurance coverage, and impaired health to have a positive effect on the utilization rate. In contrast, we expected a negative effect from minutes to the nearest healthcare center and from the number of people living in the household (which reduces the monetary and non-monetary resources allocated to each family member), both of which reflect impaired access to healthcare [5, 21, 22].
Outpatient and inpatient utilization rates were the outcomes of interest. Outpatient utilization per patient consisted of the number of imaging and laboratory services, physiotherapy sessions, and medicine purchases (with or without prescription) in the last month, multiplied by 12, plus the number of outpatient surgeries, Pap smears or mammographies, and endoscopies and colonoscopies in the last year. Because physicians are able to purchase medicines without a prescription in Iran, we included medicine purchases without prescription as well. General physician visits, specialist visits, outpatient emergency room visits, and injections were not counted in the final outpatient utilization rate, because many physicians self-prescribe, and including these services might have resulted in an overestimated difference. Inpatient utilization consisted of the number of admissions per patient in the last year. An admission was defined as an official stay in a hospital for at least 6 hours, based on the Iran Health Insurance Organization definition. For admissions, patients were asked to report how many times they had been admitted in the last year to any of the following wards: surgery, internal medicine, obstetrics and gynecology, emergency room, Intensive Care Unit or Coronary Care Unit, or any other ward.
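As an illustration of the annualization rule described in this section, the outpatient count for one respondent could be computed as in the following Python sketch. The class and field names are hypothetical and are not taken from the study questionnaire, and the example respondent is invented.

```python
from dataclasses import dataclass


@dataclass
class OutpatientReport:
    # Services reported for the last month (hypothetical field names).
    imaging_lab_last_month: int
    physiotherapy_last_month: int
    medicine_purchases_last_month: int  # with or without prescription
    # Services reported for the last year.
    outpatient_surgeries_last_year: int
    pap_or_mammography_last_year: int
    endoscopy_colonoscopy_last_year: int


def annual_outpatient_utilization(r: OutpatientReport) -> int:
    """Monthly counts are multiplied by 12; yearly counts are added as reported."""
    monthly = (
        r.imaging_lab_last_month
        + r.physiotherapy_last_month
        + r.medicine_purchases_last_month
    )
    yearly = (
        r.outpatient_surgeries_last_year
        + r.pap_or_mammography_last_year
        + r.endoscopy_colonoscopy_last_year
    )
    return 12 * monthly + yearly


# Invented respondent: 1 lab test and 2 medicine purchases last month,
# 1 mammography in the last year.
example = OutpatientReport(1, 0, 2, 0, 1, 0)
print(annual_outpatient_utilization(example))  # 12 * 3 + 1 = 37
```

Physician visits and injections are deliberately absent from the sketch, mirroring their exclusion from the final outpatient utilization rate.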
2.4. Data Collection Tool
The questionnaire was developed for this study (supplementary file 1). The telephone-survey questionnaire consisted of 21 questions and was designed based on a list of services that are prone to supplier inducement, and are therefore expected to differ between physicians and non-physicians, according to previous literature [25, 26] and expert opinion. Individual characteristics that could affect healthcare utilization were also included, as explained in the previous section. Face and content validity were assessed in face-to-face interviews with 5 experts, 5 physicians, and 5 non-physicians with a university degree. The pilot phase was performed in 21 individuals (11 physicians and 10 non-physicians), and the Cronbach’s alpha based on a test-retest 10 days apart was 0.845. Additionally, the Pearson correlation coefficient was calculated for each variable; all were perfectly correlated except self-perceived health (0.79, p-value < 0.05).
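For readers unfamiliar with the two reliability measures named here, the following Python sketch shows the standard formulas for the Pearson correlation coefficient and Cronbach's alpha. It is not the authors' code: the function names and the scores are hypothetical, and treating the two interview rounds as the two "items" for alpha is an assumption about how the test-retest alpha might have been obtained.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one score list per item, aligned by respondent."""
    k = len(items)
    item_var_sum = sum(statistics.variance(scores) for scores in items)
    totals = [sum(values) for values in zip(*items)]
    return k / (k - 1) * (1 - item_var_sum / statistics.variance(totals))


# Hypothetical answers of six pilot respondents to one question,
# recorded in the first interview and again 10 days later.
first_round = [1, 2, 2, 3, 4, 4]
second_round = [1, 2, 3, 3, 4, 4]

print(round(pearson_r(first_round, second_round), 3))
print(round(cronbach_alpha([first_round, second_round]), 3))
```

A coefficient of 1 for a variable means every pilot respondent gave answers in the two rounds that are perfectly linearly related, which is what "perfectly correlated" denotes in the text.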
2.5. Sampling Method
To our knowledge, no studies have specifically reported the incidence rate ratio in physicians vs non-physicians; therefore, we were unable to calculate the sample size with the relevant formula. Instead, we based our sample size on that of the study most methodologically similar to ours. Subjects were chosen by simple random selection with replacement from the relevant phone number list. Phone number lists were extracted from the Tehran province phone number list categorized by occupation. Physicians comprised general practitioners and specialists. Non-physicians comprised subjects with any kind of job requiring a university degree, including psychologists, lawyers, statisticians, accountants, etc. Further control questions on individual education and occupation were included in the questionnaire but were not entered in the final model for analysis. Physicians and non-physicians were matched on their living area based on the first 2 digits of their phone number, because socioeconomic status differs across living areas in Tehran. In detail, for every physician interviewed, non-physicians whose phone numbers shared the same first two digits were contacted from the non-physicians’ phone number list until one met the inclusion criteria and was willing to cooperate.
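The area-matching procedure can be sketched as follows. This is a simplified illustration, not the study's actual procedure: the function name, the phone numbers, and the eligibility callback are hypothetical, and candidates are contacted in a shuffled order (that is, without repeated calls to the same number), which is a simplification of the random selection described above.

```python
import random


def find_matched_non_physician(physician_phone, non_physician_phones,
                               meets_criteria_and_agrees, rng):
    """Return the first eligible, willing non-physician from the same phone area.

    Area is approximated by the first two digits of the phone number.
    Returns None if no candidate in that area qualifies.
    """
    prefix = physician_phone[:2]
    pool = [p for p in non_physician_phones if p[:2] == prefix]
    rng.shuffle(pool)  # contact candidates from the same area in random order
    for phone in pool:
        if meets_criteria_and_agrees(phone):
            return phone
    return None


# Hypothetical phone list; the set stands in for the outcome of the phone call
# (inclusion criteria met and subject willing to cooperate).
non_physicians = ["22110001", "22110002", "44550003", "22990004"]
eligible = {"22110002", "22990004"}

match = find_matched_non_physician(
    "22345678", non_physicians, lambda p: p in eligible, random.Random(0)
)
print(match)
```

Passing an explicit `random.Random` instance keeps the contact order reproducible, which is convenient when the selection has to be documented.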
2.6. Statistical Analysis
All baseline characteristics of the populations are reported as subgroup frequencies for categorical variables and as mean and standard deviation for numeric variables. The mean inpatient and outpatient utilization rates are also reported for different subgroups of participants.
For further analysis, as the outcome was a count measure, we chose regression models designed for count data. For the outpatient utilization rate, we used a negative binomial model to account for over-dispersion (the mean and variance of the distribution were not equal, with the variance exceeding the mean). For the inpatient utilization rate, we encountered excess zeros (more than 50% of subjects had not been admitted in the last year) in addition to over-dispersion; given the nature of the data, we used a hurdle negative binomial model.
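The two data features that motivated these model choices, over-dispersion and excess zeros, can be checked with simple descriptive statistics before any model is fitted. The Python sketch below is illustrative only: the function name and the admission counts are hypothetical, and a variance-to-mean ratio above 1 is used as an informal indicator rather than a formal test.

```python
import statistics


def count_diagnostics(counts):
    """Descriptive checks for a count outcome: dispersion and share of zeros."""
    mean = statistics.fmean(counts)
    if mean == 0:
        raise ValueError("all counts are zero; dispersion ratio is undefined")
    variance = statistics.variance(counts)  # sample variance
    return {
        "mean": mean,
        "variance": variance,
        # A Poisson outcome has variance equal to the mean (ratio close to 1).
        "dispersion_ratio": variance / mean,
        "zero_share": sum(1 for c in counts if c == 0) / len(counts),
    }


# Hypothetical yearly admission counts for ten subjects.
admissions = [0, 0, 0, 0, 1, 0, 2, 0, 0, 5]
diagnostics = count_diagnostics(admissions)
print(diagnostics)
```

In this invented sample the variance is more than three times the mean and 70% of the counts are zero, the pattern for which a hurdle negative binomial model is a common choice.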
To assess whether the hurdle regression model differed from the regular negative binomial model, we applied the Vuong test to the model outputs (p-value < 0.05). All variables were entered in the final model except education, occupation, and living area, which were control variables for the inclusion criteria. For sensitivity analysis, all analyses were repeated excluding outpatient surgery, for three reasons: its nature differs from that of other outpatient services; there is a high probability that most outpatient surgeries are cosmetic procedures; and the outpatient surgeries used by physicians might have been inpatient-type surgeries delivered as an outpatient service because physicians avoided being admitted. In addition, the analyses were run excluding office visits and injections, which are likely to be performed by physicians themselves. Moreover, obstetrics and gynecology admissions were omitted from inpatient utilization, because they are mostly for delivery and cannot be induced or neglected. Finally, subjects who did not report any utilization, whether outpatient or inpatient, were excluded from the analysis to reduce the impact of probable underutilization in physicians on the final comparison. Using the “MatchIt” and “Zelig” [32, 33] packages in R for nearest neighbor matching, we also matched on residence area before running the regressions (each individual could be matched to more than one individual in the matching population), and the results did not differ significantly. The general patterns remained the same at each step of the sensitivity analysis; therefore, the details are not reported in this article.
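The logic of the Vuong test can be sketched in Python as follows. This is the basic, uncorrected form of the statistic, computed from per-observation log-likelihoods of two fitted models; it is not the authors' R code. The function name and the log-likelihood values are hypothetical, and whether the study applied a correction for the number of parameters is not stated in the text.

```python
import math
import statistics


def vuong_statistic(loglik_a, loglik_b):
    """Uncorrected Vuong z statistic and two-sided p-value.

    Inputs are per-observation log-likelihoods of model A and model B
    evaluated on the same observations. Positive z favours model A.
    """
    if len(loglik_a) != len(loglik_b):
        raise ValueError("both models must be evaluated on the same observations")
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    sd = statistics.pstdev(diffs)
    if sd == 0:
        raise ValueError("log-likelihood differences have zero variance")
    z = math.sqrt(len(diffs)) * statistics.fmean(diffs) / sd
    # Two-sided p-value under the standard normal reference distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical pointwise log-likelihoods from a hurdle model and a plain
# negative binomial model fitted to the same four observations.
hurdle_ll = [-1.0, -1.2, -0.8, -1.1]
negbin_ll = [-1.5, -1.3, -1.4, -1.6]

z, p = vuong_statistic(hurdle_ll, negbin_ll)
print(round(z, 3), round(p, 5))
```

A p-value below 0.05 with a positive z would indicate that the first model (here, the hurdle model) fits these observations better than the second.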