Sample Selection
A convenience sample of three hospitals with emergency departments (EDs) located in densely populated areas of Beirut was chosen. Beirut was selected because air pollutant levels there had been monitored in previous research studies [26]. The selected hospitals, Hotel Dieu de France (HDF), Saint Georges University Medical Center (SGHUMC) and Makassed Hospital, were chosen for their proximity to the pollutant monitoring sites, the presence of an ED and their high patient volumes. The sample was drawn from a larger existing sample collected for the Beirut Air Pollution and Health Effects (BAPHE) study performed by Nakhle et al. [26].
Data Collection
At the time of data collection, Lebanese health institutions relied primarily on paper-based records. ED records for the years 2012, 2013 and 2014 were obtained from the hospitals’ archives. IRB approvals to access patient records were obtained from the respective institutions. From each record, the following information was collected by trained hospital residents: patient’s age, sex, date of presentation, chief complaint, differential diagnosis, final diagnosis, medications, and the name of the treating physician. This process is explained in detail in Nakhle et al.’s study entitled Beirut Air Pollution and Health Effects—BAPHE study protocol and objectives [26]. The data was then entered into Microsoft Excel, inspected by the principal investigator, and validated by two senior physicians/epidemiologists who were part of the research team.
Description of the Database
The information in the hospital records identified as relevant by the study design was sorted into three major groups. Logistic information comprised the details required to identify the patient’s admission record: the patient’s file number, the date of presentation to the ED and the hospital from which the record was obtained. Demographic information comprised the patient’s age and sex. Lastly, health information comprised the initial complaint, the differential diagnosis, the final diagnosis, admission/discharge history and the administered medications.
Table 1 summarizes the variables chosen for this study.
Description of Date of Presentation
The patient’s date of presentation was entered using the format DD/MM/YYYY, given the versatility of this format in time-series analysis. Sorting through the dates in the database revealed benign clerical errors that were easily corrected. For this study, the dates chosen range from 1/1/2012 to 31/12/2014; patients who presented before 1/1/2012 or after 31/12/2014 were not included. For entries with missing dates, “NA” was entered. Because the purpose of the study is to code and categorize entries based on the available health-related information, entries that lacked a date of presentation were not excluded. The remainder of their information was studied and used to develop the coding and categorization algorithm.
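The date handling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study (the data were handled in Microsoft Excel): the function names are hypothetical, and treating an unparseable date as “NA” is an assumption, since the clerical errors in the database were corrected by hand.

```python
from datetime import date, datetime

# Study window as stated in the text.
STUDY_START = date(2012, 1, 1)
STUDY_END = date(2014, 12, 31)


def parse_presentation_date(raw):
    """Parse a DD/MM/YYYY string; return "NA" for missing values.

    Assumption: values that cannot be parsed are also coded "NA".
    """
    if raw is None or raw.strip() in ("", "NA"):
        return "NA"
    try:
        return datetime.strptime(raw.strip(), "%d/%m/%Y").date()
    except ValueError:
        return "NA"


def in_study_window(parsed):
    """Keep undated entries; dated entries must fall within 2012-2014."""
    if parsed == "NA":
        return True  # entries lacking a date were not excluded
    return STUDY_START <= parsed <= STUDY_END
```

An entry dated 31/12/2011 would be dropped by `in_study_window`, whereas an entry with no date would be retained so that its health-related fields can still be coded.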
To show when patients presented to the ED, counts were performed on a daily, monthly, seasonal and yearly basis. For seasons, the following definitions were used:
- Winter: 1/1/201X - 20/3/201X
- Spring: 21/3/201X - 20/6/201X
- Summer: 21/6/201X - 20/9/201X
- Autumn: 21/9/201X - 31/12/201X
(X being the appropriate integer to be inserted based on the relevant years and dates selected for the study)
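As an illustration, the season definitions above can be expressed as a small Python function. The function name is hypothetical and the sketch assumes the date has already been parsed into a `datetime.date`.

```python
from datetime import date


def season_of(d):
    """Return the season of a date using the cut-offs listed above."""
    key = (d.month, d.day)
    if key <= (3, 20):
        return "Winter"   # 1/1 to 20/3
    if key <= (6, 20):
        return "Spring"   # 21/3 to 20/6
    if key <= (9, 20):
        return "Summer"   # 21/6 to 20/9
    return "Autumn"       # 21/9 to 31/12
```

Because each season is defined within a single calendar year, all of December is counted as autumn of that year and winter covers only 1 January to 20 March.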
Categorization by Age
As described in Table 2, age groups were defined according to the WHO recommended age groups for studying health, health services and nutrition [27]. When age was not available in the database, the code “NA” was entered.
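A minimal sketch of this kind of age binning is given below. The cut points and labels are placeholders chosen purely for illustration; they are not the Table 2 boundaries and should be replaced with them. The function name is likewise hypothetical.

```python
from bisect import bisect_right

# HYPOTHETICAL cut points (years) and labels, for illustration only.
# Replace with the age-group boundaries given in Table 2.
AGE_BREAKS = [1, 5, 15, 25, 45, 65]
AGE_LABELS = ["<1", "1-4", "5-14", "15-24", "25-44", "45-64", "65+"]


def age_group(age):
    """Return the age-group label, or "NA" when age is missing."""
    if age is None or age == "NA":
        return "NA"
    # bisect_right places an age equal to a cut point in the upper group.
    return AGE_LABELS[bisect_right(AGE_BREAKS, float(age))]
```

Keeping the boundaries in a single list means that a change to the grouping scheme requires editing only the two constants, not the function.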
Categorization by Gender
Two genders were described and given appropriate codes for this study. Males were given the code “M” whereas females were given the code “F”. Missing values were coded as “NA”.
Categorization by Disease
Generally speaking, health-related information is recorded in each medical record sequentially, starting with the chief complaint and ending with a diagnosis and management plan. Mirroring the way a health record organizes its data, the extracted health-related information is divided into several categories. First, the chief complaint of the patient is recorded as documented in the record. Second, the differential diagnosis based on the given presentation is recorded. When available, the final diagnosis is recorded, as well as the medications used to manage the patient at the time of presentation and the name of the treating physician.
Several pathologies have been found to be associated with increases in air pollutants, chiefly cardiovascular, cerebrovascular and respiratory pathologies as well as skin and cutaneous allergies. Therefore, based on the existing literature and these associations, several disease categories are described using the International Classification of Diseases, 10th revision (ICD-10) [28]. Furthermore, to better communicate the data, each category is represented by its ICD-10 code range. Respiratory diseases, for example, are represented by the code range J00–J99.
Rises in air pollutant levels have been associated not only with general disease categories but also with specific pathologies such as asthma, bronchitis, emphysema and urticaria. To better represent these pathologies, their specific codes were used instead of the broader disease category and code range. For example, bronchitis, asthma and emphysema are obstructive respiratory diseases, a subset of the more general respiratory disease category. To represent obstructive pathologies, the code range J40–J47 is used rather than J00–J99, the general code range for all respiratory pathologies.
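The preference for a narrower code range over a broader one can be sketched as follows. The sketch assumes that an ICD-10 code has already been assigned to the entry, and it lists only the two ranges named in the text; the function name and the "OTHER" and "NA" return labels are hypothetical and are not the study's error codes.

```python
import re

# (label, chapter letter, first, last); narrower ranges come first so
# that they take precedence over the broader range containing them.
CATEGORIES = [
    ("J40-J47", "J", 40, 47),  # obstructive respiratory diseases
    ("J00-J99", "J", 0, 99),   # all respiratory diseases
]


def categorize(icd10_code):
    """Map an ICD-10 code such as "J45.0" to the most specific range."""
    text = icd10_code.strip().upper()
    match = re.fullmatch(r"([A-Z])(\d{2})(?:\.\w+)?", text)
    if not match:
        return "NA"  # hypothetical label for values that are not codes
    letter, number = match.group(1), int(match.group(2))
    for label, cat_letter, low, high in CATEGORIES:
        if letter == cat_letter and low <= number <= high:
            return label
    return "OTHER"  # hypothetical label for codes outside listed ranges
```

With this ordering, J45.0 (asthma) is assigned to J40-J47, while J18 (pneumonia) falls through to the general J00-J99 range.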
Entries were selected for inclusion based on their chief complaints. For the purposes of the study, a chief complaint was taken to mean the statement a patient gives to describe their symptoms, condition or reason for seeking medical attention. A chief complaint is nonspecific and may or may not reflect the final diagnosis. Furthermore, since various pathologies may share symptoms, the same chief complaint may reflect more than one distinct pathologic entity. Chest pain, for example, is a nonspecific complaint that may be present in cardiac, respiratory, gastrointestinal, musculoskeletal or psychiatric pathology. Given the nonspecific nature of chief complaints, several entries could erroneously be deemed eligible. To represent these erroneously collected entries, the algorithm was developed to include error codes and categories.
Tables 3 and 4 summarize the various disease categories and codes used in this study.
Statistical Analysis
Descriptive analysis of the dataset was performed using Microsoft Excel. Counts were established for each variable and plotted as bar graphs. Several graphs were also plotted of one variable against another, for subsequent analysis of possible correlations between variables. Lastly, code and category counts were performed by day, month, season and year to show the data over time for future analysis.
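Although the analysis was performed in Microsoft Excel, the counts over time can be illustrated with a short Python sketch. The record layout, a pair of presentation date and disease code, and the function name are assumptions made for the example.

```python
from collections import Counter
from datetime import date

# How a date is reduced to a period key. A seasonal key can be added
# in the same way from the season definitions used in the study.
PERIOD_KEYS = {
    "day": lambda d: d,
    "month": lambda d: (d.year, d.month),
    "year": lambda d: d.year,
}


def count_by_period(records):
    """Count (period, code) pairs for each period type.

    records: iterable of (date, code) pairs; undated entries are
    assumed to have been set aside beforehand.
    """
    counts = {name: Counter() for name in PERIOD_KEYS}
    for d, code in records:
        for name, key in PERIOD_KEYS.items():
            counts[name][(key(d), code)] += 1
    return counts
```

For example, two J00-J99 entries dated in January 2012 would give `counts["month"][((2012, 1), "J00-J99")] == 2`.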