Consumption situation of Chinese medicine in Shanghai: a cross-sectional study

Background This cross-sectional study aimed to construct a database of Chinese medicine consumption, including the annual intake, the number of days of intake, and the daily intake, to support risk assessment and the understanding of consumption trends.
Methods About 40 million rows of data were obtained from hospitals in Shanghai. Each row contains the prescription number, the consumer's name, the dispensing date, the name of the Traditional Chinese Medicine (TCM), the dosage, and the number of days taken. All data were stored in a MySQL database, and R was the main tool for statistical analysis and plotting.
Results The results report the annual consumption, annual consumption days, and average daily consumption for 20 common TCMs and for all TCMs combined. Astragali radix, Coicis semen, and Danshen had the highest consumption among the selected Chinese medicines. An easy-to-use application, the Chinese medicine consumption database (CMCD), was designed for searching and downloading consumption data. It is built with the Shiny package in R, is free to access on any device with an internet browser, and requires no programming knowledge.
Conclusions A Chinese medicine consumption database was constructed that covers the consumption of 20 common TCMs and of all TCMs combined. The database supports the risk assessment of contaminants in TCMs and the prediction of TCM consumption trends.

Background
Traditional Chinese Medicines (TCMs) have been an integral part of Chinese culture and the primary medical treatment for a large portion of the population for thousands of years [1]. Outside Asia, TCMs are increasingly used as a complement to, or as an alternative to, conventional Western medicine [2]. For example, artemisinin can treat malaria and inhibit the growth of cancer cells [3,4]. Some Chinese medicines may relieve the symptoms of COVID-19 and neurodegenerative disorders [5,6]. However, TCMs are not risk-free: they may contain undeclared or misidentified ingredients, including allergenic substances [7], plant toxins [8], heavy metals such as mercury, lead, copper, and arsenic [9,10], and pharmaceutically active compounds of undetermined concentration [1,11].
Risk assessment of TCM contaminants can be generally defined as a structured scientific process for characterizing the potential hazards, and the associated risks to life and health, that result from human exposure to chemicals present in TCMs over a specified period [12]. Some studies have attempted to calculate the health risks of heavy metals in TCMs. For instance, Wang et al. used hazard quotients (HQ) to evaluate the health risks of heavy metals in 55 types of Chinese medicines [13]. Liu et al. also used HQ to evaluate the potential health risk of arsenic in TCMs [14]. These studies all used the maximum TCM doses given in the Chinese Pharmacopoeia to calculate the average daily intake dose (ADD), which is an important parameter in risk assessment. However, this calculation method lacks accuracy [15]. The accuracy of a risk assessment depends not only on the contaminant levels in TCMs but also on the consumption data. Detailed, high-quality TCM consumption data collected at the individual level are essential for assessing exposure to potential risks [16].
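To make the role of consumption data concrete, the following is a minimal sketch of how an ADD and an HQ are commonly computed. It assumes the general exposure formula widely used in health risk assessment (contaminant concentration times intake, scaled by exposure duration, body weight, and averaging time); the cited studies may parameterize it differently. The function names and all numeric values are hypothetical and are not taken from this study, which was carried out in R rather than Python.

```python
def average_daily_dose(conc_mg_per_kg, intake_g_per_day, exposure_days_per_year,
                       exposure_years, body_weight_kg, averaging_days):
    """Return the average daily dose (ADD) in mg per kg body weight per day.

    Assumed general form: ADD = C * IR * EF * ED / (BW * AT).
    """
    intake_kg_per_day = intake_g_per_day / 1000.0  # g/day -> kg/day
    total_mg = (conc_mg_per_kg * intake_kg_per_day
                * exposure_days_per_year * exposure_years)
    return total_mg / (body_weight_kg * averaging_days)


def hazard_quotient(add, reference_dose):
    """Return HQ = ADD / reference dose; HQ > 1 is usually read as a potential concern."""
    return add / reference_dose


# Hypothetical inputs: 2 mg/kg of a contaminant in a herb, a 60 kg adult,
# and a hypothetical reference dose of 0.0003 mg/kg/day.
# Case 1 assumes a maximum labelled dose taken every day of the year.
# Case 2 assumes an observed intake of 12 g/day on 40 days of the year.
hq_max_dose = hazard_quotient(
    average_daily_dose(2.0, 30.0, 365, 1, 60.0, 365), 0.0003)
hq_observed = hazard_quotient(
    average_daily_dose(2.0, 12.0, 40, 1, 60.0, 365), 0.0003)
print(hq_max_dose, hq_observed)
```

With these made-up inputs the maximum-dose assumption gives a much larger HQ than the observed intake, which illustrates why individual-level consumption data change the outcome of an assessment.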
TCM consumption data are essential for risk assessment or, more precisely, for the accurate assessment of consumers' exposure to harmful substances (e.g. contaminants, pesticides, food additives, migrating compounds) and microbiological contaminants. On the basis of the risk assessment results, the maximum limits of pollutants in TCMs can be set or modified, and vulnerable groups of the population (e.g. children, sensitive groups) can be identified.
TCM consumption databases are also indispensable for understanding consumption trends and characteristics. Knowing which Chinese medicines are preferred allows their production levels to be reduced or increased accordingly [17].

Study design and participants
The Chinese medicine consumption data were obtained from hospitals and pharmacies in Shanghai and cover January 2019 to December 2019. All individuals in the data set were over 18 years old, and their names were anonymized. The publicly disclosed data contain no personal information. Only oral Chinese medicines were included; medicines for external use, acupuncture, and similar treatments were excluded.

Chinese medicine studied
The twenty common TCMs included in the present study are summarized in Table 1. Based on their availability, these food plants were identified as being commonly used by the local population.

Data preprocessing
The collected data were imported into a MySQL database for further processing. The imported raw data include the prescription number, the name of the TCM consumer, the date of dispensing, the name of the TCM, the dosage, and the number of days taken. Data cleaning consisted of three main steps. First, abnormal rows, such as those whose TCM name contained "external use", "pharmacy", or "prescription", were removed. Second, non-numeric characters, such as the unit string "g", were stripped from the dosage field. Third, rows with NULL values were removed. Data cleaning was performed with R and MySQL.
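The three cleaning steps can be sketched as follows. This is an illustration only, written in Python rather than the R and MySQL code actually used, and it assumes each row is a dictionary keyed by the column abbreviations PN, Name, Date, CMN, DD, and days. The function name `clean_rows` is hypothetical, and the excluded terms are the English renderings quoted in the text, whereas the original records presumably use Chinese strings.

```python
import re

# English renderings of the terms named in the text (an assumption for illustration).
EXCLUDED_TERMS = ("external use", "pharmacy", "prescription")


def clean_rows(rows):
    """Return cleaned copies of the rows, with DD converted to a float (grams)."""
    cleaned = []
    for row in rows:
        # Drop rows with missing values. This check runs first here so that
        # the later steps never operate on None.
        if any(v is None or str(v).strip() == "" for v in row.values()):
            continue
        # Drop abnormal rows whose TCM name contains an excluded term.
        name = str(row["CMN"]).lower()
        if any(term in name for term in EXCLUDED_TERMS):
            continue
        # Strip non-numeric characters such as the unit "g" from the dosage.
        dosage_text = re.sub(r"[^0-9.]", "", str(row["DD"]))
        try:
            dosage = float(dosage_text)
        except ValueError:
            continue  # nothing numeric was left in the dosage field
        cleaned.append({**row, "DD": dosage})
    return cleaned


raw = [
    {"PN": "P1", "Name": "A", "Date": "2019-01-02",
     "CMN": "Astragali radix", "DD": "15g", "days": 7},
    {"PN": "P1", "Name": "A", "Date": "2019-01-02",
     "CMN": "Ointment (external use)", "DD": "5g", "days": 7},
    {"PN": "P2", "Name": "B", "Date": "2019-02-01",
     "CMN": "Danshen", "DD": None, "days": 14},
]
cleaned = clean_rows(raw)
print(len(cleaned))
```

Only the first of the three made-up rows survives: the second is an external-use item and the third has a missing dosage.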

Data analysis and visualization
R was used for data analysis and visualization. The analysis has two main parts: the consumption of each selected common Chinese medicine, and the intake of all Chinese medicines combined. The main results are the consumption figures for each Chinese medicine consumer. The main steps for analyzing specific TCMs are as follows (Fig. 1): (1)

Characteristics of raw data
After cleaning, the raw data comprise about 40 million rows, each containing the prescription number (PN), the consumer's name (Name), the consumption date (Date), the name of the TCM (CMN), the dosage (DD), and the number of days taken (days). A sample of the raw data is shown in Table 2; the patient names in the table are anonymized. The 40 million rows are stored in the MySQL database for further analysis.
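Given rows of this shape, the per-consumer quantities reported in this study (annual consumption, consumption days, and average daily consumption) could be derived along the following lines. This is a hedged sketch, not the authors' R code: it assumes that DD is a daily dose in grams, that every TCM in a prescription is taken for that prescription's full number of days, and that prescriptions for the same person do not overlap in time. The function name `consumption_summary` is hypothetical.

```python
from collections import defaultdict


def consumption_summary(rows, cmn=None):
    """Summarize consumption per consumer.

    If cmn is given, only rows for that TCM are counted; otherwise all TCMs.
    Returns {name: {"annual_g": ..., "days": ..., "daily_g": ...}}.
    """
    grams = defaultdict(float)
    days_by_prescription = defaultdict(dict)
    for row in rows:
        if cmn is not None and row["CMN"] != cmn:
            continue
        person = row["Name"]
        # Assumed: total grams for a row = daily dose * number of days taken.
        grams[person] += row["DD"] * row["days"]
        # Count each prescription's days once, even if it lists several TCMs.
        days_by_prescription[person][row["PN"]] = row["days"]

    summary = {}
    for person, total in grams.items():
        n_days = sum(days_by_prescription[person].values())
        summary[person] = {
            "annual_g": total,
            "days": n_days,
            "daily_g": total / n_days if n_days else 0.0,
        }
    return summary


rows = [
    {"PN": "P1", "Name": "A", "Date": "2019-01-02", "CMN": "Astragali radix", "DD": 15.0, "days": 7},
    {"PN": "P1", "Name": "A", "Date": "2019-01-02", "CMN": "Danshen", "DD": 10.0, "days": 7},
    {"PN": "P2", "Name": "A", "Date": "2019-03-05", "CMN": "Astragali radix", "DD": 20.0, "days": 14},
    {"PN": "P3", "Name": "B", "Date": "2019-04-01", "CMN": "Danshen", "DD": 9.0, "days": 5},
]
all_tcms = consumption_summary(rows)
astragali = consumption_summary(rows, cmn="Astragali radix")
print(all_tcms["A"], astragali["A"])
```

Deduplicating days by prescription number matters for the all-TCM totals, because a single prescription usually lists several medicines that are taken on the same days.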

Consumption situation of all TCMs
The results for all TCMs combined include the annual consumption, the consumption days, and the average daily consumption. A total of 408,250 people consumed TCMs from these Shanghai hospitals and pharmacies in 2019. Table 3 shows that the mean annual consumption, consumption days, and average daily consumption of all TCMs were 12,923 g, 44 d, and 259 g/d, respectively. The distribution of annual consumption among TCM consumers is shown in Fig. 2.

Consumption situation of 20 specific Chinese medicines
Fig. 3 shows that Astragali radix, Atractylodis macrocephalae rhizoma, and Glycyrrhizae radix et rhizoma had the three largest numbers of consumers. More than 250,000 people consumed each of these three Chinese medicines, which means that more than half of all consumers took them. The means and quantiles of the annual consumption of the 20 specific Chinese medicines are shown in Fig. 4. The TCM with the highest consumption is Astragali radix, whose P95 of annual consumption is close to 4000 g/y (Table 4). The average number of consumption days in 2019 ranged from 25 to 44 days across the individual Chinese medicines (Table 5). Table 6 shows the average daily consumption of each TCM on the days it was consumed.
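The summary statistics reported above (means and upper quantiles such as P95) can be reproduced from per-consumer values with a few lines of code. The sketch below uses Python's standard `statistics` module rather than the R functions used in the study, and the quantile definition it applies (linear interpolation with the "inclusive" method) is an assumption, since the text does not state which definition was used. The function name `mean_and_p95` is hypothetical.

```python
import statistics


def mean_and_p95(values):
    """Return (mean, 95th percentile) of a sequence with at least two values."""
    mean = statistics.fmean(values)
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%; the last one is P95.
    p95 = statistics.quantiles(values, n=20, method="inclusive")[18]
    return mean, p95


# Made-up annual consumption values in grams for 101 consumers: 0, 1, ..., 100.
example_mean, example_p95 = mean_and_p95(list(range(101)))
print(example_mean, example_p95)
```

Different quantile definitions give slightly different P95 values on small samples, so the chosen method should be stated whenever such figures feed into a risk assessment.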

Discussion
TCM has played a very important role in health protection and disease control in China for thousands of years. China has one of the world's largest medical systems and is also the world's largest producer and exporter of TCM, with an annual production output of $3.5 billion [18]. However, rising global demand has raised concerns about the safety of TCM. Various studies have shown that TCM can be contaminated with mold, pesticides, and heavy metals, in some cases at toxic levels [19]. Safety monitoring and risk assessment of TCM drugs are therefore becoming increasingly important for the internationalization of TCM. In this study, a big data platform for TCM consumption was built, which is essential for the accurate assessment of consumers' exposure to harmful substances [17].
Unfortunately, demographic information such as age, marital status, current height and weight, and education level was not included in the raw data. Compared with food, it is difficult to survey the intake of specific TCMs with a frequency questionnaire, because most consumers do not know their exact intake of specific TCMs. Zuo et al. used retrospective questionnaires to collect data on TCM consumption; their results show that the P95 of the duration of TCM intake was 90 days per year, which is less than the result of this study. The mean and P95 daily TCM consumption amounts were similar to those found here [20]. However, they only counted the overall consumption of all TCMs, not the consumption of individual TCMs. The results of this study indicate that consumption varies greatly among different traditional Chinese medicines.
In summary, Chinese medicine consumption databases have many uses (e.g. estimation of drug intake, risk assessment, and understanding of consumption trends), so constructing them is a vital task.

Conclusions
This study processed and analyzed the TCM consumption data collected in Shanghai in 2019 and constructed a Chinese medicine consumption database (CMCD), which can be used for the risk assessment of TCMs and for understanding TCM consumption trends. Sample collection was conducted with the permission of, and under the guidelines of, the local government. Informed consent was obtained for experimentation with human subjects, and the data of all participants are fully anonymized.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.

Availability of data and materials
The data sets used during the current study are available from the corresponding author on reasonable request.

Fig. 3 Number of consumers of 20 specific TCMs