Constitutional multicenter bank linked to Sasang constitutional phenotypic data

Background: Biobanks are increasingly important in medicine because they give researchers the data needed to demonstrate and validate their findings. In this study, we developed a biobank called the Korea Constitutional Multicenter Bank (KCMB) based on Sasang Constitutional Medicine (SCM). The aim of the KCMB is to lay a foundation for the scientific basis of SCM.
Methods: The KCMB has been under construction since 2006 at 24 Korean medical clinics, collecting questionnaire data, physical measurements, and biological information comprising the results of blood tests and DNA analyses. All participants were prescribed Sasang Constitution (SC)-specific herbal remedies for treatment and showed improvement of their original symptoms, as confirmed by a Korean medicine doctor. Collected data were de-identified using the electronic case report form system. To calculate several SC type-specific tendencies, we used direct standardization and chi-square tests.
Results: The KCMB collected clinical information from 3,711 study participants (1,353 men and 2,358 women) aged over 10 years. The mean age (± standard deviation) was 47.1 (±16.6) and 47.7 (±15.8) years for men and women, respectively. After applying direct standardization, the estimated constitutional distributions for the SC types were as follows: 39.2% for Tae-eumin (TE), 27.1% for Soeumin (SE), 33.7% for Soyangyin (SY), and non-zero but below 0.1% for Taeyangyin (TY). The estimated proportion of TE was about 10 percentage points lower, while those of SY and SE were slightly higher, than the distribution reported by Jema Lee, who established SCM. Based on the participants’ medical history within the KCMB, the SC types had notably different frequencies of some diseases, such as hypertension, diabetes, hyperlipidemia, stroke, and obesity (P < 0.001).
Conclusions: The KCMB may serve to verify and validate SCM theories and practices. It may also provide new insights into SCM mechanisms. The results of studies using the KCMB data may be valuable for making decisions on healthcare policy and developing novel therapies.


Background
Many researchers have studied methods and theories for preventing and diagnosing diseases and for identifying individual differences relevant to personalized medicine. These efforts have led to significant progress in clinical medicine. Biobanks have played a vital role here because they provide the biological and clinical information that research needs to identify useful biomarkers [1].
The UK Biobank was established and is hosted by the University of Manchester and supported by the National Health Service (NHS) [2,3]. It aims to build a major resource that can support a diverse range of research intended to improve the prevention, diagnosis, and treatment of illness and the promotion of health throughout society. The China Kadoorie Biobank (CKB) was set up to investigate the main genetic and environmental causes of common chronic diseases in the Chinese population [4,5]. The CKB recruited 512,891 adults aged 30-79 years, of whom 41% were men and 56% were from rural areas; the mean age was 52 years [4]. The Framingham Heart Study was established under the direction of the National Heart, Lung, and Blood Institute (NHLBI) in 1948 [6,7], and by 2012, 2,473 articles based on Framingham Heart Study data had been published in peer-reviewed medical journals. These efforts have helped build the evidence base of Western medicine. In Oriental medicine, by contrast, biobanks have attracted interest only recently.
In Korea, Sasang Constitutional Medicine (SCM) is one of the country's unique traditional medicines [8][9][10][11]. SCM classifies people into four Sasang constitution (SC) types, namely Taeyangyin (TY), Tae-eumin (TE), Soyangyin (SY), and Soeumin (SE). These classifications are based on the characteristics of an individual's physiology, psychology, and physical attributes. The SC type of a person is thought to determine his or her response to different herbal remedies [8]. The medicinal herbs used in SCM are similar to those used in traditional Chinese medicine (TCM), but the basic principles underlying the choice of treatment and the prescription of these remedies are completely different [12]. In SCM, the SC type of the patient is the primary consideration when selecting medicinal herbs and formulae for treatment. In contrast, TCM medicinal herbs are classified according to the therapeutic effects of the herb itself, such as a dispersive quality or a Yin-tonifying quality [13].
Several recent studies have reported correlations between the SC types and an individual's characteristics [14][15][16][17][18][19]. It is thought that knowledge of an individual's SC type can be used to treat disease more effectively and improve the quality of health care. Most of these studies have focused on identifying features that differ among SC types and their biological mechanisms [17,18][20][21][22][23][24][25][26]. An important requirement of such research is access to biological and clinical information collected on the basis of SCM.
In this paper, we introduce the Korea Constitutional Multicenter Bank (KCMB), which builds on previous SCM research [27][28][29][30][31][32][33][34], and show several distinctive disease tendencies with respect to SC types using the biological and clinical data from this biobank.

Data source and subjects
The Korea Constitutional Multicenter Study (KCMS) is an ongoing project designed to establish a database for SCM [35]. As of August 2012, 3,711 participants had been enrolled from 24 Korean medical clinics (KMCs). The KCMS was approved by the Institutional Review Board of the Korea Institute of Oriental Medicine (KIOM) (I-0910/02-001). The Korea Constitutional Multicenter Bank (KCMB) is a biobank linked to phenotypic data from the KCMS. Figure 1 shows the distribution of hospitals contributing to this study. The hospitals are distributed fairly evenly among the different regions of Korea, so the study population is expected to be broadly representative of the Korean population. Figure 1 also shows a concentration of 11 hospitals in a population-dense area where approximately 20% of the Korean population resides.
All participants in the KCMS had documented responses to herbal medicine, as confirmed by a Korean medicine doctor (KMD). To diagnose the SC type, each participant was prescribed an SC-specific herbal remedy for the treatment of their most prominent physical discomfort [21]. After the participant had taken the medicine for 30 days or more, improvement of the original symptoms and the occurrence of adverse effects were recorded. SC types were determined only for participants who showed an obvious improvement in their chief complaints without experiencing any adverse effects such as indigestion, stomachache, or evacuation troubles. Every hospital recruited participants from its own patient population. To ensure the accuracy of the diagnoses, the practitioners who took part in this study were restricted to those with more than 5 years of experience in clinical practice. A more detailed description of the methodology and procedure used to verify SC types has been reported previously [35].

Flow of data collection and validation in the KCMS
We collected and recorded various clinical and biological data. Clinical information was obtained using the Case Report Form (CRF), a self-reported questionnaire developed by KIOM for the standardization of SCM. Every question on the CRF was designed by SCM experts with reference to Jema Lee's book Donguisusebowon [9]. The CRF consists of 7 parts: (1) general information, (2) external appearance, (3) somatotype, (4) personality, (5) general health condition, (6) symptoms, and (7) reaction to medication. The general information part covers personal information, including gender, age, and marital status. Some sections in the reaction-to-medication part record the subjective opinion of the KMD and are written by the KMD.
The biological information comprised the results of blood tests and DNA analyses. The aim of collecting these data was to find genetic and biological characteristics of the different SC types. Genomic DNA was isolated from the peripheral blood of participants and genotyped using the Affymetrix Genome Wide Human SNP array 5.0 [21]. Figure 2 shows a flow chart of the process that the KCMB employed to collect data. The information on every questionnaire was recorded in the KCMB database management system by means of an electronic CRF (eCRF), with data entered by a designated researcher at each hospital. The Clinical Research Associate (CRA) checked and confirmed the accuracy of the data. A data-management expert performed periodic quality-control checks to ensure the accuracy of the recorded data.

Electronic case report form system
The use of eCRFs to gather data in clinical and biological research has grown and is progressively replacing paper-based CRFs [36]. Many reports show that the efficiency of data collection, reporting, query resolution, and validation can be improved by replacing paper-based CRFs with electronic ones [37][38][39]. Several requirements for the development of an eCRF are recommended by the US Food and Drug Administration (FDA) [40] and the Society for Clinical Data Management [41]. In Korea, the requirements for eCRFs were set out by the Korea FDA [42].
In our study, we established an eCRF system developed in compliance with these guidelines [42]. Figure 3 shows the structure of our eCRF system. The system supports various client operating systems and Internet browsers and uses a relational database management system to collect and manage clinical and biological data.

Standardization method
We used the direct standardization method to estimate the distribution of SC types [43][44][45]. Several methods can be used to estimate standardized rates [14,46], and many researchers have used direct and indirect standardization [43,[47][48][49]. Standardization allows comparisons between groups even when the number of individuals in each group varies. Sub-groups are usually defined by age, or by age and gender. For standardization, a weighted sum (or weighted average) of the sub-group-specific rates in the study population is calculated. In the classic mortality setting, for example, direct standardization calculates a weighted average of a region's age-specific mortality rates, where the weights represent the age-specific sizes of the standard population. Indirect standardization instead applies the age-specific mortality rates of the standard population to the region's population to derive the expected counts in that region.
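The two approaches can be written compactly as follows. This is a generic textbook formulation rather than notation taken from the study: the symbols are our own assumptions, with $p_i$ the rate observed in sub-group $i$ of the study population, $N_i$ the size of sub-group $i$ in the standard population, $n_i$ the size of sub-group $i$ in the study population, $r_i$ the rate in sub-group $i$ of the standard population, and $O$ the number of events observed in the study population.

```latex
% Direct standardization: study-population rates weighted by the standard population
\[
  \hat{p}_{\mathrm{direct}} = \sum_{i=1}^{k} w_i \, p_i ,
  \qquad
  w_i = \frac{N_i}{\sum_{j=1}^{k} N_j}
\]
% Indirect standardization: standard-population rates applied to the study population
\[
  E = \sum_{i=1}^{k} n_i \, r_i ,
  \qquad
  \mathrm{SR} = \frac{O}{E}
\]
```

Here $E$ is the expected number of events and $\mathrm{SR}$ is the resulting standardized ratio of observed to expected events.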
In most circumstances, a single easy-to-interpret ratio can be obtained from the directly standardized rate by dividing the expected number in the standard population by the observed number in the standard population over the same time period [50]. However, obtaining such a ratio may be difficult when artificial populations are used, because no observations are available for them. In these circumstances it may be necessary to use the standardized rates themselves, and standardized rates are therefore used in this study.

General characteristics
The resources collected in the KCMB consist of two parts, clinical information and biological information (Table 1). Clinical information was collected for 3,711 participants and biological information for 3,691.
Data from 3,711 participants, 1,353 (36.46%) men and 2,358 (63.54%) women, were collected in the KCMS. The mean age (± standard deviation) was 47.1 (±16.6) and 47.7 (±15.8) years for men and women respectively; the ages were not statistically different (P > 0.05). Table 2 shows detailed characteristics of the study population, and

Frequency of diseases by SC
Participants' medical histories were included in the KCMB. Table 4 shows the frequency of various diseases, including hypertension, diabetes, hyperlipidemia, stroke, and obesity, for each of the SC types. Notably, the frequencies of these diseases differ significantly among the SC types (P < 0.001; see Table 4 for details).
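As an illustration of the kind of chi-square test of independence that underlies such a comparison, the following minimal sketch tests whether disease frequency depends on SC type. It is not the authors' analysis code: the function names are our own, the counts are hypothetical and are not KCMB data, and the closed-form p-value used here is valid only for an even number of degrees of freedom (which holds for three SC types and a yes/no disease status).

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def chi_square_p_value_even_dof(statistic, dof):
    """Upper-tail p-value; this closed form holds only for even degrees of freedom."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form requires a positive, even number of degrees of freedom")
    half = statistic / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(dof // 2))
    return math.exp(-half) * terms


# HYPOTHETICAL counts (not KCMB data).
# Rows: TE, SE, SY; columns: with disease, without disease.
hypothetical_table = [[30, 70], [20, 80], [10, 90]]
statistic, dof = chi_square_independence(hypothetical_table)
p_value = chi_square_p_value_even_dof(statistic, dof)
print(f"chi-square = {statistic:.2f}, dof = {dof}, p = {p_value:.4f}")
```

In practice, a statistics package would normally be used instead of a hand-written p-value, since it handles arbitrary degrees of freedom and small expected counts.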

Constitution frequency
The results from direct and indirect standardization methods are expected to differ slightly. We used the direct standardization method because a standard population was available from the "2010 Korean Population and Housing Census" [51], which encompasses all Koreans and foreigners residing in the territory of the Republic of Korea as of 00:00 on November 1, 2010. This was the 18th population census and the 10th housing census. The census was a statistical survey intended to ascertain the demographics of the entire population as well as the number, structure, distribution, and characteristics of households and housing in the Republic of Korea. After applying the direct standardization method, the estimated constitutional distribution among all participants was as follows: TE 39.2%, SE 27.1%, and SY 33.7%. Because of the small sample size, we excluded the 76 TY-type subjects completely. The distribution rates of the TE, SE, and SY constitutions were 43.5%, 24.1%, and 32.4% for men and 37.3%, 28.7%, and 34.0% for women, respectively. The rate of TE was higher among men than among women, and the rates of SE and SY were slightly lower among men than among women (Table 5).
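The calculation behind a directly standardized distribution can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function name, the two strata, and all counts are hypothetical, and a real analysis would use the age-by-gender strata of the census standard population.

```python
def direct_standardize(study_counts, standard_sizes):
    """Directly standardized category proportions.

    study_counts:   {stratum: {category: count}} observed in the study population
    standard_sizes: {stratum: size} of the same strata in the standard population
    """
    if set(study_counts) != set(standard_sizes):
        raise ValueError("study and standard populations must share the same strata")
    total_standard = sum(standard_sizes.values())
    categories = sorted({c for counts in study_counts.values() for c in counts})
    standardized = {c: 0.0 for c in categories}
    for stratum, counts in study_counts.items():
        stratum_n = sum(counts.values())
        if stratum_n == 0:
            raise ValueError(f"no study participants in stratum {stratum!r}")
        # Weight = share of this stratum in the standard population
        weight = standard_sizes[stratum] / total_standard
        for c in categories:
            # Stratum-specific proportion in the study, weighted by the standard share
            standardized[c] += weight * counts.get(c, 0) / stratum_n
    return standardized


# HYPOTHETICAL numbers (not KCMB or census data): women are over-sampled in the
# study, while the standard population has equal numbers of men and women.
hypothetical_study = {
    "men": {"TE": 50, "SE": 20, "SY": 30},
    "women": {"TE": 80, "SE": 60, "SY": 60},
}
hypothetical_standard = {"men": 500, "women": 500}

result = direct_standardize(hypothetical_study, hypothetical_standard)
for sc_type, proportion in result.items():
    print(f"{sc_type}: {proportion:.1%}")
```

With these made-up numbers, the crude pooled proportion of TE (130 of 300) differs from the standardized one because the study over-represents women relative to the standard population; removing that kind of imbalance is the purpose of standardization.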

Discussion
The KCMB has provided various data for studying the clinical, biological, and outcome-related aspects of SCM. The KCMB may serve to verify SCM theories in these areas and may provide new insights into the mechanisms and course of SCM.
The participants' medical histories in the KCMB show that the frequency of several diseases differs among the SC types. According to previous research, abdominal obesity (AO), hypertension, and diabetes mellitus are associated with specific SC types [23,52,53]. For AO specifically, the TE type was associated with an increased prevalence of AO compared with the SE and SY types in both males and females, even after adjusting for potential confounding variables such as age, BMI, and several chronic diseases [27]. In addition, the TE type was more often afflicted with diabetes mellitus than the SY or SE types [52]. The TE type also exhibited a higher prevalence of hypertension than any other SC type and could act as an independent risk factor for hypertension [54]. Similar patterns are seen in the KCMB data. Because these findings are in line with previous SCM studies, the significantly different frequencies of diseases among SC types may be an important factor to consider when making decisions on healthcare policies or developing new drugs. Many efforts are currently under way to establish a biological understanding of the SC types. So far, however, biological results remain scarce.
In the future, if more SCM-based biological resources are collected, we expect to obtain more meaningful results. Since Jema Lee established SCM, most SCM-related studies have assumed that the SC types follow the distribution described in Donguisusebowon [9]. However, there is no statistical evidence for this assumption, and Jema Lee referred to the distribution of the SC types only briefly. In our study, we used the direct standardization method [44,45] to estimate the distribution of the SC types in Korea based on KCMS participants. Jema Lee described the SC type distribution as follows: TE 50.0%, SE 20.0%, SY 30.0%, and TY being very rare. Our results show the estimated constitutional distribution to be TE 39.2%, SE 27.1%, and SY 33.7% (Table 5). Our estimated proportion of TE is thus about 10 percentage points lower, while those of SY and SE are slightly higher, than the distribution reported by Jema Lee. Several reasons may explain the difference. First, although Jema Lee described the distribution of SC types several times in his books, these descriptions differ slightly from one another, and there is no evidence to substantiate the stated rates. Second, the current Korean population structure differs markedly from the population structure at the time of Jema Lee's book. During the past century, Korea has undergone rapid social change, including the division into North and South, the Korean War, and industrialization. Third, Jema Lee's remarks on the distribution of SC types were not meant to give an exact distribution rate, but rather an overview that could be applied in clinics and used for future reference. It is thus necessary to estimate the distribution of SC types accurately for the current era.

Conclusion
In this paper, we gave an overview of the Korea Constitutional Multicenter Bank (KCMB) and showed several tendencies related to Sasang Constitutional (SC) types. Recently, an increasing number of studies have reported correlations between the SC types and an individual's characteristics. The distribution of SC types estimated using the KCMB could be used to calculate the prevalence of various diseases according to SC type. We expect that further studies using the varied data in the KCMB will advance the scientific basis of SCM.
The results of these studies may be useful and should be considered when making decisions on healthcare policies or developing novel therapies.