The impact of ginsenosides on cognitive deficits in experimental animal studies of Alzheimer’s disease: a systematic review

Background The efficacy of ginsenoside treatment on cognitive decline in individuals with Alzheimer’s disease (AD) has yet to be investigated. In this study, we conducted a systematic review to evaluate the effect of ginsenosides on cognitive deficits in experimental rodent AD models. Methods We identified eligible studies by searching seven electronic databases spanning from January 1980 to October 2014. We assessed the study quality, evaluated the efficacy of ginsenoside treatment, and performed a stratified meta-analysis and meta-regression analysis to assess the influence of the study design on ginsenoside efficacy. Results Twelve studies fulfilled our inclusion criteria from a total of 283 publications. The overall methodological quality of these studies was poor. The meta-analysis revealed that ginsenosides have a statistically significant positive effect on cognitive performance in experimental AD models. The stratified analysis revealed that ginsenoside Rg1 had the greatest effect on acquisition and retention memory in AD models. The effect size was significantly higher for both acquisition and retention memory in studies that used female animals compared with male animals. Conclusions We conclude that ginsenosides might reduce cognitive deficits in AD models. However, additional well-designed and well-reported animal studies are needed to inform further clinical investigations.


Background
Alzheimer's disease (AD), the incidence of which is rapidly increasing worldwide, leads to death within 3 to 9 years after diagnosis. It has been estimated that the number of individuals living to 100 years or more will increase by over 200 % and that the number of individuals surviving to 90-95 years will double between 2000 and 2020 [1]. One of the most significant disabilities associated with AD is cognitive impairment [2]. This impairment interferes with work, relationships, leisure, and activities of daily living and exacts a personal and economic cost that is difficult to quantify. In addition, cognitive impairment in AD is associated with a significant social burden and time commitment on the caregiver [3].
Ginseng, the root of Panax ginseng C.A. Meyer, is a traditional medicinal herb that has been applied widely in Asia for the treatment of aging and memory deterioration [10]. Ginsenosides such as ginsenoside Rg, ginsenoside Rb, and ginsenoside Re are the most important pharmacologically active ingredients in ginseng [11]. Ginsenosides are a class of tetracyclic triterpene glycosides (also known as saponins) [12] that are widely used in herbal medicine and have been shown to attenuate cognitive impairment and improve behavioral symptoms in humans [13]. Recent studies have demonstrated that the long-term consumption of ginsenoside Rg1 improves cognitive performance, decreases the levels of Aβ1-42 protein in the hippocampus of aged senescence-accelerated mouse prone 8 (SAMP8) mice [14], decreases the accumulation of neurofibrillary tangles (NFTs) [15], increases the extracellular secretion of soluble amyloid precursor protein α (sAPPα), and enhances α-secretase activity [16]. Rb1 protects neurons against Aβ1-42-induced neurotoxicity and tau hyperphosphorylation [17]. Furthermore, ginsenoside Rh2 reduces senile plaques and plays an important role in neuronal differentiation [18]. Ginsenoside Rd has been shown to reduce inflammation and alleviate cognitive deficits [19].
The need to conduct systematic reviews of animal experiments, which can provide evidence for the potential translational value of animal models to human disease, has been highlighted [20]. Systematic reviews and meta-analyses of animal studies contribute to the modeling of clinically relevant problems; in particular, such reviews allow decisions regarding the design and conduct of subsequent clinical trials to be based on all existing evidence synthesized in an unbiased manner. Moreover, systematic reviews permit a more objective appraisal of evidence than is allowed by the traditional narrative-style reviews that are more commonly associated with animal research [21]. However, there are few systematic randomized trials and observational studies that have explored the effect of ginsenosides on cognition in humans with AD. Therefore, a meta-analysis was conducted to evaluate the efficacy of ginsenosides for the treatment of cognitive impairment in experimental animal models of AD and explore the impact of the study design and quality on the reported outcomes.

Literature search
In October 2014, seven electronic databases (PubMed, Web of Science, MEDLINE, Embase, Google Scholar, CNKI, and Wanfang) were searched using the terms "Alzheimer's disease" (or "dementia", "Alzheimer disease", "Alzheimer", "Alzheimer's", "Alzheimers") and "ginsenoside" (or "ginseng", "ginseng saponin"). All the searches were limited to literature published between January 1980 and October 2014. This systematic review was limited to the results of animal studies. The reference lists from relevant publications were used to identify further relevant research articles and reviews. Each study had to meet several inclusion criteria (Table 1) to be included in this meta-analysis. To determine their eligibility for inclusion, two investigators (Chenxia Sheng and Weijun Peng) assessed the titles and abstracts of identified articles, and obtained copies of the articles to review the study design and methodology for studies that administered ginsenoside and measured cognitive, behavioral, and motor problems in AD model rodents. Disagreements among the investigators were resolved by consensus following a discussion.

Data extraction
Two investigators extracted information about the studies, including the species; sample size; type of AD model; main experimental groups; type of anesthetic agent; substances used as experimental and control treatments; the dose, method, and timing of ginsenoside administration; and the time of the outcome assessment.
The Morris water maze (MWM) was used to assess cognition in all the studies included in the analysis. If cognition was assessed several times in the study, then only the final assessment was included in the analysis. For cases in which the data were expressed graphically, the investigators attempted to obtain numerical values from the study authors; if these values were not available, then digital ruler software was used to estimate the numerical values from the graphs. For cases in which the data were missing, the investigators contacted the authors and requested the additional information. If the required data were not available, then the study was excluded from the analysis. If one study examined different animal models of AD or ginsenoside doses, then these models or doses were analyzed as separate studies.

Methodological quality
The methodological quality of the studies was assessed based on a checklist modified from the Collaborative Approach to Meta-Analysis and Review of Animal Data from Experimental Studies (CAMARADES), as previously described [22] with minor modifications. The modified CAMARADES included randomization of subjects into treatment groups instead of blinded induction of ischemia (allocation concealment) [23]. One point was tallied for written evidence of each of the following criteria: peer-reviewed publication; randomization of subjects into treatment groups; assessment of dose-response relationship; blind assessment of behavioral outcomes; monitoring of physiological parameters, such as body temperature; calculation of the sample size necessary to achieve sufficient power; statement of compliance with animal welfare regulations; avoidance of anesthetic agents with marked intrinsic neuroprotective properties (e.g., ketamine); statement of potential conflicts of interest; and use of a suitable animal model (Table 2). The study quality also was assessed with secondary criteria as previously described [24] (Table 3). These criteria included study characteristics such as the age, species, and sex of the animals used; the duration of supplementation; and the dose(s) of ginsenoside. These criteria also included an assessment of the internal validity of the study, i.e., performance bias (differences in care provided?); exclusion bias (differences in withdrawal from studies?); detection bias (differences in outcome measurements?); and selection bias (differences in allocation to comparison groups?), as well as an assessment of the external validity of the population, intervention, and outcome.
[Table 1. Eligibility criteria. Recovered items: (1) Ginsenosides were administered. (3) AD treatment group was treated with a pharmacological agent, and a control group was administered a placebo after injury. (3) Treatment group was administered another neuroprotective agent in addition to ginsenoside.]

Statistical analysis
All statistical analyses were conducted in accordance with the Cochrane Handbook for Systematic Reviews of Interventions. For each outcome measure in each study, the standardized mean difference (SMD) was calculated. The SMD is the difference in the mean outcome between the groups divided by the standard deviation of the outcome among the participants; it is therefore expressed in units of standard deviation, which allows data measured on different scales to be combined. Despite the anticipated heterogeneity, the individual SMDs were pooled whenever possible to obtain an overall SMD and 95 % confidence interval (CI).
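The SMD definition above can be illustrated with a minimal sketch. The function name, the use of a pooled standard deviation, the optional small-sample (Hedges-type) correction, and the large-sample standard error formula are assumptions made for illustration; the text does not state which SMD variant was computed. The numbers in the usage lines are invented and are not data from the included studies.

```python
import math

def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c,
                                 small_sample_correction=True):
    """Return (smd, standard_error) for a treatment vs. control comparison.

    Assumption: the SD in the denominator is the pooled within-group SD.
    """
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    smd = (mean_t - mean_c) / pooled_sd
    if small_sample_correction:
        # Hedges-type correction factor; matters for small animal groups.
        smd *= 1 - 3 / (4 * df - 1)
    n = n_t + n_c
    # Common large-sample approximation of the SMD standard error.
    se = math.sqrt(n / (n_t * n_c) + smd ** 2 / (2 * n))
    return smd, se

# Hypothetical example: 8 treated and 8 control animals, invented values.
smd, se = standardized_mean_difference(10.0, 2.0, 8, 8.0, 2.0, 8)
print(f"SMD = {smd:.3f}, 95 % CI = {smd - 1.96 * se:.3f} to {smd + 1.96 * se:.3f}")
```

For outcomes where a lower value is better (for example, a shorter escape latency), the sign convention has to be fixed before pooling so that all SMDs point in the same direction.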
Within- and between-study heterogeneity was assessed using Cochran's Q-statistic. A significant Q-statistic (i.e., p < 0.10) indicated heterogeneity among studies. Heterogeneity also was assessed using the I² statistic, which is derived from Cochran's Q; I² values of 25, 50 and 75 % correspond to low, medium and high levels of heterogeneity, respectively, and values ≤ 50 % indicated an acceptable degree of heterogeneity between studies [25].
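As a rough illustration of how Q and I² relate, the sketch below computes both from per-study effect sizes and their variances using inverse-variance weights. The function name and the example values are hypothetical and are not taken from the included studies.

```python
def cochran_q_and_i2(effects, variances):
    """Return (Q, degrees_of_freedom, I2_percent) for a set of study effects."""
    weights = [1.0 / v for v in variances]
    # Inverse-variance weighted (fixed-effect) pooled estimate.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2

# Hypothetical example with two studies.
q, df, i2 = cochran_q_and_i2([0.2, 0.8], [0.04, 0.04])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f} %")
```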
The presence of small-study effects was investigated with funnel plots and Egger tests. A p < 0.10 on the Egger test indicated the presence of small-study effects. For studies comparing different doses or timing of drug administration to a single control group, the data from all experimental groups were pooled to compare with the control group. The pooled effect size was estimated using fixed- and random-effects models. When there was heterogeneity among studies, the pooled effect size was estimated using a random-effects model.
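A random-effects pooled estimate can be sketched as follows. The text does not name the between-study variance estimator; the DerSimonian-Laird method used here is an assumption, chosen because it is a widely used default. Function name and example values are hypothetical.

```python
import math

def dersimonian_laird(effects, variances):
    """Return (pooled_effect, (ci_low, ci_high), tau2) under a random-effects model.

    Assumption: between-study variance tau2 is estimated by DerSimonian-Laird.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2

# Hypothetical example with two studies.
pooled, ci, tau2 = dersimonian_laird([0.2, 0.8], [0.04, 0.04])
print(f"pooled = {pooled:.2f}, 95 % CI = {ci[0]:.3f} to {ci[1]:.3f}, tau2 = {tau2:.2f}")
```

When tau2 is estimated as 0, the random-effects weights reduce to the fixed-effect weights and the two models give the same pooled estimate.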
Both biological and methodological characteristics were examined in an attempt to explain the possible causes of heterogeneity (i.e., the possible causes modifying the outcome) among the studies. Stratified analyses were performed with experiments grouped according to the following characteristics: species and sex of the animals; anesthetic method; type and dose of ginsenoside; study quality; and the route of drug delivery. The difference between the groups was assessed by partitioning the heterogeneity and using the χ² distribution with n − 1 degrees of freedom (df), where n equals the number of groups. To adjust the significance levels for multiple comparisons, we used a Bonferroni correction [26] [declared significance = 1 − (1 − denoted significance)^(1/number of comparisons)], yielding critical p-values of 0.00215 for the acquisition memory and 0.00394 for the retention memory.
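The two calculations in this paragraph can be sketched as below. The correction formula as printed has the form usually called the Šidák correction, which gives thresholds close to the simple Bonferroni value α/n. The number of comparisons is not stated in the text; the value 13 used here is an assumption, chosen only because it reproduces the reported retention memory threshold. The between-group statistic is computed as total Q minus the sum of within-group Q values under fixed-effect weights, which is one common way to partition heterogeneity. All function names and example data are hypothetical.

```python
def corrected_threshold(alpha, n_comparisons):
    """Threshold from the formula in the text: 1 - (1 - alpha)^(1/n)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_comparisons)

def fixed_effect_q(effects, variances):
    """Cochran's Q around the inverse-variance weighted mean."""
    w = [1.0 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))

def q_between(groups):
    """Return (Q_between, df) for a dict: group name -> (effects, variances)."""
    all_effects = [y for effects, _ in groups.values() for y in effects]
    all_variances = [v for _, variances in groups.values() for v in variances]
    q_total = fixed_effect_q(all_effects, all_variances)
    q_within = sum(fixed_effect_q(e, v) for e, v in groups.values())
    # Compare Q_between with a chi-squared distribution on (groups - 1) df.
    return q_total - q_within, len(groups) - 1

# Assumed number of comparisons (not given in the text).
print(f"threshold for 13 comparisons: {corrected_threshold(0.05, 13):.5f}")
print(f"simple alpha/n for comparison: {0.05 / 13:.5f}")

# Hypothetical subgroup data.
example = {"rat": ([0.2, 0.2], [0.04, 0.04]), "mouse": ([0.8, 0.8], [0.04, 0.04])}
qb, df = q_between(example)
print(f"Q_between = {qb:.2f} on {df} df")
```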
Finally, the impact of several variables (i.e., species and sex of the animals; anesthetic method; type and dose of ginsenoside; study quality; and the route of drug delivery) on the efficacy of ginsenoside was assessed using meta-regression when substantial or considerable heterogeneity existed.
All statistical analyses were performed using Stata software (version 13.0, College Station, Texas, USA) and Review Manager (version 5.3).

Methodological quality of studies
A large variety of tools to assess the quality of animal studies is currently used, but none of these tools focus on internal validity only. Most instruments assess the reporting quality and internal and external validity simultaneously even though the consequences associated with poor reporting, the risk of bias, or the generalizability of the results are different [39].
Therefore, two tools were used to assess the quality of each study: the modified CAMARADES checklist and a set of secondary criteria. Overall, the median CAMARADES quality score for the 12 included studies was poor (5.29; interquartile range: 5-6), with scores ranging from 4 to 8. No study received a score of 0 or 10. Three studies received scores that indicated high quality [27,28,33], and these 3 studies reported the monitoring of physiological parameters during surgical procedures.
One study [34] did not report the randomization of animals into treatment groups. Three studies [31,32,35,36] assessed dose-response relationships. Three studies [30,31,33] stated no potential conflict of interests. Only one study [28] contained a statement that outcome measures were assessed by experimenters who were blind to the treatment condition. Moreover, no study described the sample size calculation to confirm that sufficient power had been achieved. The median quality score indicated that 17.5 out of 21 of the secondary criteria had been reported. The lowest score was 15 items (16.67 %) and the highest score was 19 items (8.33 %). Although treatment blinding and a description of the number of animals that died or were otherwise removed from the study are key measures for assessing the quality of studies, no study reported this information. One study [34] reported that investigators were blind to the treatment condition during the outcome assessment, and none of the papers described randomizing the order of the outcome assessments across the groups.
[Table 3. Secondary quality criteria. Recovered items: Are the characteristics of the study population clear? Is a correct control group present? Were the groups similar at baseline (if not randomized, consider weight, sex, etc.)? Is the age at which supplementation started mentioned? Is the duration of supplementation clear and specified? Is the timing of the supplementation during the day specified and similar in both groups? Are the methods used for outcome assessment the same in both groups? Are animals that died or were otherwise removed from the study reported? Was the outcome assessment randomized across the groups?]

Stratified meta-analysis
Subgroup analyses were conducted to assess the degree to which methodological differences between trials might have systematically influenced the differences observed in the primary treatment outcomes. The overall summary of each subgroup can then be inspected for evidence of variation in the effects of the intervention, which would suggest that the stratifying characteristic is an important source of heterogeneity and may modify the treatment efficacy. Current guidelines recommend at least 10 studies per characteristic for stratifying subgroups [40]. The results of the stratified analyses are described in Table 5. First, the protective effects of ginsenoside Rb, ginsenoside Rg1, ginsenoside Rg2, ginsenoside Rg3, and ginsenoside Rg5 administration on cognitive performance were examined. Rg1 treatment had a significantly greater beneficial effect on acquisition memory and retention memory (χ² = 29.55, df = 4, p < 0.00001, I² = 86.5 % and χ² = 6.15, df = 1, p = 0.001, I² = 83.7 %, respectively) compared with Rb treatment, Rg2 treatment, Rg3 treatment, or Rg5 treatment. Next, the efficacy of different doses of ginsenosides on cognitive performance was analyzed. For both acquisition and retention memory, significant beneficial effects were noted for all doses of ginsenosides. The protective effects of 30 mg or higher doses on acquisition memory were examined to determine whether the effects of higher doses are greater than those of lower doses. A 30 mg or higher dose was associated with a greater beneficial effect than a dose less than 30 mg on acquisition memory; however, no significant differences among doses were detected at the corrected significance threshold (χ² = 11.64, df = 3, p = 0.009, I² = 74.2 %). The protective effects at 10 mg or higher doses on retention memory also were examined to determine whether the effects of higher doses are greater than those of lower doses, and a significant effect was found. A 10 mg dose was associated with a significantly greater beneficial outcome compared with a dose less than 10 mg or more than 10 mg for retention memory (χ² = 12.16, df = 2, p < 0.002, I² = 83.5 %; Fig. 3).
The protective effects of species and sex also were examined. For acquisition memory, although the effect size was higher in studies that used rat models, no significant difference was detected between studies that used rat models and those that used mouse models at the corrected significance threshold (χ² = 6.97, df = 1, p = 0.008, I² = 85.7 %). However, for retention memory, the effect size was significantly higher in studies that used rat models compared with mouse models (χ² = 5.58, df = 1, p = 0.002, I² = 82.1 %). For both acquisition memory and retention memory, the effect size was significantly higher in studies that used female animals compared with male animals (χ² = 16.72, df = 3, p < 0.0008, I² = 82.5 % and χ² = 7.25, df = 1, p = 0.003, I² = 72.4 %, respectively; Fig. 4). The effect of the anesthetic agent and the route of drug delivery also were examined. For acquisition memory and retention memory, although the effect size was higher in studies that used chloral hydrate anesthesia, no significant differences among anesthetic agents were detected (χ² = 7.38, df = 3, p = 0.006, I² = 53.6 % and χ² = 0.03, df = 1, p = 0.87, I² = 0 %, respectively; Fig. 5). For acquisition memory, although the effect size was higher in studies that used intraperitoneal injection, no significant differences in the routes of drug delivery were detected (χ² = 0.17, df = 1, p = 0.68, I² = 0 %; Fig. 5c). Oral gavage was associated with a significantly greater beneficial outcome than intraperitoneal injection for retention memory (χ² = 9.83, df = 1, p = 0.002, I² = 89.8 %; Fig. 5d).
The effect sizes for acquisition and retention memory also were examined relative to the study quality score. Significant differences in effect sizes were observed between lower-scoring and higher-scoring studies for both acquisition (χ² = 25.03, df = 4, p < 0.00001, I² = 84.3 %) and retention (χ² = 29.03, df = 4, p < 0.00001, I² = 86.2 %) memory. The effect size for acquisition memory was maximal for studies with a quality score of 4 (−4.05; Fig. 6a), and the effect size for retention memory was higher for studies with a quality score of 8 (5.98, 95 % CI: −0.63 to 12.59; Fig. 6b) than those with lower scores.

Meta-regression analyses
Meta-regression is an extension of the subgroup analysis that allows for the investigation of the effect of multiple factors simultaneously. The outcome variable is the effect estimate, and the explanatory variables are the study characteristics that might influence the effect size, which are often called the "potential effect modifiers" or covariates.
To further explore the heterogeneity among studies, meta-regression was conducted for the acquisition and retention memory results. For retention memory, the species and sex of the animals, anesthetic method, type and dose of ginsenoside, study quality score, and route of drug delivery explained 49.03 % of the heterogeneity. For acquisition memory, heterogeneity was independent of these factors.
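The idea of regressing effect sizes on a study characteristic can be sketched with a single covariate and inverse-variance weights. This is a simplified fixed-effect analogue, not the random-effects meta-regression that statistical packages typically fit, and the share of Q it reports is not the same quantity as the proportion of between-study variance quoted above. The function name, covariate coding, and data are hypothetical.

```python
def weighted_meta_regression(effects, variances, covariate):
    """Return (intercept, slope, share_of_q_explained) for one covariate.

    Assumption: fixed-effect inverse-variance weights, single covariate.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    if sxx == 0:
        raise ValueError("covariate has no variation across studies")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Total heterogeneity and the part left over after fitting the covariate.
    q_total = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, effects))
    q_resid = sum(wi * (yi - intercept - slope * xi) ** 2
                  for wi, xi, yi in zip(w, covariate, effects))
    explained = 1.0 - q_resid / q_total if q_total > 0 else 0.0
    return intercept, slope, explained

# Hypothetical example: effect size against a quality score offset (0, 1, 2).
a, b, r = weighted_meta_regression([0.2, 0.5, 0.8], [0.04, 0.04, 0.04], [0, 1, 2])
print(f"intercept = {a:.2f}, slope = {b:.2f}, share of Q explained = {r:.2f}")
```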

Publication bias
Finally, the presence of small-study effects, which may reflect publication bias, was assessed. Funnel plots showed an asymmetry for both the acquisition (Fig. 7a) and retention memory (Fig. 7b) data, which provides evidence for small-study effects (Egger regression, p < 0.0001 and p < 0.001, respectively).
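Egger regression can be sketched as an ordinary least-squares fit of the standardized effect (effect divided by its standard error) on precision (one over the standard error); an intercept far from zero suggests funnel plot asymmetry. The sketch returns the intercept and its standard error and leaves the p-value, which requires a t distribution with k − 2 degrees of freedom, to a statistics package. The function name and data are hypothetical.

```python
import math

def egger_intercept(effects, standard_errors):
    """Return (intercept, se_intercept) of Egger's regression; needs >= 3 studies."""
    k = len(effects)
    if k < 3:
        raise ValueError("Egger regression needs at least three studies")
    x = [1.0 / se for se in standard_errors]              # precision
    z = [y / se for y, se in zip(effects, standard_errors)]  # standardized effect
    x_bar = sum(x) / k
    z_bar = sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same standard error")
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar
    # Residual variance and the usual OLS standard error of the intercept.
    resid_var = sum((zi - intercept - slope * xi) ** 2
                    for xi, zi in zip(x, z)) / (k - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / k + x_bar ** 2 / sxx))
    return intercept, se_intercept

# Hypothetical data in which smaller studies (larger SE) report larger effects.
ses = [0.1, 0.2, 0.4, 0.5]
effects = [0.5 + 2.0 * se for se in ses]
intercept, se_int = egger_intercept(effects, ses)
print(f"Egger intercept = {intercept:.2f}")
```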

Possible drug protection mechanism analysis
All studies that were included in the analysis assessed the biological mechanisms of ginsenoside activity. Across studies, the neuroprotective effect of ginsenoside was attributed primarily to anti-inflammatory activity [28,32,34]. Ginsenoside was reported to promote the non-amyloidogenic cleavage of beta-amyloid precursor protein (APP) [18], attenuate Aβ formation [29], decrease Aβ levels, attenuate hippocampal histopathological abnormalities [30], prevent tau hyperphosphorylation via the regulation of p-GSK3 and serine/threonine-specific protein phosphatase 2A levels [31], activate the endoplasmic reticulum signaling pathway, inhibit the activity of acetylcholinesterase [37], and upregulate the expression of nerve growth factor [36].

Discussion
Many animal experiments are performed to inform human health, and may play an important role in the identification and development of drugs, medical devices, and surgical procedures; risk assessments for safe human exposures; and increasing biological knowledge. It would seem rational to critically review the existing relevant animal experiments before new animal experiments and, in particular, clinical trials in humans are performed. Systematic reviews and meta-analyses are suitable tools to summarize the current evidence on a given subject, and therefore directly support the 'three Rs' (i.e., replacement, reduction, and refinement) by, for example, preventing the unnecessary duplication of animal studies. Systematic reviews and meta-analyses play an important role in physics, the social sciences, and medicine [41].
The results of this systematic review and meta-analysis show that ginsenoside provides neuroprotective effects in terms of improving cognitive outcomes in AD. Ginsenoside Rg1 exhibited the highest protective effect on both acquisition and retention memory. The species and sex of the animals, the type and dose of ginsenoside, and the study quality all had significant impacts on the effect size. In contrast, the route of drug delivery and the anesthetic method had no significant effects on the outcomes. Our analysis also suggests that some aspects of the original study design had an impact on the study outcome. First, the effect size was higher in rat studies than in mouse studies, which suggests that different species may react differently to ginsenoside. Second, the route of administration and dosage of ginsenoside also affected the outcome [42]. The protective effect on acquisition memory was better with doses of 30 mg or higher (although not significantly) and the protective effect on retention memory was better with doses of 10 mg. These results are not consistent with the dose-linear response curve described previously, in which higher doses yielded a greater response [43]. The effect size probably was overstated in studies that administered lower doses.
The methodological quality of the studies was assessed according to standards previously described for the preclinical development of neuroprotective drugs with minor modifications [22,24]. In general, the quality of the included studies was poor. For example, only one study reported that investigators were blind to the treatment condition during the behavioral assessments. Treatment blinding is recommended in open-label trials to reduce bias. If patients, clinicians, or assessors are aware of the treatment assignment, then this knowledge may influence the reporting or measurement of the outcome and introduce bias [44]. Moreover, all the studies failed to report the calculation of the sample size necessary to achieve sufficient power, which is crucial to judge the efficacy of a new therapy or drug [45]. Unfortunately, the reporting of sample size elements specific to these random trials remains below that necessary for transparent reporting. The authors should calculate the sample size during the planning phase of the study to assess the accuracy of the a priori estimates and aid the design of future trials. In addition, journal editors and peer reviewers should implement stricter requirements for authors to follow CONSORT recommendations [46]. All studies failed to report animals that died or were otherwise removed from the study. As these events may have been due to medication side effects, this information is important in assessing the use of ginsenoside.
Systematic reviews can use research data from numerous study designs. However, when conducting systematic reviews of interventions, studies conducted using low evidentiary designs for evaluating real-world efficacy are generally not appropriate for inclusion [47]. In our systematic review, high quality studies showed a trend toward better acquisition memory outcomes, but studies with lower quality scores exhibited the highest protective effect on retention memory. The effect size was probably overstated in studies with lower quality scores.
Our study has several limitations. First, although we conducted a thorough literature search, we did not conduct a search of older data that has not been indexed in an electronic/online database. Our analysis is based only on published data or academic dissertation data available online, most of which showed positive results; therefore, our study may have missed 'negative' results. In addition, positive results, which are easier to publish, often appear in journals with higher impact factors than negative results [48]. The funnel plots and Egger tests suggest the possibility of a publication bias or other small-study bias, which is consistent with observations from other systematic reviews of animal studies [49]. Publication bias, which is considered a potential threat to the validity of all systematic reviews that include experimental studies, may have led to an overestimation of the protective effect of ginsenoside in our study. Second, we focused only on the effect of ginsenoside on cognitive deficits in AD. We did not conduct analyses to investigate the effect on histopathology, such as plaques and tangles, due to insufficient data. Third, due to language barriers, we only searched databases for articles published in English or Chinese, and did not search for studies published in other languages such as Korean. Korea is one of the main countries that uses ginseng, and therefore we may have missed some relevant publications. In addition, the animals used in the included studies were young rodents, which is not consistent with the average age of humans in the relevant clinical setting of AD. There remain some unknown factors that contributed to the heterogeneity of the effect size in our study. The number of preclinical experiments performed each year continues to increase, and our understanding of the disease mechanism is improving. 
However, the number of novel interventions that reach the clinic to treat cerebrovascular diseases continues to fall due to limitations in the translational paradigm [50]. The standardization of animal protocols and the systematic review of animal models that do not currently qualify as predictive modalities for human responses to drugs and disease are supported by experts in various fields of science [51]. Therefore, these limited results may not be adequate for the transition from animal experiments to human clinical trials. Consequently, prior to making any clinical practice recommendations, high methodological reporting and quality control experimental studies are needed to better evaluate the impact of this promising pharmacological intervention for AD.

Conclusion
This systematic review and meta-analysis indicates that treatment with ginsenoside can alleviate cognitive deficits in experimental animal models of AD. Although some factors, such as the study quality and a potential publication bias, may undermine the validity of these positive findings, ginsenosides may play a potential neuroprotective role in AD. However, without rigorous, robust, and detailed preclinical evaluations, novel neuroprotective drugs may prove to be ineffective when tested in large, time-consuming, and expensive human clinical trials. Therefore, additional well-designed and well-reported experimental animal studies are needed.