CamMedNP: Building the Cameroonian 3D structural natural products database for virtual screening
© Ntie-Kang et al.; licensee BioMed Central Ltd. 2013
Received: 19 January 2013
Accepted: 10 April 2013
Published: 16 April 2013
Computer-aided drug design (CADD) often involves virtual screening (VS) of large compound datasets, and the availability of such datasets is vital for drug discovery protocols. We present CamMedNP, a new database beginning with more than 2,500 compounds of natural origin, along with some of their derivatives obtained through hemisynthesis. These are pure compounds that have previously been isolated and characterized using modern spectroscopic methods and published by several research teams spread across Cameroon.
In the present study, 224 distinct medicinal plant species belonging to 55 plant families from the Cameroonian flora have been considered. About 80% of the compounds have been previously published and/or referenced in internationally recognized journals. For each compound, the optimized 3D structure, drug-like properties, plant source, collection site and currently known biological activities are given, as well as literature references. We have evaluated the “drug-likeness” of this database using Lipinski’s “Rule of Five”. A diversity analysis has been carried out in comparison with the ChemBridge diverse database.
CamMedNP could be highly useful for database screening and natural product lead generation programs.
Keywords: 3D structures, Database collection, Natural products, Medicinal plants, Virtual screening
For more than four millennia, plants have been used as a source of medication. According to the World Health Organization (WHO), phytomedicine is a part of health care systems around the world, and its importance is underscored by the fact that by 1990 about 80% of drugs were either natural products (NPs) or analogues inspired by them [2–4]. Moreover, a large proportion of natural products are biologically active and have favourable ADME/T (absorption, distribution, metabolism, excretion, and toxicology) properties, even though they often do not satisfy proposed “drug-likeness” criteria. Thus, modern drug discovery programs often resort to natural sources to guide the careful design of “drug-like” leads from suitable scaffolds, often by synthetic modification of the latter [6, 7]. Computer-aided drug design (CADD) has become a very important part of the drug discovery process. CADD methods often incorporate the virtual screening (VS) of large compound databases against validated drug targets, followed by the careful selection of virtual hit compounds to be screened in biological assays. This strategy considerably narrows down the number of compounds that undergo biological screening and hence drastically cuts the cost of discovering a drug [7–10]. The adoption of this strategy has therefore necessitated the development of databases of virtual compounds. In addition to the increasing number of commercial natural compound suppliers, the past decade has seen the development and publication of a number of NP compound databases: the SuperNatural database; the Chinese traditional medicinal herbs database; marine natural products databases [14, 15]; the NAPROC-13 database; a database for the predicted pharmacophoric features of medicinal compounds isolated from medicinal plants in India [17, 18]; and the PHARM database, based on Thai medicinal plants.
The fact that the African flora, and the Congo Basin in particular, holds enormous potential as a source of drugs for its poverty-stricken populations and the world at large cannot be overemphasized [20–22]. Located in the Congo Basin, Cameroon has a rich rain forest, and most of its rural populations still depend on medicinal plants for the treatment of a number of tropical diseases [21, 22]. Interest in these plants led to the creation of the Department of Organic Chemistry at the University of Yaoundé (now Yaoundé I, UY), whose NP research groups have since been actively involved in the isolation and characterization of active principles from medicinal plants that could serve as drug leads. Moreover, the research teams born in UY have served as the nursery for the training of more than 90% of the current leaders of the various NP research groups spread throughout the country. For more than four decades, Cameroonian research groups have been actively involved in the extraction, purification and characterization of biologically active compounds from medicinal plants. The result has been a steady annual increase in the volume of scientific publications. It must, however, be noted that only the most promising compounds are usually published in internationally recognized peer-reviewed journals. The seemingly uninteresting ones are only reported in local journals as well as in MSc and PhD theses. The locally published and unpublished data nevertheless constitute an enormous wealth of knowledge that has remained unavailable to the wider scientific community. To the best of our knowledge, a searchable 3D compound database of pure compounds from Cameroonian medicinal plants has not been previously reported. Even though the chemical structures of about 80% of the compounds in CamMedNP are published in journal articles, the presence of 3D structures makes the present database valuable for molecular modelling groups carrying out VS and CADD.
Moreover, little effort has been made locally to develop the expertise (in areas such as medicinal chemistry) that is required to mount credible drug development efforts. It is therefore of value to present a comprehensive data review, based on the published and unpublished results of the various research groups. The goal has been to prepare a database containing 3D structures as well as the physico-chemical properties, geographical distribution of the plant species and the known biological activities of these compounds. In this paper, we present CamMedNP, a new database of 3D chemical structures, available in several file formats (.mdb, .ldb, .mol2, .sdf) that are readable by several drug discovery software tools. Thus, CamMedNP could be used by research groups involved in CADD to carry out protein-ligand docking, pharmacophore mining, substructure searching and VS against validated drug targets. Since these plants have been used traditionally in the treatment of several medical disorders, the aim of VS would be to identify suitable compound scaffolds which could be subjected to further investigation in the search for lead compounds for the treatment of these and other diseases. An assessment of the “drug-likeness” of the CamMedNP database in comparison with the Dictionary of Natural Products (DNP) is also reported here, as well as a diversity analysis in comparison with the ChemBridge diverse database.
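As an illustration of how the .sdf format mentioned above can be consumed outside dedicated modelling software, the following is a minimal sketch of a reader written in plain Python. It assumes V2000-style records terminated by `$$$$` and data items introduced by `> <FIELD>` lines. The function names and any field names (such as `MW`) are hypothetical and are not taken from the CamMedNP file itself.

```python
def parse_sdf(text):
    """Split SDF text into records and return a list of dictionaries."""
    records, block = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":  # record terminator in SDF files
            if block:
                records.append(_parse_record(block))
            block = []
        else:
            block.append(line)
    return records


def _parse_record(lines):
    """Extract the title line, the atom count and the data items of one record."""
    record = {"name": lines[0].strip(), "n_atoms": None, "properties": {}}
    # Assumption: V2000 counts line, atom count in the first three columns of line 4.
    if len(lines) > 3 and "V2000" in lines[3]:
        record["n_atoms"] = int(lines[3][0:3])
    i = 0
    while i < len(lines):
        line = lines[i]
        start = line.find("<")
        end = line.find(">", start + 1)
        if line.startswith(">") and start != -1 and end != -1:
            key = line[start + 1:end]
            i += 1
            value = []
            # A data item's value runs until the next blank line.
            while i < len(lines) and lines[i].strip():
                value.append(lines[i].strip())
                i += 1
            record["properties"][key] = "\n".join(value)
        i += 1
    return record
```

For real screening work, an established cheminformatics toolkit would normally be preferred; this sketch only shows how the record structure is organized.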
Construction and content
The plant sources, geographical collection sites, chemical structures of pure compounds and their spectroscopic data were retrieved from literature sources comprising 25 PhD and Habilitation theses from the libraries of the University of Yaoundé I (UY), University of Douala (UD), University of Dschang (UDs) and the University of Buea (UB), all in Cameroon. International peer-reviewed journal sources include 48 journals, with references ranging from 1982 to 2012. This constitutes a total of 364 journal references, as well as 4 unpublished conference presentations (from personal communication with the authors). The full list of journals consulted is given in the supplementary material (Additional file 1).
Generation of 3D Models, Optimization and Calculation of Molecular Descriptors
Based on the known chemical structures of the NPs, all 3D molecular structures were generated using the graphical user interface (GUI) of the MOE software running on a Linux workstation with a 3.5 GHz Intel Core2 Duo processor. The 3D structures were generated using the builder module of MOE, and energy minimization was subsequently carried out using the MMFF94 force field until a gradient of 0.01 kcal/mol was reached. The 3D structures of the compounds were then saved as .mol2 files, subsequently included in a MOE database (.mdb) file and converted to other file formats (.sdf, .mol, .mol2 and .ldb), which are suitable for use in several virtual screening workflow protocols. The molar weight (MW), number of rotatable bonds (NRB), lipophilicity parameter (log P), number of hydrogen bond acceptors (HBA), number of hydrogen bond donors (HBD), number of Lipinski violations, total polar surface area (TPSA), number of nitrogens (NN), number of oxygens (NO), number of chiral centres (NCC), and number of rings (NR) were calculated using the molecular descriptor calculator included in the QuSAR module of the MOE package. The ChemBridge Diverset database (48,651 compounds) was downloaded from the official ChemBridge webpage. It is noteworthy that the provided 3D structures are those published in the literature, based on NMR and other spectroscopic techniques. Standard software programs that implement typical virtual screening workflows usually begin with a preliminary treatment of the input ligand structures, namely tautomer generation and correct protonation at physiological pH. It is the user’s responsibility to carry out this preliminary step during the virtual screening protocol.
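One of the descriptors listed above, the number of Lipinski violations, can be derived from four of the others (MW, log P, HBA and HBD). The sketch below is not the MOE implementation. It simply applies the commonly quoted “Rule of Five” thresholds (MW above 500 Da, log P above 5, more than 10 acceptors, more than 5 donors) to pre-computed descriptor values. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mw: float      # molar weight in Da
    log_p: float   # calculated log P
    hba: int       # number of hydrogen bond acceptors
    hbd: int       # number of hydrogen bond donors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Rule of Five criteria are violated."""
    checks = (
        d.mw > 500,    # assumption: exactly 500 Da is not counted as a violation
        d.log_p > 5,
        d.hba > 10,
        d.hbd > 5,
    )
    return sum(checks)


def is_drug_like(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed convention: accept compounds with at most one violation."""
    return lipinski_violations(d) <= max_violations
```

A compound with hypothetical values such as `Descriptors(mw=350.4, log_p=3.2, hba=5, hbd=2)` would have no violations under these thresholds.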
Plant families and their chemo-taxonomic classification
Secondary metabolites isolated and their known biological activities
The reported biological activities of the compounds in CamMedNP have also been included in our database. From our study, it was observed that even though the biological activities of 54.9% of the compounds have not been determined, the remaining compounds show a wide range of reported activities. The known biological activities include non-specific classifications such as antimalarial, antileishmanial, antitubercular, antitrypanosomal, anti-HIV, anti-inflammatory and analgesic, antioxidant, free radical scavenging, antiproliferative, cytotoxicity, erythrocyte susceptibility, spasmogenic, antidiabetic, herbicidal, hepatoprotective, cardiovascular, immunoinhibition, immunomodulatory, antisalmonellal, vasodilator, vasorelaxant and hypertensive effects and activity against Onchocerca gutturosa. More specific descriptions, such as inhibition or modulation of known drug targets, include: α-glucosidase inhibition, butyrylcholinesterase inhibition, urease inhibition, inhibition of phosphodiesterase I and xanthine oxidase, monoamine oxidase inhibition, prolyl endopeptidase and thrombin inhibition, antitumor, 11β-hydroxysteroid dehydrogenase inhibition, cholinesterase inhibition, prolyl endopeptidase I inhibition, inhibition of neuraminidases from Clostridium perfringens and Vibrio cholerae, snake venom phosphodiesterase I inhibition, inhibition of phospholipase C, β-D-glucosidase, β-glucuronidase and α-D-mannosidase inhibition, etc., cytotoxicity against Mucor miehei and Artemia salina, algicidal activity against Chlorella fusca, etc.
against Bacillus subtilis ATCC 6633, activity against gram-positive Bacillus megaterium, etc., cytotoxic activity against the HT-29 and HCT 116 human colon cancer cell lines, against colorectal human cancer cells, against human promyelocytic leukemia (HL-60), human hepato-cellular carcinoma and against the human Caucasian prostate adenocarcinoma cell line PC-3, enhancement of cAMP-regulated chloride conductance of cells expressing CFTRΔF508, human neutrophil respiratory burst inhibition, and estrogenic activities. For the majority of cases, antiplasmodial activity was measured by inhibition of the chloroquine-resistant W2 P. falciparum strain with IC50 < 5 μM. Cytotoxicity measurements were carried out using the potato disk tumor induction assay, and some compounds showed interesting inhibitory properties against human DU-145 and hepatocarcinoma Hep G2 cells, with >70% inhibition at 50 μg/mL. Anti-salmonellal activity was assessed by minimum inhibitory concentration (MIC) and minimum bactericidal concentration (MBC) values, in μg/mL, against Salmonella typhi, S. paratyphi A and S. paratyphi B, and most compounds showed MBC values below 100 μg/mL. Full details of the other assays can be obtained by consulting the cited references within the database.
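The activity thresholds quoted above mix molar units (μM) and mass concentrations (μg/mL), so comparing them requires the molar weight of the compound. The following is a small sketch of the conversion, based on the relation that 1 μg/mL equals 1000/MW μM when MW is in g/mol. The function names and the MW of 400 g/mol in the usage note are hypothetical and do not refer to any particular CamMedNP compound.

```python
def ug_per_ml_to_um(conc_ug_per_ml: float, mw: float) -> float:
    """Convert a mass concentration in ug/mL to a molar concentration in uM."""
    if mw <= 0:
        raise ValueError("molar weight must be positive")
    return conc_ug_per_ml * 1000.0 / mw


def um_to_ug_per_ml(conc_um: float, mw: float) -> float:
    """Convert a molar concentration in uM to a mass concentration in ug/mL."""
    if mw <= 0:
        raise ValueError("molar weight must be positive")
    return conc_um * mw / 1000.0
```

For a hypothetical compound of MW 400 g/mol, 50 μg/mL corresponds to 125 μM, and 5 μM corresponds to 2 μg/mL.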
Utility and discussion
Discussion of Lipinski’s “drug-likeness” criteria
Comparison with the dictionary of natural products
This improved profile for MW is exactly what is desirable for a more “drug-like” library, according to Lipinski’s criteria. The proportions of the two databases that satisfy Lipinski’s MW criterion (<500 Da) were 73% for DNP and 78% for CamMedNP. The distribution maxima for calculated log P (Figure 5B) were similar for both databases, with the CamMedNP maximum lying between log P values of 3 and 4 and the DNP maximum between 2 and 3. A similar trend was observed for the MW distribution. This showed an enhancement of 11.2% for MW values between 301 and 500 Da of CamMedNP over the DNP and a corresponding 13.1% enhancement for log P values between 2 and 5 units. For HBA and HBD respectively (Figures 5C-D), CamMedNP showed improvements of 18.7% for 3 < HBA < 8 and 10.3% for 0 < HBD < 4 over the DNP. The peak of the HBA distribution for CamMedNP is at 5 acceptors (18.5%), with a significant increase in compounds having 6 or 7 acceptors when compared to the DNP (Figure 5C). Similarly, the peak of the HBD distribution for CamMedNP is at 2 donors (24.5%), with a significant increase in compounds having 1 or 2 donors as compared to the DNP (Figure 5D). The overall summary of the four Lipinski parameters for the two databases thus reveals that the CamMedNP library is more “drug-like” than the DNP. This is an indication that the chances of finding “lead-like” molecules with improved DMPK properties within a library such as CamMedNP are quite significant. The descriptors useful in ADMET prediction will also be included in the searchable version of the CamMedNP database.
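The comparison above rests on the percentage of each library falling within a given property range. The sketch below shows one way such figures could be computed from lists of descriptor values. It assumes that “enhancement” means the difference in percentage points between the two libraries, which the text does not state explicitly. The function names are hypothetical.

```python
def percent_in_range(values, low, high):
    """Percentage of values v with low <= v <= high."""
    if not values:
        raise ValueError("values must not be empty")
    hits = sum(1 for v in values if low <= v <= high)
    return 100.0 * hits / len(values)


def enhancement(library_a, library_b, low, high):
    """Difference, in percentage points, between the share of library_a and
    the share of library_b falling inside [low, high] (assumed definition)."""
    return percent_in_range(library_a, low, high) - percent_in_range(library_b, low, high)
```

Applied to the MW values of two libraries with the range 301 to 500 Da, `enhancement` would return a positive number when the first library is richer in that range.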
Usefulness of the CamMedNP library
The usefulness of the CamMedNP database in lead generation has been exemplified with the docking and pharmacophore-based screening for potential inhibitors of a validated anti-malarial drug target in our laboratory, and the results will be published in a subsequent paper. CamMedNP is constantly being updated; meanwhile a computer program to facilitate the searching of this database is under development and will also be published subsequently. However, 3D structures of the compounds, as well as their physico-chemical properties that were used to evaluate “drug-likeness”, can be freely downloaded as a supplementary file accompanying this publication. In addition, information about compound sample availability can be obtained on request from the authors of this paper or from the pan-African Natural Products Library (p-ANAPL) project [31, 32].
Virtual screening workflows usually involve docking a compound library into the binding site of a target receptor and using scoring functions and binding free energy calculations to identify putative binders. The availability of 3D structures of the compounds to be used for docking is of utmost importance. Therefore the availability of such structures within CamMedNP, as well as their calculated physico-chemical properties and indicators of “drug-likeness” within this newly developed database will facilitate the drug discovery process from leads that have been identified from Cameroonian medicinal plants.
Availability and requirements
3D structures of the compounds, as well as their physico-chemical properties that were used to evaluate “drug-likeness”, can be freely downloaded (for non-commercial use) as a supplementary file accompanying this publication (Additional file 2). Physical samples for testing are available at the various research labs in Cameroon in varying quantities. Questions regarding the availability of compound samples can be addressed directly to the authors of this paper. Otherwise, samples can be obtained from the p-ANAPL consortium, which has a mandate to collect samples of NPs from the entire continent of Africa and make them available for biological screening. This network is being set up under the auspices of the Network for Analytical and Bioassay Services in Africa (NABSA) [31, 32].
WS and SMNE are professors of medicinal chemistry with an interest in CADD, while SMNE also focuses on organic synthesis and on natural product leads from Cameroonian medicinal plants. LMM and JAM are natural product chemists actively involved in the isolation and characterization of secondary metabolites from Cameroonian medicinal plants. FCN is a biochemist/molecular biologist interested in docking and in silico screening. LLL holds a PhD in environmental chemistry and manages a Chemical and Bioactivity Information centre with a focus on developing databases for information from medicinal herbs in Africa. FNK is a PhD student working on CADD under the joint supervision of LCOO and EM. PAO is an MSc student supervised by LMM, MS is a PhD student under the supervision of WS, and JNH is a PhD student supervised by SMNE.
- ADME/T: Absorption, distribution, metabolism, excretion, and toxicology
- CADD: Computer-aided drug design
- CamMedNP: Cameroonian Medicinal Plant and Natural Products Database
- DMPK: Drug metabolism and pharmacokinetics
- DNP: Dictionary of Natural Products
- HBA: Hydrogen bond acceptors
- HBD: Hydrogen bond donors
- log P: logarithm of the octan-1-ol/water partition coefficient
- NABSA: Network for Analytical and Bioassay Services in Africa
- NN: Number of nitrogens
- NO: Number of oxygens
- NR: Number of rings
- NRB: Number of rotatable bonds
- p-ANAPL: pan-African Natural Products Library
- PCA: Principal component analysis
- TPSA: Total polar surface area
This article is dedicated to the memory of the late Professors David Lontsi, Johnson Foyere Ayafor, and Zacharias Tanee Fomum for their significant contributions towards the development of natural product research in Cameroon. Financial support is acknowledged from the German Academic Exchange Service (DAAD) to FNK for his stay in Halle, Germany for part of his PhD. The authors are very grateful to all researchers and librarians who contributed by granting access to useful data and making useful suggestions, and to the referees for criticizing the manuscript. We are also grateful to Mr. Nkoh Jackson Nkoh and Ms. Verkejika Vivian Vera-Nso for assisting in generating some of the 3D structures. The assistance of Dr. Philip N. Judson (Chemical and Bioactivity Information Centre, Leeds, UK) is also acknowledged for proofreading the manuscript.
- Akerele O: Summary of WHO guidelines for the assessment of herbal medicine. Herbalgram. 1993, 28: 13-19.
- Li JWH, Vederas JC: Drug discovery and natural products: end of an era or an endless frontier?. Science. 2009, 325: 161-165. 10.1126/science.1168243.
- Chin YW, Balunas MJ, Chai HB, Kinghorn AD: Drug discovery from natural sources. The AAPS Journal. 2006, 8 (2): E239-E253.
- Potterat O, Hamburger M: Drug discovery and development with plant-derived compounds. In Progress in drug research: natural compounds as drugs. Edited by Petersen F, Amstutz R. 2008, Basel: Birkhäuser Verlag AG, 45-118.
- Quinn RJ, Carroll AR, Pham MB, Baron P, Palframan ME, Suraweera L, Pierens GK, Muresan S: Developing a drug-like natural product library. J Nat Prod. 2008, 71: 464-468. 10.1021/np070526y.
- Newman DJ: Natural products as leads to potential drugs: an old process or the new hope for drug discovery?. J Med Chem. 2008, 51: 2589-2599. 10.1021/jm0704090.
- Harvey AL: Natural products in drug discovery. Drug Discov Today. 2008, 13: 894-901. 10.1016/j.drudis.2008.07.004.
- Koehn FE, Carter GT: The evolving role of natural products in drug discovery. Nat Rev Drug Discov. 2005, 4: 206-220. 10.1038/nrd1657.
- Klebe G: Virtual ligand screening: strategies, perspectives and limitations. Drug Discov Today. 2006, 11: 580-594. 10.1016/j.drudis.2006.05.012.
- Kubinyi H: Structure-based design of enzyme inhibitors and receptor ligands. Curr Opin Drug Discov Develop. 1998, 1: 4-15.
- Fullbeck M, Michalsky E, Dunkel M, Preissner R: Natural products: sources and databases. Nat Prod Rep. 2006, 23: 347-356. 10.1039/b513504b.
- Dunkel M, Fullbeck M, Neumann S, Preissner R: SuperNatural: a searchable database of available natural compounds. Nucleic Acids Res. 2006, 34: D678-D683. 10.1093/nar/gkj132.
- Qiao X, Hou T, Zhang W, Guo S, Xu X: A 3D structure database of components from Chinese traditional medicinal herbs. J Chem Inf Comput Sci. 2002, 42: 481-489. 10.1021/ci010113h.
- Lei J, Zhou J: A marine natural product database. J Chem Inf Comput Sci. 2002, 42: 742-748. 10.1021/ci010111x.
- Blunt JW, Copp BR, Munro MHG, Northcote PT, Prinsep MR: Marine natural products. Nat Prod Rep. 2004, 21: 1-49. 10.1039/b305250h.
- López-Pérez JL, Therón R, del Olmo E, Díaz D: NAPROC-13: a database for the dereplication of natural product mixtures in bioassay-guided protocols. Bioinformatics. 2007, 23: 3256-3257. 10.1093/bioinformatics/btm516.
- Daisy P, Singh SK, Vijayalakshmi P, Selvaraj C, Rajalakshmi M, Suveena S: A database for the predicted pharmacophoric features of medicinal compounds. Bioinformation. 2011, 6 (4): 167-168. 10.6026/97320630006167.
- Pitchai D, Manikkam R, Rajendran SR, Pitchai G: Database on pharmacophore analysis of active principles, from medicinal plants. Bioinformation. 2010, 5 (2): 43-45. 10.6026/9732063000543.
- Sangma C, Chuakheaw D, Jongkon N, Saenbandit K, Nunrium P, Uthayopas P, Hannongbua S: Virtual screening for anti-HIV-1 RT and anti-HIV-1 PR inhibitors from the Thai medicinal plants database: a combined docking with neural networks approach. Comb Chem High Throughput Screen. 2005, 8 (5): 417-429. 10.2174/1386207054546469.
- Hostettmann K, Marston A, Ndjoko K, Wolfender JL: The potential of African plants as a source of drugs. Curr Org Chem. 2000, 4: 973-1010. 10.2174/1385272003375923.
- Kuete V, Efferth T: Cameroonian medicinal plants: pharmacology and derived natural products. Frontiers in Pharmacology. 2010, 1: 123.
- Kuete V: Potential of Cameroonian plants and derived products against microbial infections: a review. Planta Med. 2010, 76: 1479-1491. 10.1055/s-0030-1250027.
- Efange SMN: Natural products: a continuing source of inspiration for the medicinal chemist. In Advances in Phytomedicine. Edited by Iwu MM, Wootton JC. 2002, Amsterdam: Elsevier Science, 61-69.
- Chemical Computing Group Inc: Molecular Operating Environment Software. 2010, Montreal
- Halgren TA: Merck molecular forcefield. J Comput Chem. 1996, 17: 490-641. 10.1002/(SICI)1096-987X(199604)17:5/6<490::AID-JCC1>3.0.CO;2-P.
- ChemBridge Corporation: [http://chembridge.com/]
- Lipinski CA, Lombardo F, Dominy BW, Feeney PJ: Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Delivery Rev. 1997, 23: 3-25. 10.1016/S0169-409X(96)00423-1.
- Chapman and Hall/CRC Press: Dictionary of Natural Products on CD-Rom. 2005, London
- Feher M, Schmidt JM: Property distributions: differences between drugs, natural products, and molecules from combinatorial chemistry. J Chem Inf Comput Sci. 2003, 43: 218-227. 10.1021/ci0200467.
- R Core Team: R: A Language and Environment for Statistical Computing. 2012, Vienna: R Foundation for Statistical Computing, [http://www.R-project.org]
- Chibale K, Davies-Coleman M, Masimirembwa C: Drug discovery in Africa: impacts of genomics, natural products, traditional medicines, insights into medicinal chemistry, and technology platforms in pursuit of new drugs. 2012, Springer.
- pan-ANAPL: pan-African Natural Products Library, [http://www.linkedin.com/groups/pANPL-4098579/about]
- The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6882/13/88/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.