Application of the gene-encoded natural diversity components repository (GNDC) in traditional Chinese medicine research
Abstract: Natural components, as products of species' adaptation to their environment and of co-evolution, are an important source for drug discovery. Based on the central dogma of molecular biology, we propose a new paradigm for classifying natural products that divides natural components into "direct gene-encoded components" (nucleic acids and peptides) and "indirect gene-encoded components" (primary and secondary metabolites). Using multi-omics and artificial intelligence technologies, we analyzed and annotated the nuclear and organellar genomes of 1,037 medicinal species recorded in eight authoritative global pharmacopeias, and integrated multidimensional data resources to build the first hundred-million-scale gene-encoded natural diversity components repository for medicinal herbs (GNDC, https://cbcb.cdutcm.edu.cn/gndc/). This paper describes: (1) the construction methodology of GNDC, including data integration strategies, standardized annotation pipelines, and data processing with quality control; (2) core features and data scale: GNDC is currently the world's largest repository of medicinal natural products, holding over 234 million natural components encoded directly or indirectly by genes, organized into four specialized sub-databases: HerbalMDB (secondary metabolites; 2.32 million small-molecule compounds), HerbalPDB (229 million small peptides), HerbalRDB (2.38 million small RNAs), and HerbalCDB (0.26 million carbohydrates); and (3) the application prospects of GNDC in the modernization of traditional Chinese medicine research. By supplying natural components at the hundred-million scale, GNDC opens an unprecedentedly broad "chemical space" for drug discovery. It is expected to drive a shift in drug development from "experience-oriented" to "big data-driven" approaches and to offer a new paradigm for research on the modernization of traditional Chinese medicine.
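The data-scale figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of GNDC itself: the dictionary and function names are hypothetical, and the counts are the rounded values quoted in the abstract, so the computed total (about 233.96 million) agrees with the reported figure of roughly 234 million only up to rounding.

```python
# Rounded sub-database sizes as quoted in the abstract, in millions of entries.
# The names SUB_DATABASES and summarize are illustrative, not a GNDC API.
SUB_DATABASES = {
    "HerbalMDB": 2.32,   # secondary metabolites (small-molecule compounds)
    "HerbalPDB": 229.0,  # small peptides
    "HerbalRDB": 2.38,   # small RNAs
    "HerbalCDB": 0.26,   # carbohydrates
}


def summarize(sizes):
    """Return the total size (millions) and each sub-database's share of it."""
    total = sum(sizes.values())
    shares = {name: count / total for name, count in sizes.items()}
    return total, shares


total, shares = summarize(SUB_DATABASES)
print(f"total: about {total:.2f} million entries")
# List sub-databases from largest to smallest share of the repository.
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.2%}")
```

Under these rounded inputs, small peptides (HerbalPDB) account for the large majority of entries, while the other three sub-databases together contribute only a few percent.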