Acta Pharmaceutica Sinica, 2018, 53(6): 929-937
Cite this article:
A Ji-ye, HE Jun, SUN Run-bin. Multivariate statistical analysis for metabolomic data: the key points in principal component analysis[J]. Acta Pharmaceutica Sinica, 2018, 53(6): 929-937.

Multivariate statistical analysis for metabolomic data: the key points in principal component analysis
A Ji-ye, HE Jun, SUN Run-bin
Jiangsu Province Key Laboratory of Drug Metabolism and Pharmacokinetics, Laboratory of Metabolomics, Jiangsu Key Laboratory of Drug Design and Optimization, State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 210009, China
Abstract:
Metabolomics studies generate multivariate data that are commonly processed and evaluated by principal component analysis. The analysis involves abstract spatial models, complex theoretical calculations, and delicate transformations of the data matrix, so an accurate understanding of the principles and properties of the algorithms is essential. In this article, we review the key and difficult issues in principal component analysis of metabolomics data from ten aspects: principal components; principal component scores; principal component loadings; scaling and weighting; partial least squares projection to latent structures and partial least squares discriminant analysis; orthogonal projection to latent structures; orthogonal bidirectional projection to latent structures; the S-plot; the shared and unique structure plot; and model validation. These topics are presented in concise, accessible language so that metabolomics researchers can better understand the data processing methods, choose an appropriate processing mode, standardize processing procedures, interpret the results competently, and draw reliable conclusions.
Key words: metabolomics; principal component analysis; systems biology; multivariate statistical analysis; principal component
Received: 2017-12-25
DOI: 10.16438/j.0513-4870.2017-1288
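Funding: Key Project of China-New Zealand Intergovernmental Scientific and Technological Cooperation (Treating tinnitus with traditional Chinese medicine: targeting the metabolic network, 2017YFE0109600); National Major New Drug Innovation Program of the 13th Five-Year Plan, "Key technologies for the research and development of component-based and innovative Chinese medicines based on combined pharmacokinetics-pharmacodynamics" (2017ZX09301-013).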