High-Order Correlation Integration for Single-Cell or Bulk RNA-seq Data Analysis
2019-04-26
Journal: FRONTIERS IN GENETICS
ISSN: 1664-8021
Volume: 10
Publication status: Published
DOI: 10.3389/fgene.2019.00371
Abstract: Labeling or quantifying sample types with high quality is a key step in understanding complex diseases, and it remains a challenging task. In sample clustering and classification, it is important to reduce noise in the data and to ensure that the extracted intrinsic patterns agree with the primary data structure. Here we propose an effective data integration framework named HCI (High-order Correlation Integration), which combines a high-order correlation matrix with pattern fusion analysis (PFA) to extract features from high-dimensional data. On the one hand, the high-order Pearson's correlation coefficient can highlight the latent patterns underlying noisy input datasets and thus improve the accuracy and robustness of the algorithms currently available for sample clustering. On the other hand, PFA can efficiently identify intrinsic sample patterns from different input matrices by optimally adjusting the signal effects. To validate the method, we first applied HCI to four single-cell RNA-seq datasets to distinguish cell types, and found that HCI identifies the previously known cell types of single-cell samples from scRNA-seq data with higher accuracy and robustness than other methods under different conditions. Second, we integrated heterogeneous omics data from TCGA and GEO datasets, including bulk RNA-seq data, and HCI outperformed the other methods at identifying distinct cancer subtypes. In an additional case study, we constructed the mRNA-miRNA regulatory network of colorectal cancer based on the feature weights estimated by HCI, in which the differentially expressed mRNAs and miRNAs were significantly enriched in well-known functional sets of colorectal cancer, such as KEGG pathways and IPA disease annotations. All these results support that HCI has extensive flexibility and applicability in sample clustering with different types and organizations of RNA-seq data.
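The abstract mentions a high-order Pearson's correlation coefficient without giving its definition. As a rough illustration only, one common reading is that a first-order matrix holds the pairwise Pearson correlations between samples, and each higher order is the Pearson correlation computed on the previous matrix. The sketch below assumes that reading; the function name `high_order_correlation`, the `order` parameter, and the features-by-samples input layout are hypothetical and are not taken from the paper, and the PFA step is not shown.

```python
import numpy as np


def high_order_correlation(X, order=2):
    """Sample-by-sample Pearson correlation matrix of a given order.

    Assumption (not from the paper): order 1 is the plain Pearson
    correlation between samples, and order k is the Pearson correlation
    between the columns of the order k-1 matrix.

    X is a (features x samples) array. Samples with zero variance
    would produce NaN entries and should be filtered out beforehand.
    """
    if order < 1:
        raise ValueError("order must be >= 1")
    # rowvar=False treats each column (sample) as one variable.
    C = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    for _ in range(order - 1):
        # Correlate the correlation profiles of the samples.
        C = np.corrcoef(C, rowvar=False)
    return C


# Hypothetical toy input: 50 features measured on 6 samples.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 6))
C1_demo = high_order_correlation(X_demo, order=1)
C2_demo = high_order_correlation(X_demo, order=2)
```

Either matrix could then be passed to a clustering routine that accepts a similarity matrix. How HCI actually weights and fuses matrices of different orders is not described in this record.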
Keywords: high-order integration clustering single-cell bulk data analysis
Indexed by: SCI ; SCIE
Language: English
Funding project: Natural Science Foundation of Shanghai[17ZR1446100]
WOS research area: Genetics & Heredity
WOS category: Genetics & Heredity
WOS accession number: WOS:000466207300001
Publisher: FRONTIERS MEDIA SA
WOS keywords: GENE-EXPRESSION ; SIGNALING PATHWAYS ; DISCOVERY ; MODULES ; HETEROGENEITY ; EMBRYOS ; COLON ; MAPK
Original document type: Article
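Document type: Journal article
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/37520
Collection: School of Life Science and Technology_Distinguished Professor Group_Chen Luonan Group
Corresponding authors: Zeng, Tao; Chen, Luonan
Author affiliations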
1.Univ Chinese Acad Sci, CAS Ctr Excellence Mol Cell Sci, Inst Biochem & Cell Biol, Shanghai Inst Biol Sci,Key Lab Syst Biol,Chinese, Shanghai, Peoples R China
2.Chinese Acad Sci, CAS Ctr Excellence Anim Evolut & Genet, Kunming, Yunnan, Peoples R China
3.ShanghaiTech Univ, Sch Life Sci & Technol, Shanghai, Peoples R China
4.Shanghai Res Ctr Brain Sci & Brain Inspired Intel, Shanghai, Peoples R China
Corresponding author affiliation: School of Life Science and Technology
Recommended citation
GB/T 7714
Tang, Hui,Zeng, Tao,Chen, Luonan. High-Order Correlation Integration for Single-Cell or Bulk RNA-seq Data Analysis[J]. FRONTIERS IN GENETICS,2019,10.
APA Tang, Hui,Zeng, Tao,&Chen, Luonan.(2019).High-Order Correlation Integration for Single-Cell or Bulk RNA-seq Data Analysis.FRONTIERS IN GENETICS,10.
MLA Tang, Hui,et al."High-Order Correlation Integration for Single-Cell or Bulk RNA-seq Data Analysis".FRONTIERS IN GENETICS 10(2019).
File name: 10.3389@fgene.2019.00371.pdf
Format: Adobe PDF