Seurat单细胞处理流程之七:细胞通讯分析

rm(list = ls())
setwd("/mnt/DEV_8T/zhaozm/seurat全流程/cellchat/")

##禁止转化为因子
options(stringsAsFactors = FALSE)

## 设置保存的目录
## 数据目录
data_dir <- "/mnt/DEV_8T/zhaozm/seurat全流程/cellchat/data"
if (!dir.exists(data_dir)) {
  dir.create(data_dir, recursive = TRUE)
}
## 图片目录
img_dir <- "/mnt/DEV_8T/zhaozm/seurat全流程/cellchat/img"

if (!dir.exists(img_dir)) {
  dir.create(img_dir, recursive = TRUE)
}

## 加载包
library(CellChat)
library(Seurat)
library(readr)
载入需要的程序包:dplyr


载入程序包:‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


载入需要的程序包:igraph


载入程序包:‘igraph’


The following objects are masked from ‘package:dplyr’:

    as_data_frame, groups, union


The following objects are masked from ‘package:stats’:

    decompose, spectrum


The following object is masked from ‘package:base’:

    union


载入需要的程序包:ggplot2

载入需要的程序包:SeuratObject

载入需要的程序包:sp


载入程序包:‘SeuratObject’


The following object is masked from ‘package:BiocGenerics’:

    intersect


The following objects are masked from ‘package:base’:

    intersect, t



载入程序包:‘Seurat’


The following object is masked from ‘package:igraph’:

    components
## 读取注释后的seurat文件
pbmc <- read_rds(file = "../差异富集/data/cd8注释后.rds")
#我们的数据分为两组,敏感组和耐药组,分别提取数据
unique(pbmc$group)

<ol class=list-inline><li>‘Resistant’</li><li>‘Pre’</li><li>‘Sensitive’</li></ol>

RE <- subset(pbmc,group=='Resistant')
SE <- subset(pbmc,group=='Sensitive')
#我们首先分析单个组的通讯情况
RE_input <- GetAssayData(RE, layer = 'data')
RE_meta <- RE@meta.data[,c("group","celltype")]
colnames(RE_meta) <-  c("group","labels")
#保证meta的行名和input matrix的列名是一致的
identical(colnames(RE_input),rownames(RE_meta)) 
# [1] TRUE

TRUE

#创造cellchat对象
RE.cellchat <- createCellChat(object = RE_input, meta = RE_meta, group.by = "labels")
# cellchat = createCellChat(RE, assay = 'SCT',group.by = 'cell_type')## 如果是使用SCT标准化的话

#添加细胞信息
levels(RE.cellchat@idents) # show factor levels of the cell labels
groupSize <- as.numeric(table(RE.cellchat@idents)) # number of cells in each cell group
[1] "Create a CellChat object from a data matrix"


Warning message in createCellChat(object = RE_input, meta = RE_meta, group.by = "labels"):
“The 'meta' data does not have a column named `samples`. We now add this column and all cells are assumed to belong to `sample1`! 
”


Set cell identities for the new CellChat object 
The cell groups used for CellChat analysis are  Cytotox_CD8, Naive_CD8, Tem_CD8, Tex_CD8, Trm_CD8 

<ol class=list-inline><li>‘Cytotox_CD8’</li><li>‘Naive_CD8’</li><li>‘Tem_CD8’</li><li>‘Tex_CD8’</li><li>‘Trm_CD8’</li></ol>

#设置 ligand-receptor interaction 数据库 #在使用CellChat推断细胞之间的通讯之前,需要设置配体-受体相互作用数据库,并识别过表达的配体或受体。 #CellChatDB数据库整理有人和鼠的数据 #CellChatDB v2包含约3300个经过验证的分子相互作用,包括约40%的分泌自分泌/旁分泌信号相互作用,约17%的细胞外基质(ECM)受体相互作用,约13%的细胞-细胞接触相互作用和约30%的非蛋白信号。 #相比于之前的版本,CellChatDB v2包含有更多的受配体数据 #cellchat的CellChatDB数据库也支持用户自行修改

## 选择人的数据库
CellChatDB <- CellChatDB.human # use CellChatDB.mouse if running on mouse data
showDatabaseCategory(CellChatDB)

png

## 也可以从 CellChatDB 数据库中提取 “Secreted Signaling”(分泌信号)相关的配体-受体相互作用信息,而不是使用整个数据库。
CellChatDB.use <- subsetDB(CellChatDB, search = "Secreted Signaling", key = "annotation")
showDatabaseCategory(CellChatDB.use)

png

# cellchat obj中设置使用的数据库
RE.cellchat@DB <- CellChatDB
#预处理细胞-细胞通讯分析的表达数据
#为了推断细胞状态特异性的通信,CellChat在一个细胞组中识别过表达的配体或受体,然后在配体或受体过表达时识别过表达的配体-受体相互作用。
#subsetData可以选择expression data of signaling genes in CellChatDB.use,
#比如选择"Cell-Cell Contact"等。但是需要注意的是,即使你使用全部的数据库,这一步也是需要跑的
RE.cellchat <- subsetData(RE.cellchat) 
future::plan("multisession", workers = 4) # 并行计算
RE.cellchat <- identifyOverExpressedGenes(RE.cellchat)
RE.cellchat <- identifyOverExpressedInteractions(RE.cellchat)
#默认情况下,cellchat使用object@data.signaling进行网络推断
#但是他也提供了projectData函数,将基因投射到PPI,两者各有优势,原作者说PPI并不会带来假通讯
#或者说很低。具体选取哪种方式,可以做个对比,我们这里使用默认的
# cellchat <- projectData(cellchat, PPI.human)
Warning message in getCGroupsRoot(controller = controller):
“Mixed CGroups versions are not supported: ‘cgroup2’, ‘cgroup’”


The number of highly variable ligand-receptor pairs used for signaling inference is 446 

推断的配体-受体对的数量显然取决于计算每个细胞组平均基因表达的方法。默认情况下,CellChat使用一种称为“trimean”的统计稳健均值方法,该方法产生的交互比其他方法少。然而,我们发现CellChat在预测更强的相互作用方面表现良好,这对于缩小相互作用范围以进行进一步的实验验证非常有帮助。 要推断网络,首先要计算celltype的平均基因能表达,而且computeCommunProb推断的时候会设置平均值cut,那么每种方法的cut是不一样的。cellchat提供了多个方法,默认是一种叫做triMean的方法。这种方法呢产生的interaction会少一点,但是都是interaction较强的。那么,假设你的推断中没有出现你生物学意义观测的interaction,可以选择换一种方法。例如truncatedMean,选择较低的cut值。

#推断细胞通讯网络
RE.cellchat <- computeCommunProb(RE.cellchat, type = "triMean")
triMean is used for calculating the average gene expression per cell group. 
[1] ">>> Run CellChat on sc/snRNA-seq data <<< [2025-03-27 10:56:48.102831]"
[1] ">>> CellChat inference is done. Parameter values are stored in `object@options$parameter` <<< [2025-03-27 10:59:29.870763]"
#如果在某些细胞组中只有少数细胞,用户可以过滤掉细胞间通信。默认情况下,每个细胞组中进行细胞间通信所需的最小细胞数为10
#这个过滤是有意义的,毕竟如果只有少出几颗细胞,那么结果太具有偶然性了
RE.cellchat <- filterCommunication(RE.cellchat, min.cells = 10)
table(RE$celltype)
The cell-cell communication related with the following cell groups are excluded due to the few number of cells:  Cytotox_CD8 ! 	0.0% interactions are removed!




Cytotox_CD8   Naive_CD8     Tem_CD8     Tex_CD8     Trm_CD8 
          9          35         654         416        3619 
#在信号通路水平推断细胞间的通讯
#CellChat通过汇总与每个信号通路相关的所有配体-受体相互作用的通信概率来计算信号通路水平上的通信概率。
#NB: The inferred intercellular communication network of each ligand-receptor pair and each signaling pathway is stored in the slot ‘net’ and ‘netP’, respectively.
RE.cellchat <- computeCommunProbPathway(RE.cellchat)

#互作网络整合,可以设置soure和target,我们这里默认了,就是全部的
RE.cellchat <- aggregateNet(RE.cellchat)
# RE.cellchat <- aggregateNet(RE.cellchat,sources.use = )
#至此,整个推断就结束了,过程很简单,比比较快
## 保存一下分析结果
write_rds(RE.cellchat,file = "./data/RE_cd8_cellchat.rds")
## 流程太冗长了,写一个函数吧
fun_cellchat <- function(input_obj,
                        assay= NULL,
                        group.by = NULL,
                        workers,
                        species=c('human','mouse'),
                        CellChatDB.use=NULL,#a character vector, which is a subset of c("Secreted Signaling","ECM-Receptor","Cell-Cell Contact","Non-protein Signaling")
                        PPIuse=F,
                        type="triMean",# c("triMean", "truncatedMean", "thresholdedMean", "median")
                        min.cells = 10
){
  
  
  
  cellchat.obj = createCellChat(input_obj, assay = assay, group.by = group.by)
  
  if(species=='human'){
    
    CellChatDB <- CellChatDB.human
    ppi = PPI.human
  }
  
  if(species =="mouse"){
    
    CellChatDB <- CellChatDB.mouse
    ppi = PPI.mouse
  }
  
  
  if(is.null(CellChatDB.use)){
    
    cellchat.obj@DB <- CellChatDB
    
  }else{
    
    CellChatDB <- subsetDB(CellChatDB, search = CellChatDB.use, key = "annotation")
    cellchat.obj@DB <- CellChatDB
  }
  
  cellchat.obj <- subsetData(cellchat.obj) 
  future::plan("multisession", workers = workers) # 并行计算
  cellchat.obj <- identifyOverExpressedGenes(cellchat.obj)
  cellchat.obj <- identifyOverExpressedInteractions(cellchat.obj)
  
  if(PPIuse==F){
    
    cellchat.obj <- computeCommunProb(cellchat.obj, type = type)
    
  }else{
    
    cellchat.obj <- projectData(cellchat.obj, ppi)
    cellchat.obj <- computeCommunProb(cellchat.obj, raw.use=F, type = type)
  }
  
  
  cellchat.obj <- filterCommunication(cellchat.obj, min.cells = min.cells)
  cellchat.obj <- computeCommunProbPathway(cellchat.obj)
  
  #互作网络整合,可以设置soure和target,我们这里默认了,就是全部的
  cellchat.obj <- aggregateNet(cellchat.obj)
  
  return(cellchat.obj)
  
}

## 调用一下上面的函数
SE.cellchat <- fun_cellchat(SE, group.by = "celltype",
                            workers=4, species='human')
[1] "Create a CellChat object from a Seurat object"
The `data` slot in the default assay is used. The default assay is RNA 
The `meta.data` slot in the Seurat object is used as cell meta information 


Warning message in createCellChat(input_obj, assay = assay, group.by = group.by):
“The 'meta' data does not have a column named `samples`. We now add this column and all cells are assumed to belong to `sample1`! 
”


Set cell identities for the new CellChat object 
The cell groups used for CellChat analysis are  Cytotox_CD8, Naive_CD8, Tem_CD8, Tex_CD8, Trm_CD8 
The number of highly variable ligand-receptor pairs used for signaling inference is 1005 
triMean is used for calculating the average gene expression per cell group. 
[1] ">>> Run CellChat on sc/snRNA-seq data <<< [2025-03-27 14:38:43.867781]"
[1] ">>> CellChat inference is done. Parameter values are stored in `object@options$parameter` <<< [2025-03-27 14:42:21.00054]"
## 保存
write_rds(SE.cellchat,file = "./data/SE_cd8_cellchat.rds")