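I am trying to make a heatmap with hierarchical clustering and a sample dendrogram.
I am trying to follow this particular StackOverflow thread (Merge multiple hclust objects (or dendrograms)).
I have a large data frame: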
head(joined_df_sorted)
# A tibble: 6 × 24
chrom start end EE85756 EE85757 EE85770 EE85775 EE85784 EE87786 EE87787 EE87788 EE87789 EE87790 EE87811 EE87812 EE87813 EE87814 EE87815 EE87893 EE87894 EE87895 EE87896 EE87897
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 chr1 1000001 2000000 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
2 chr1 3000001 4000000 3 4 3 3 3 8 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 chr1 4000001 5000000 3 4 3 3 3 13 3 3 8 3 3 3 3 3 6 3 3 3 3 3
4 chr1 5000001 6000000 3 4 3 3 3 10 3 3 7 3 3 3 3 3 3 3 3 3 3 3
5 chr1 6000001 7000000 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
6 chr1 7000001 8000000 3 4 3 3 3 7 3 3 3 3 3 3 3 3 3 3 3 3 3 3
# ℹ 1 more variable: chromid <chr>
This is what I have done so far:
# Join the first two columns to make a unique row name
joined_df_sorted$chromid <- paste0(joined_df_sorted$chrom, "_", joined_df_sorted$start)
# Keep only the required columns (chromid first, then the 20 sample columns)
joined_df_sorted2 <- as.data.frame(joined_df_sorted[, c(24, 4:23)])
# Use the first column as row names
joined_df_sorted3 <- joined_df_sorted2
joined_df_sorted3X <- joined_df_sorted3[, -1]
rownames(joined_df_sorted3X) <- joined_df_sorted3[, 1]
# Transpose the data frame without changing the data types
install.packages("sjmisc")
library(sjmisc)
joined_df_sorted3X_t <- joined_df_sorted3X %>%
  rotate_df(cn = FALSE)
CanCohortDat <- joined_df_sorted3X_t
BileDuct <- c("EE87786", "EE87787", "EE87788", "EE87789", "EE87790")
Breast <- c("EE87811", "EE87812", "EE87813", "EE87814", "EE87815")
Gastric <- c("EE87893", "EE87894", "EE87895", "EE87896", "EE87897")
Healthy <- c("EE85756", "EE85757", "EE85770","EE85775", "EE85784")
# Cluster each of the 4 cohorts separately
h1 <- hclust(dist(CanCohortDat[BileDuct, ]))
h2 <- hclust(dist(CanCohortDat[Breast, ]))
h3 <- hclust(dist(CanCohortDat[Gastric, ]))
h4 <- hclust(dist(CanCohortDat[Healthy, ]))
# Merge the 4 dendrograms into a single hclust object
hc <- as.hclust(merge(merge(merge(
  as.dendrogram(h1), as.dendrogram(h2)), as.dendrogram(h3)),
  as.dendrogram(h4)))
# Reorder the rows to match the order used for the merged dendrogram
CanCoh <- CanCohortDat[c(BileDuct, Breast, Gastric, Healthy), ]
cohort_annotation <- data.frame(Region = c(rep("BileDuct", length(BileDuct)),
rep("Breast", length(Breast)),
rep("Gastric", length(Gastric)),
rep("Healthy", length(Healthy))),
row.names = c(BileDuct, Breast, Gastric, Healthy))
#Heatmap
pheatmap(CanCoh , cluster_rows = hc,
annotation_row = cohort_annotation)
#############ERROR##############
`use_raster` is automatically set to TRUE for a matrix with more than 2000
columns You can control `use_raster` argument by explicitly setting TRUE/FALSE
to it.
Set `ht_opt$message = FALSE` to turn off this message.
'magick' package is suggested to install to give better rasterization.
Set `ht_opt$message = FALSE` to turn off this message.
Error in hclust(get_dist(t(submat), distance), method = method) :
NA/NaN/Inf in foreign function call (arg 10)
In addition: Warning message:
The input is a data frame, convert it to the matrix.
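The error comes from hclust() receiving NA/NaN/Inf values in a distance matrix, so one way to narrow it down is to look for non-finite values before plotting. The following is a minimal sketch on made-up toy data (the object names toy, mat and mat_clean are hypothetical, and the real cause in the actual matrix may be different). It assumes the data are first converted to a numeric matrix, counts non-finite entries, and checks whether the distances between columns contain NA:

```r
# Toy stand-in for CanCohortDat: 4 samples (rows) x 5 bins (columns).
set.seed(1)
toy <- as.data.frame(matrix(rnorm(20), nrow = 4,
                            dimnames = list(paste0("S", 1:4), paste0("bin", 1:5))))
toy$bin3 <- NA_real_              # a bin with no usable value in any sample

# Convert explicitly instead of relying on the heatmap function to do it.
mat <- as.matrix(toy)
stopifnot(is.numeric(mat))        # a stray character column would fail here

# Count NA, NaN, Inf and -Inf entries.
n_bad <- sum(!is.finite(mat))

# Column clustering needs distances between bins; an all-NA bin yields NA distances.
col_d <- dist(t(mat))
any_bad_dist <- any(!is.finite(col_d))

# One possible remedy (an assumption, not the only option): drop affected bins.
keep <- colSums(!is.finite(mat)) == 0
mat_clean <- mat[, keep, drop = FALSE]
```

If any_bad_dist is TRUE, column clustering on mat would be expected to fail in the same way as above, while mat_clean can be clustered. Whether dropping bins is appropriate depends on the data.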
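It looks like you are using the ComplexHeatmap package (judging from the error message)? As far as I know, ComplexHeatmap can merge the heatmap dendrograms automatically when you split by a grouping variable. For details, see: https://jokergoo.github.io/ComplexHeatmap-reference/book/a-single-heatmap.html#split-by-categorical-variables.
You don't have to merge the dendrograms yourself.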
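As a rough illustration of splitting by a grouping variable, here is a minimal sketch on a made-up matrix. The sample names, the two group labels and the object names are hypothetical, and column clustering is switched off only to keep the example small. It uses the row_split argument of Heatmap(), so rows are clustered within each group:

```r
# Toy data: 8 samples (rows) x 10 bins (columns), in two hypothetical groups.
set.seed(1)
samples <- paste0("S", 1:8)
groups  <- rep(c("GroupA", "GroupB"), each = 4)
mat <- matrix(rnorm(8 * 10), nrow = 8,
              dimnames = list(samples, paste0("bin", 1:10)))

# Only draw the heatmap if ComplexHeatmap is installed.
if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  ht <- ComplexHeatmap::Heatmap(
    mat,
    name = "value",
    row_split = groups,        # rows are clustered separately within each group
    cluster_columns = FALSE,   # skip column clustering in this sketch
    show_column_names = FALSE
  )
  ComplexHeatmap::draw(ht)
}
```

With your data, the grouping vector would be the Region column of cohort_annotation, in the same row order as the matrix.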
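The problem is that something is wrong with your matrix (probably too many missing values). Some issues similar to yours have already been raised on GitHub: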