I am working on a clustering problem. I want to build a dendrogram with hclust and then derive clusters from that dendrogram with cutreeDynamic. That part already works:
# Preprocessing data for only numeric features
omicData_clustering <- omicData
omicData_clustering[[classVariable]] <- clinicDataSVM[[classVariable]]
omicData_clustering <- omicData_clustering[omicData_clustering[[classVariable]] %in% c(changedClass, class), ]
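# drop the id column; logical indexing stays safe even if the column is absent,
# whereas -which(...) would select zero columns in that case
omicData_clustering <- omicData_clustering[,
  !(names(omicData_clustering) %in% idColumn)]
# drop the class column so that only numeric features remain
omicData_num <- omicData_clustering[,
  !(names(omicData_clustering) %in% classVariable)]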
# scale the data
omicData_clustering_scaled <- scale(omicData_num)
# distance matrix (named sample_dist so it does not mask stats::dist)
sample_dist <- dist(omicData_clustering_scaled)
# hierarchical clustering
hc <- hclust(sample_dist, method = "complete")
# number of changed-class samples, used as the minimum cluster size
num <- sum(clinicDataSVM[[classVariable]] == changedClass)
# dynamic tree cut
dynamic_clusters <- cutreeDynamic(hc, distM = as.matrix(sample_dist), minClusterSize = num)
# getting only changed class labels position
labels <- omicData_clustering[[classVariable]]
labels[labels != changedClass] <- ""
Here "dynamic_clusters" has, for example, the following values:
> dynamic_clusters
7 1 4 1 1 3 7 2 4 6 1 1 2 3 3 2 1 2 6 1 1 3 1 2 2 7 1 6 7 1 1 2 1 6 3 7 1 2 7 1 5 2 6 6 7 2 6 6 5 7 3 1 6 5 1 2 2 6 2 1 6 7 4 6 2 1 4 1 6 5 4 4 7 1
4 1 5 1 1 6 4 2 5 3 1 1 2 6 6 2 1 2 3 1 1 6 1 2 2 4 1 3 4 1 1 2 1 3 6 4 1 2 4 1 7 2 3 3 4 2 3 3 7 4 6 1 3 7 1 2 2 3 2 1 3 4 5 3 2 1 5 1 3 7 5 5 4 1
4 7 2 2 1 1 5 1 6 3 4 6 7 5 2 7 5 6 5 1 4 4 7 3 5 2 4 2 6 2 7 1 1 1 2 7 2 2 6 7 6 3 6 7 1 5 2 7 4 2 1 3 7 6 1 4 6 2 2 5 7 3 7 2 7 2 6 1 6 6 1 6 1 1
5 4 2 2 1 1 7 1 3 6 5 3 4 7 2 4 7 3 7 1 5 5 4 6 7 2 5 2 3 2 4 1 1 1 2 4 2 2 3 4 3 6 3 4 1 7 2 4 5 2 1 6 4 3 1 5 3 2 2 7 4 6 4 2 4 2 3 1 3 3 1 3 1 1
3 7 6 4 7 4 2 2 7 7 7 4 4 5 2 3 4 1 2 4 1 1 3 6 2 6 2
6 4 3 5 4 5 2 2 4 4 4 5 5 7 2 6 5 1 2 5 1 1 6 3 2 3 2
And labels contains the following:
> labels
[1] "" "" "" "" "" "" "Control2Case" "" ""
[10] "" "" "" "Control2Case" "" "" "" "" ""
[19] "" "" "" "" "" "" "" "" ""
[28] "" "" "" "Control2Case" "" "" "" "" ""
[37] "" "" "" "" "Control2Case" "" "" "Control2Case" ""
[46] "" "" "" "Control2Case" "" "" "" "" ""
[55] "" "" "" "" "" "" "" "Control2Case" ""
[64] "" "" "" "" "" "" "" "" ""
[73] "" "" "" "" "" "" "" "" ""
[82] "" "" "" "" "" "" "" "" ""
[91] "" "" "" "" "" "" "" "" ""
[100] "" "" "" "" "" "" "" "" ""
[109] "" "" "" "" "" "" "Control2Case" "" ""
[118] "" "" "" "" "" "" "" "" ""
[127] "" "" "" "" "" "" "" "" "Control2Case"
[136] "" "" "" "" "" "" "" "" ""
[145] "" "" "" "" "" "" "" "" ""
[154] "" "" "" "" "" "" "" "" ""
[163] "" "" "" "" "" "" "" "" ""
[172] "" "" "" ""
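The problem is that I want to plot the dendrogram together with the clusters and determine which clusters the "Control2Case" samples belong to. Is that possible?
I tried the following code (from https://cran.r-project.org/web/packages/dendextend/vignettes/dendextend.html):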
library(dendextend) # provides %>%, branches_attr_by_clusters(), color_labels(), colored_bars()
library(dynamicTreeCut)
data(iris)
x <- iris[,-5] %>% as.matrix
hc <- x %>% dist %>% hclust
dend <- hc %>% as.dendrogram
# Find special clusters:
clusters <- cutreeDynamic(hc, distM = as.matrix(dist(x)), method = "tree")
# we need to sort them to the order of the dendrogram:
clusters <- clusters[order.dendrogram(dend)]
clusters_numbers <- unique(clusters) - (0 %in% clusters)
n_clusters <- length(clusters_numbers)
library(colorspace)
cols <- rainbow_hcl(n_clusters)
true_species_cols <- rainbow_hcl(3)[as.numeric(iris[,][order.dendrogram(dend),5])]
dend2 <- dend %>%
branches_attr_by_clusters(clusters, values = cols) %>%
color_labels(col = true_species_cols)
plot(dend2)
clusters <- factor(clusters)
levels(clusters)[-1] <- cols[-5][c(1,4,2,3)]
# Get the clusters to have proper colors.
# fix the order of the colors to match the branches.
colored_bars(clusters, dend, sort_by_labels_order = FALSE)
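But I don't know how to adapt it to my specific problem, because the code contains some lines that are specific to the iris data and I don't understand why they are there.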
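As far as I can tell, only two lines in the vignette code are tied to iris: true_species_cols reads the Species column (column 5) to color the leaf labels, and levels(clusters)[-1] <- cols[-5][c(1,4,2,3)] is a hand-tuned permutation that lines up the bar colors with the branch colors for the clusters iris happens to produce. A rough sketch of a data-agnostic variant follows. It is not the vignette's code: iris stands in for omicData_num, the flagged row indices are made up, base rainbow() replaces colorspace::rainbow_hcl(), and the clusters are drawn as a colored bar rather than as colored branches.

```r
library(dendextend)      # colored_bars()
library(dynamicTreeCut)  # cutreeDynamic()

# Stand-in data (assumption): iris replaces omicData_num,
# and the flagged rows are invented for illustration.
x <- scale(as.matrix(iris[, -5]))
is_changed <- seq_len(nrow(x)) %in% c(7, 13, 31, 41, 44, 49, 62, 115, 135)

d <- dist(x)
hc <- hclust(d, method = "complete")
dend <- as.dendrogram(hc)

# Cluster ids in the original row order; 0 means "unassigned".
clusters <- cutreeDynamic(hc, distM = as.matrix(d),
                         minClusterSize = sum(is_changed))

# One color per cluster id, with grey reserved for id 0.
cluster_palette <- c("grey80", rainbow(max(clusters)))
bar_colors <- cbind(
  cluster = cluster_palette[clusters + 1],
  changed = ifelse(is_changed, "black", "white")
)

# Leave room below the tree for the two bars.
par(mar = c(8, 4, 1, 1))
plot(dend, leaflab = "none")
# With the default sort_by_labels_order = TRUE, colored_bars() reorders the
# rows of bar_colors to the leaf order, so they are passed in data order.
colored_bars(bar_colors, dend, rowLabels = c("cluster", "Control2Case"))
```

Each black tick in the second bar marks a flagged sample, and the color directly above it in the first bar shows the cluster that sample fell into.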
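In the development version of ggalign I have introduced a new cutree argument, which lets you apply any custom function to cut the tree. Just replace the iris data with your own. The result is a ggplot-like object, and you can color the branches through an aesthetic mapping.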
library(ggalign)
#> Loading required package: ggplot2
ggstack(iris[, -5L], "v") +
align_dendro(
aes(color = branch),
cutree = function(tree, dist, k, h) {
dynamicTreeCut::cutreeDynamic(tree, distM = dist, method = "tree")
}
) +
scale_y_continuous(expand = expansion()) +
scale_color_brewer(palette = "Dark2") +
theme(axis.text.x = element_text(angle = -90, hjust = 0))
Created on 2024-10-13 with reprex v2.1.0
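Independently of the plot, the second half of the question (which clusters the "Control2Case" samples fall into) can probably be answered by cross-tabulating the cluster vector against the class labels. This is only a minimal sketch: iris stands in for the real data, the flagged rows are invented, and class_labels and changed_class are illustrative names that play the role of omicData_clustering[[classVariable]] and changedClass.

```r
library(dynamicTreeCut)  # cutreeDynamic()

# Stand-in data (assumption): iris replaces omicData_num,
# and the rows marked "Control2Case" are invented for illustration.
x <- scale(as.matrix(iris[, -5]))
changed_class <- "Control2Case"
class_labels <- rep("Control", nrow(x))
class_labels[c(7, 13, 31, 41, 44, 49, 62, 115, 135)] <- changed_class
is_changed <- class_labels == changed_class

d <- dist(x)
hc <- hclust(d, method = "complete")

# Cluster ids come back in the original row order (0 means "unassigned");
# as.integer() drops any names so the vector is easy to index and tabulate.
clusters <- as.integer(
  cutreeDynamic(hc, distM = as.matrix(d), minClusterSize = sum(is_changed))
)

# Rows: cluster id. Columns: whether the sample is in the changed class.
membership <- table(cluster = clusters, changed = is_changed)
print(membership)

# Cluster ids that contain at least one changed-class sample.
changed_clusters <- sort(unique(clusters[is_changed]))
print(changed_clusters)

# Per-sample view: row index and the cluster it was assigned to.
print(data.frame(row = which(is_changed), cluster = clusters[is_changed]))
```

Because cutreeDynamic returns one id per row of the distance matrix in the original order, the same is_changed vector can index both the data and the clusters without any reordering.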