Input for k-means in R

Question

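I am trying to run k-means on a data frame with 69 columns and 1000 rows. First, I need to determine the optimal number of clusters using the Davies-Bouldin index. This algorithm requires the input to be a matrix, so I first convert the data frame with this code: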

totalm <- data.matrix(total)

This is followed by the code below (Davies-Bouldin index):

clusternumber<-0
max_cluster_number <- 30
#Davies Bouldin algorithm
library(clusterCrit)
smallest <-99999
for(b in 2:max_cluster_number){
a <-99999
for(i in 1:200){
cl <- kmeans(totalm,b)
cl<-as.numeric(cl)
intCriteria(totalm,cl$cluster,c("dav"))
if(intCriteria(totalm,cl$cluster,c("dav"))$davies_bouldin < a){
a <- intCriteria(totalm,cl$cluster,c("dav"))$davies_bouldin }
}
if(a<smallest){
smallest <- a
clusternumber <-b
}
}
print("##clusternumber##")
print(clusternumber)
print("##smallest##")
print(smallest)

I keep getting this error: (list) object cannot be coerced to type 'double'. How can I fix this?

Reproducible example:

a <- c(0,0,1,0,1,0,0)
b <- c(0,0,1,0,0,0,0)
c <- c(1,1,0,0,0,0,1)
d <- c(1,1,0,0,0,0,0)

total <- cbind(a,b,c,d)

Answer 1

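The error comes from cl <- as.numeric(cl). The result of calling kmeans is an object: a list containing various pieces of information about the model. It cannot be coerced to a numeric vector, so drop that line and use cl$cluster directly.

Run ?kmeans for details.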
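As a minimal sketch of what that list looks like, the following uses made-up random data (the matrix `x` and the seed are assumptions for illustration, not part of the question). It inspects the object returned by `kmeans` and shows that `as.numeric` fails on it, while `cl$cluster` is already a plain integer vector.

```r
# Toy data for illustration only (not the data from the question)
set.seed(42)
x <- matrix(rnorm(40), ncol = 2)

cl <- kmeans(x, centers = 2)

class(cl)    # the S3 class of the fitted object
is.list(cl)  # the object is a list underneath
names(cl)    # components such as cluster, centers, tot.withinss

# Coercing the whole list fails; capture the message instead of stopping
err <- tryCatch(as.numeric(cl), error = function(e) conditionMessage(e))
print(err)

# The cluster assignments are an integer vector, one entry per row of x
str(cl$cluster)
```

Only the `cluster` component needs to be passed on to `intCriteria`, so no coercion of the fitted object is required.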

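I would also suggest adding nstart = 20 to your kmeans call. k-means clustering is a random process: the result depends on the randomly chosen starting centers. With nstart = 20, the algorithm is run from 20 random starts and the best fit is kept (for each number of centers).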

for (b in 2:max_cluster_number) {
    a <- 99999
    for (i in 1:200) {
        cl <- kmeans(totalm, centers = b, nstart = 20)
        # cl <- as.numeric(cl)  # removed: cl is a list and cannot be coerced
        # compute the Davies-Bouldin index once per fit
        db <- intCriteria(totalm, cl$cluster, c("dav"))$davies_bouldin
        if (db < a) {
            a <- db
        }
    }
    if (a < smallest) {
        smallest <- a
        clusternumber <- b
    }
}

This gives me

[1] "##clusternumber##"   
[1] 4
[1] "##smallest##"
[1] 0.138675

(I temporarily changed the maximum number of clusters to 4 because the reproducible data set is so small.)
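To illustrate the effect of nstart, here is a small sketch on simulated data (the data, the seeds, and the choice of 5 centers are assumptions for illustration only). It fits the same model once with a single random start and once with 20 starts, then compares the total within-cluster sum of squares; the multi-start fit should typically be no worse.

```r
# Simulated two-blob data, for illustration only
set.seed(1)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2))

# Same seed before each call so both fits share the same first random start
set.seed(123)
fit_single <- kmeans(x, centers = 5, nstart = 1)
set.seed(123)
fit_multi <- kmeans(x, centers = 5, nstart = 20)

# Lower total within-cluster sum of squares means a tighter fit
fit_single$tot.withinss
fit_multi$tot.withinss
```

Because nstart = 20 already restarts the algorithm internally, the outer loop of 200 repetitions in the code above mainly adds further random restarts on top of that.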

Edit: integer error

I was able to reproduce your error using

a <- as.integer(c(0,0,1,0,1,0,0))
b <- as.integer(c(0,0,1,0,0,0,0))
c <- as.integer(c(1,1,0,0,0,0,1))
d <- as.integer(c(1,1,0,0,0,0,0))

totalm <- cbind(a,b,c,d)

This creates an integer matrix.

I was then able to get rid of the error using

storage.mode(totalm) <- "double"
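As a small sketch of why this matters for a real data frame (the data frame `df` below is a made-up stand-in, not the asker's data): `data.matrix` returns an integer matrix when every column is an integer, and `storage.mode<-` changes the underlying type to double while keeping dimensions and column names.

```r
# Hypothetical all-integer data frame standing in for the real data
df <- data.frame(a = c(0L, 0L, 1L), b = c(1L, 1L, 0L))

m <- data.matrix(df)
typeof(m)      # "integer"

storage.mode(m) <- "double"
typeof(m)      # "double"

dim(m)         # dimensions are unchanged
colnames(m)    # column names are kept
```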

Note

total <- cbind(a,b,c,d)
totalm <- data.matrix(total)

The data.matrix call is unnecessary for the data in this example, because cbind already returns a matrix:

> identical(total,totalm)
[1] TRUE
