I'm using the aggregate function to get several statistics at once:
temp<-aggregate(AUClast~RIC+STUD, nca_sim[!is.na(nca_sim$RIC),],
FUN= function(x)
c(N=length(x),
Mean=mean(x),
SD=sd(x),
Median=median(x),
Min=min(x),
Max=max(x)
))
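If I print the output, I see 8 columns (the two row/ID names and 6 statistics). But the aggregate output actually contains only 3 columns (the two ID columns and a single column holding all the statistics). So when I try to assign a name to each column, I get this error message: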
colnames(temp) <-c("Renal Function", "Study", "N", "Mean", "SD", "Median", "Min", "Max")
Error in names(x) <- value :
'names' attribute [8] must be the same length as the vector [3]
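So how do I assign 8 names to the 8 columns? To be clear, I need this so that I can pass the result to "kable" and produce a formatted table with the column names I assign.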
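Here is a solution. The key point is that the last column is a matrix column, not several separate vectors, one for each statistic computed in aggregate.
Note that the way the result matrix is extracted is completely general: this column is always the last one, so its column number is always ncol(temp).
Related reading: the difference between [ and [[.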
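To make the [ versus [[ distinction concrete, here is a minimal sketch on a made-up data frame. The names d, m, and flat are hypothetical and not part of the question's data; the matrix column is built by hand to mimic what aggregate returns.

```r
# Hypothetical toy data: a data frame holding one matrix column,
# built by hand to resemble the result of aggregate()
m <- cbind(N = c(3, 4), Mean = c(22.9, 19.1))
d <- data.frame(g = c("a", "b"))
d$stats <- m

# The matrix counts as a single column of the data frame
ncol(d)

# Single bracket keeps the data.frame wrapper around the matrix
is.data.frame(d["stats"])

# Double bracket returns the matrix itself
is.matrix(d[["stats"]])

# Binding the extracted matrix spreads it into ordinary columns
flat <- cbind(d["g"], d[["stats"]])
names(flat)
```

Because d[["stats"]] is a plain matrix, cbind can split it into one column per statistic, which is what makes the later colnames assignment possible.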
# make up a test data set
df1 <- mtcars[c("mpg", "cyl", "am")]
# it's not strictly needed to coerce the
# grouping columns to factor
df1$cyl <- factor(df1$cyl)
df1$am <- factor(df1$am)
# exact same code as the question's
temp <- aggregate(mpg ~ cyl + am, df1,
FUN = function(x)
c(N = length(x),
Mean = mean(x),
SD = sd(x),
Median = median(x),
Min = min(x),
Max = max(x)
))
# see the result
temp
#> cyl am mpg.N mpg.Mean mpg.SD mpg.Median mpg.Min mpg.Max
#> 1 4 0 3.0000000 22.9000000 1.4525839 22.8000000 21.5000000 24.4000000
#> 2 6 0 4.0000000 19.1250000 1.6317169 18.6500000 17.8000000 21.4000000
#> 3 8 0 12.0000000 15.0500000 2.7743959 15.2000000 10.4000000 19.2000000
#> 4 4 1 8.0000000 28.0750000 4.4838599 28.8500000 21.4000000 33.9000000
#> 5 6 1 3.0000000 20.5666667 0.7505553 21.0000000 19.7000000 21.0000000
#> 6 8 1 2.0000000 15.4000000 0.5656854 15.4000000 15.0000000 15.8000000
# the error
colnames(temp) <-c("Renal Function", "Study", "N", "Mean", "SD", "Median", "Min", "Max")
#> Error in names(x) <- value: 'names' attribute [8] must be the same length as the vector [3]
# only three columns, the last one is a matrix
# this matrix is the output of the anonymous function
# aggregate applies to the data.
str(temp)
#> 'data.frame': 6 obs. of 3 variables:
#> $ cyl: Factor w/ 3 levels "4","6","8": 1 2 3 1 2 3
#> $ am : Factor w/ 2 levels "0","1": 1 1 1 2 2 2
#> $ mpg: num [1:6, 1:6] 3 4 12 8 3 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : NULL
#> .. ..$ : chr [1:6] "N" "Mean" "SD" "Median" ...
# three columns
(nc <- ncol(temp))
#> [1] 3
# extract the 3rd column temp[[nc]], not the data.frame temp[nc]
# and bind it with the other columns. It is also important
# to notice that the method called is cbind.data.frame,
# since temp[-nc] extracts a data.frame. See the link above.
temp <- cbind(temp[-nc], temp[[nc]])
colnames(temp) <-c("Renal Function", "Study", "N", "Mean", "SD", "Median", "Min", "Max")
temp
#> Renal Function Study N Mean SD Median Min Max
#> 1 4 0 3 22.90000 1.4525839 22.80 21.5 24.4
#> 2 6 0 4 19.12500 1.6317169 18.65 17.8 21.4
#> 3 8 0 12 15.05000 2.7743959 15.20 10.4 19.2
#> 4 4 1 8 28.07500 4.4838599 28.85 21.4 33.9
#> 5 6 1 3 20.56667 0.7505553 21.00 19.7 21.0
#> 6 8 1 2 15.40000 0.5656854 15.40 15.0 15.8
Created on 2023-05-06 with reprex v2.0.2
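As a possible alternative, the matrix column can also be flattened with do.call(data.frame, ...), and the display names can be given to kable directly instead of renaming the data frame. This is only a sketch: it assumes the knitr package is installed, reuses the mtcars stand-in data from above, and the names flat and new_names are hypothetical.

```r
# Same stand-in data and aggregate call as in the answer above
df1 <- mtcars[c("mpg", "cyl", "am")]
temp <- aggregate(mpg ~ cyl + am, df1,
                  FUN = function(x)
                    c(N = length(x),
                      Mean = mean(x),
                      SD = sd(x),
                      Median = median(x),
                      Min = min(x),
                      Max = max(x)))

# data.frame() expands a matrix argument into one column per
# matrix column, so the 3-column result becomes 8 columns
flat <- do.call(data.frame, temp)
ncol(flat)

# Display names taken from the question
new_names <- c("Renal Function", "Study", "N", "Mean",
               "SD", "Median", "Min", "Max")

# Assumption: knitr is available; col.names only changes the
# printed header, the data frame itself keeps its own names
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(flat, col.names = new_names, digits = 2))
}
```

Passing col.names to kable leaves syntactic names such as mpg.Mean on the data frame, which can be convenient if the table is processed further before printing.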