Creating a table from aggregate function output


I am using the aggregate function to compute several statistics at once:

temp<-aggregate(AUClast~RIC+STUD, nca_sim[!is.na(nca_sim$RIC),], 
      FUN= function(x) 
        c(N=length(x), 
          Mean=mean(x), 
          SD=sd(x),
          Median=median(x),
          Min=min(x),
          Max=max(x)
          ))

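If I print the output, I see 8 columns (two grouping/ID columns and 6 statistics). But the aggregate output actually contains only 3 columns (the two ID columns plus a single column holding all the statistics). So when I try to assign a name to each column, I get this error message: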

colnames(temp) <-c("Renal Function", "Study", "N", "Mean", "SD", "Median", "Min", "Max")
Error in names(x) <- value : 
  'names' attribute [8] must be the same length as the vector [3]

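So how can I assign 8 names to the 8 columns? To be clear, I need this so that I can pass the result to "kable" and produce a formatted table with the column names I assigned.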

1 Answer

Here is a solution. The key point is that the last column is a matrix column, not several separate vectors, one per statistic computed in aggregate.

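Note that the way the result matrix is extracted is completely general: the matrix column is always the last one, so its column number is always ncol(temp).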
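To illustrate that generality, here is a minimal sketch that wraps the extraction in a function. The name flatten_last is hypothetical (it is not part of base R), and the built-in ToothGrowth data set is used only as a stand-in for other data.

```r
# Hypothetical helper (not a base R function): split the trailing matrix
# column of an aggregate() result into ordinary columns.
flatten_last <- function(agg) {
  nc <- ncol(agg)                    # the matrix column is the last one
  stopifnot(is.matrix(agg[[nc]]))    # guard: fail early if it is not a matrix
  cbind(agg[-nc], agg[[nc]])         # data.frame part + matrix part
}

# Stand-in data: two grouping variables, two statistics
res <- aggregate(len ~ supp + dose, ToothGrowth,
                 FUN = function(x) c(N = length(x), Mean = mean(x)))

flat <- flatten_last(res)
names(flat)   # grouping columns first, then one column per statistic
```

The guard is a design choice: when FUN returns a single value, the last column is a plain vector and no flattening is needed.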

The following post is related:

The difference between [ and [[
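As a quick illustration of why that distinction matters here, the sketch below builds a small made-up data frame with a matrix column (the names toy and stats are placeholders) and compares the two extraction operators.

```r
# Made-up toy data: one ordinary column and one matrix column
toy <- data.frame(g = c("a", "b"))
toy$stats <- matrix(1:4, nrow = 2,
                    dimnames = list(NULL, c("N", "Mean")))

ncol(toy)                     # the matrix counts as a single column

is.data.frame(toy["stats"])   # `[` keeps the data.frame wrapper
is.matrix(toy[["stats"]])     # `[[` returns the matrix itself

# Binding the bare matrix spreads it over one column per statistic
flat <- cbind(toy["g"], toy[["stats"]])
names(flat)
```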
# make up a test data set
df1 <- mtcars[c("mpg", "cyl", "am")]
# it's not strictly needed to coerce the 
# grouping columns to factor
df1$cyl <- factor(df1$cyl)
df1$am <- factor(df1$am)

# exact same code as the question's
temp <- aggregate(mpg ~ cyl + am, df1, 
                  FUN = function(x) 
                    c(N = length(x), 
                      Mean = mean(x), 
                      SD = sd(x),
                      Median = median(x),
                      Min = min(x),
                      Max = max(x)
                    ))
# see the result
temp
#>   cyl am      mpg.N   mpg.Mean     mpg.SD mpg.Median    mpg.Min    mpg.Max
#> 1   4  0  3.0000000 22.9000000  1.4525839 22.8000000 21.5000000 24.4000000
#> 2   6  0  4.0000000 19.1250000  1.6317169 18.6500000 17.8000000 21.4000000
#> 3   8  0 12.0000000 15.0500000  2.7743959 15.2000000 10.4000000 19.2000000
#> 4   4  1  8.0000000 28.0750000  4.4838599 28.8500000 21.4000000 33.9000000
#> 5   6  1  3.0000000 20.5666667  0.7505553 21.0000000 19.7000000 21.0000000
#> 6   8  1  2.0000000 15.4000000  0.5656854 15.4000000 15.0000000 15.8000000

# the error
colnames(temp) <-c("Renal Function", "Study", "N", "Mean", "SD", "Median", "Min", "Max")
#> Error in names(x) <- value: 'names' attribute [8] must be the same length as the vector [3]

# only three columns, the last one is a matrix
# this matrix is the output of the anonymous function
# aggregate applies to the data. 
str(temp)
#> 'data.frame':    6 obs. of  3 variables:
#>  $ cyl: Factor w/ 3 levels "4","6","8": 1 2 3 1 2 3
#>  $ am : Factor w/ 2 levels "0","1": 1 1 1 2 2 2
#>  $ mpg: num [1:6, 1:6] 3 4 12 8 3 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:6] "N" "Mean" "SD" "Median" ...

# three columns
(nc <- ncol(temp))
#> [1] 3

# extract the 3rd column temp[[nc]], not the data.frame temp[nc]
# and bind it with the other columns. It is also important 
# to notice that the method called is cbind.data.frame,
# since temp[-nc] extracts a data.frame. See the link above.
temp <- cbind(temp[-nc], temp[[nc]])
colnames(temp) <-c("Renal Function", "Study", "N", "Mean", "SD", "Median", "Min", "Max")

temp
#>   Renal Function Study  N     Mean        SD Median  Min  Max
#> 1              4     0  3 22.90000 1.4525839  22.80 21.5 24.4
#> 2              6     0  4 19.12500 1.6317169  18.65 17.8 21.4
#> 3              8     0 12 15.05000 2.7743959  15.20 10.4 19.2
#> 4              4     1  8 28.07500 4.4838599  28.85 21.4 33.9
#> 5              6     1  3 20.56667 0.7505553  21.00 19.7 21.0
#> 6              8     1  2 15.40000 0.5656854  15.40 15.0 15.8
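
Since the question's goal is a table formatted with kable, here is one possible way to finish, assuming the knitr package is installed. The block rebuilds the flattened table so that it runs on its own. Supplying the display names through kable's col.names argument, and the choice of digits = 2, are suggestions rather than part of the original answer.

```r
# Rebuild the flattened table from the mtcars example above
temp <- aggregate(mpg ~ cyl + am, mtcars,
                  FUN = function(x)
                    c(N = length(x), Mean = mean(x), SD = sd(x),
                      Median = median(x), Min = min(x), Max = max(x)))
nc <- ncol(temp)
temp <- cbind(temp[-nc], temp[[nc]])

# Display names taken from the question
labels <- c("Renal Function", "Study", "N", "Mean", "SD",
            "Median", "Min", "Max")

# Assumption: knitr is available. col.names only changes the printed
# header, so the data frame keeps its syntactic column names.
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(temp, col.names = labels, digits = 2))
}
```

Keeping the original names in the data frame and relabelling only at print time avoids column names with spaces in later code.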

Created on 2023-05-06 with reprex v2.0.2
