I compute a short summary table like this:
library(data.table)
library(tidyverse)
mtcars.dt <- data.table(mtcars)
mtcars_grby <- mtcars.dt %>%
  # group by cyl
  group_by(cyl) %>%
  # summarise
  summarise(
    # number of observations
    quantity = n(),
    # average mpg
    mpg_avg = mean(mpg, na.rm = TRUE),
    # min and max weight
    wt_min = min(wt),
    wt_max = max(wt))
mtcars_grby
# A tibble: 3 x 5
    cyl quantity mpg_avg wt_min wt_max
  <dbl>    <int>   <dbl>  <dbl>  <dbl>
1     4       11    26.7   1.51   3.19
2     6        7    19.7   2.62   3.46
3     8       14    15.1   3.17   5.42
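What is the shortest way to do this using only data.table? The only approach I can think of is merging four separate aggregations, which is not elegant at all.
Use .(...), keyby= and .N like this: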
library(data.table)
mtcars.dt <- data.table(mtcars)
mtcars.dt[, .(quantity = .N,
              mpg_avg = mean(mpg, na.rm = TRUE),
              wt_min = min(wt),
              wt_max = max(wt)),
          keyby = cyl]
which gives:
Key: <cyl>
     cyl quantity  mpg_avg wt_min wt_max
   <num>    <int>    <num>  <num>  <num>
1:     4       11 26.66364  1.513  3.190
2:     6        7 19.74286  2.620  3.460
3:     8       14 15.10000  3.170  5.424
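As a small aside on why keyby= is used here rather than by=, the sketch below runs the same aggregation both ways. The names by_res and keyby_res are only illustrative and do not come from the question. As far as I understand it, by= returns the groups in order of first appearance in the data, while keyby= sorts the result by the grouping column and sets it as the key.

```r
library(data.table)

mtcars.dt <- data.table(mtcars)

# by=: groups come back in the order they first appear in the data
by_res <- mtcars.dt[, .(quantity = .N,
                        mpg_avg = mean(mpg, na.rm = TRUE)),
                    by = cyl]

# keyby=: same aggregation, but the result is sorted by cyl and keyed on it
keyby_res <- mtcars.dt[, .(quantity = .N,
                           mpg_avg = mean(mpg, na.rm = TRUE)),
                       keyby = cyl]

by_res$cyl      # 6 4 8 (order of first appearance in mtcars)
keyby_res$cyl   # 4 6 8 (sorted)
key(by_res)     # NULL
key(keyby_res)  # "cyl"
```

If the row order of the summary does not matter, by= should be enough; keyby= is mainly convenient when the output should match the sorted layout that the dplyr version prints.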