Select the x highest values per group using data.table [duplicate]

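How can I select the x highest values for each group in a data.table?

For example, I want to take the two highest values (Val) for each group (Date). So for this dataset: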

Date    Name    Val
01/01/2010  A   3
01/01/2010  B   2
01/01/2010  C   1
02/01/2010  A   4
02/01/2010  B   2
02/01/2010  C   3
02/01/2010  D   1

The code should return:

Date    Name    Val
01/01/2010  A   3
01/01/2010  B   2
02/01/2010  A   4
02/01/2010  C   3
Tags: r, data.table

1 Answer (1 vote)

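library(data.table)

df <- read.table(text = "Date    Name    Val
01/01/2010  A   3
01/01/2010  B   2
01/01/2010  C   1
02/01/2010  A   4
02/01/2010  B   2
02/01/2010  C   3
02/01/2010  D   1",
                 header = TRUE, stringsAsFactors = FALSE)

setDT(df)  # convert the data.frame to a data.table by reference
# Highest value within each Date group
df[, max_val := max(Val), by = Date]
# Second-highest value within each group. sort() returns the values,
# whereas order() returns row positions and only matches here by coincidence.
df[, max_sec := sort(Val, decreasing = TRUE)[2], by = Date]
# Keep rows equal to either value (ties can yield more than two rows per group)
df <- df[Val == max_val | Val == max_sec, ]
# Drop the helper columns
df[, c("max_val", "max_sec") := NULL]
df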

         Date Name Val
1: 01/01/2010    A   3
2: 01/01/2010    B   2
3: 02/01/2010    A   4
4: 02/01/2010    C   3
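
The answer above is tied to the top two values. As a minimal sketch of a more general variant, the following uses the common data.table idiom of sorting in `i` and taking `head(.SD, n)` per group. The names `dt`, `n_top` and `top_rows` are made up for this illustration and do not come from the original answer.

```r
library(data.table)

# Same sample data as in the question, built directly as a data.table
dt <- data.table(
  Date = c(rep("01/01/2010", 3), rep("02/01/2010", 4)),
  Name = c("A", "B", "C", "A", "B", "C", "D"),
  Val  = c(3L, 2L, 1L, 4L, 2L, 3L, 1L)
)

n_top <- 2  # hypothetical parameter: how many rows to keep per group

# Sort by Date, then by Val descending, and keep the first n_top rows of each group
top_rows <- dt[order(Date, -Val), head(.SD, n_top), by = Date]
print(top_rows)
```

Unlike the filter on `max_val` and `max_sec`, this always returns at most `n_top` rows per group, so tied values are cut off rather than all being kept. Whether that is the desired behavior depends on how ties should be handled.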