I have a data frame (DF) that lists IDs by date, like this:
Date Ben James
12/10/17 1294 NA
12/11/17 NA 4523
12/12/17 8959 3246
12/13/17 2345 NA
12/14/17 NA NA
12/15/17 0303 8877
12/16/17 NA 1427
The number of "name" columns varies, so on another day I might have a DF like this:
Date Ben James Alex
12/10/17 1294 NA 3754
12/11/17 NA 4523 1122
12/12/17 8959 3246 5582
12/13/17 2345 NA NA
12/14/17 NA NA 0094
12/15/17 0303 8877 NA
12/16/17 NA 1427 NA
I want to put the 3 most recent IDs from each name column into a new data frame, like this:
IDs
8959
2345
0303
3246
8877
1427
1122
5582
0094
I only need the IDs in the new DF. I don't care about labelling them with names or dates.
c(sapply(df[-1], function(x) sprintf("%04d", tail(x[!is.na(x)], 3))))
#[1] "8959" "2345" "0303" "3246" "8877" "1427" "1122" "5582" "0094"
Data
df = structure(list(Date = c("12/10/17", "12/11/17", "12/12/17", "12/13/17",
"12/14/17", "12/15/17", "12/16/17"), Ben = c(1294L, NA, 8959L,
2345L, NA, 303L, NA), James = c(NA, 4523L, 3246L, NA, NA, 8877L,
1427L), Alex = c(3754L, 1122L, 5582L, NA, 94L, NA, NA)), .Names = c("Date",
"Ben", "James", "Alex"), class = "data.frame", row.names = c(NA,
-7L))
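The one-liner above can be unpacked into steps. The sketch below is one possible way to do that, not the answerer's own code: `last_ids` is a hypothetical helper name, and it assumes the date column is called `Date` and that the IDs are stored as integers. It uses `lapply()` plus `unlist()` instead of `sapply()` plus `c()`, because `sapply()` only simplifies to a matrix when every column yields the same number of values; with a column holding fewer than 3 IDs it returns a list, which `c()` does not flatten.

```r
# Sample data from the question, rebuilt so the block runs on its own.
df <- data.frame(
  Date  = c("12/10/17", "12/11/17", "12/12/17", "12/13/17",
            "12/14/17", "12/15/17", "12/16/17"),
  Ben   = c(1294L, NA, 8959L, 2345L, NA, 303L, NA),
  James = c(NA, 4523L, 3246L, NA, NA, 8877L, 1427L),
  Alex  = c(3754L, 1122L, 5582L, NA, 94L, NA, NA)
)

# Hypothetical helper (not part of any package): collect the last `n`
# non-NA IDs of every column except the assumed "Date" column.
last_ids <- function(data, n = 3, id_cols = setdiff(names(data), "Date")) {
  # Step 1: per column, drop NAs and keep the last n values.
  per_col <- lapply(data[id_cols], function(x) tail(x[!is.na(x)], n))
  # Step 2: flatten the list, which also works when columns differ in length.
  ids <- unlist(per_col, use.names = FALSE)
  # Step 3: restore leading zeros lost by integer storage (e.g. 303 -> "0303").
  sprintf("%04d", ids)
}

last_ids(df)
#[1] "8959" "2345" "0303" "3246" "8877" "1427" "1122" "5582" "0094"

# A made-up column with only two IDs still flattens cleanly.
df$Sam <- c(NA, NA, 12L, NA, NA, NA, 7001L)
last_ids(df)
```

Because the helper selects every column other than `Date`, it should cope with a varying number of name columns without changes.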
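# Build one small data frame per name column, then stack them with rbind().
# lapply() over df[-1] avoids the matrix coercion that apply() performs.
# The IDs stay integers here, so 0303 is shown as 303.
res <- do.call(rbind,
  lapply(df[-1], function(x) data.frame(IDs = tail(na.omit(x), 3))))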
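The stacked result above keeps the IDs as integers and carries row names derived from the column names. If the zero-padded form from the question (for example 0303) is wanted in a single-column data frame, a follow-up step along these lines may help. This is a sketch under the assumption that every ID is meant to be four digits wide; the data frame is rebuilt here only so the block runs on its own.

```r
# Sample data from the question.
df <- data.frame(
  Date  = c("12/10/17", "12/11/17", "12/12/17", "12/13/17",
            "12/14/17", "12/15/17", "12/16/17"),
  Ben   = c(1294L, NA, 8959L, 2345L, NA, 303L, NA),
  James = c(NA, 4523L, 3246L, NA, NA, 8877L, 1427L),
  Alex  = c(3754L, 1122L, 5582L, NA, 94L, NA, NA)
)

# Stack the last three non-NA IDs of each name column into one data frame.
res <- do.call(rbind,
  lapply(df[-1], function(x) data.frame(IDs = tail(x[!is.na(x)], 3))))

# Assumption: IDs are four digits wide, so pad with leading zeros.
res$IDs <- sprintf("%04d", res$IDs)

# Drop the "Ben.1", "James.2", ... row names, since labels are not needed.
rownames(res) <- NULL

res
```

If the IDs can have different widths, reading them as character from the start would avoid the need to guess a padding width.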
Here is an option using the tidyverse:
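library(tidyverse)
# funs() and summarise_at() are deprecated/superseded in current dplyr,
# so this uses summarise(across(...)) (dplyr >= 1.0.0) instead.
df %>%
  # Per name column, keep the last 3 non-NA IDs as a list element.
  summarise(across(-Date, ~ list(tail(.x[!is.na(.x)], 3)))) %>%
  unlist(use.names = FALSE) %>%
  # pad must be a single character, so "0" rather than the number 0.
  str_pad(width = 4, pad = "0")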
#[1] "8959" "2345" "0303" "3246" "8877" "1427" "1122" "5582" "0094"