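I am trying to write code that creates a TRUE or FALSE variable within each name group, depending on the value of the earliest record of the poped column in the following data.frame: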
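library(tidyverse)
name <- c("AAA", "AAA", "AAA", "AAA", "AAA", "AAA", "AAA")
poped <- c(NA, 1, NA, NA, 1, NA, NA)
order <- c(1:7)
tag <- c("X", "Y", "X", "X", "Y", "X", "X")
# Assemble the vectors into the data.frame printed below
df <- data.frame(name, order, tag, poped)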
> df
name order tag poped
1 AAA 1 X NA
2 AAA 2 Y 1
3 AAA 3 X NA
4 AAA 4 X NA
5 AAA 5 Y 1
6 AAA 6 X NA
7 AAA 7 X NA
I want to add two new variables, named CHECK and POS.
CHECK will take these values:
1 = if the closest row above with tag = Y has poped = 1
0 = if the closest row above with tag = Y has poped = 0
2 = if the current row has tag = Y
NA = otherwise
POS will take the row number of the closest row above where the tag column is Y and poped is 1, and NA otherwise.
My desired output is:
> df
name order tag poped CHECK POS why
1 AAA 1 X NA NA NA There is no previous data
2 AAA 2 Y 1 NA NA current tag = Y
3 AAA 3 X NA 1 2 the closest value above where tag=Y is in row 2 and poped is 1
4 AAA 4 X NA 1 2 the closest value above where tag=Y is in row 2 and poped is 1
5 AAA 5 Y 1 NA NA current tag = Y
6 AAA 6 X NA 1 5 the closest value above where tag=Y is in row 5 and poped is 1
7 AAA 7 X NA 1 5 the closest value above where tag=Y is in row 5 and poped is 1
How can I build a solution, preferably using the tidyverse?
df %>%
  # Keep tag, poped and the row number only on the Y rows; everything else becomes NA
  mutate(ctag = if_else(tag == "Y", tag, NA_character_),
         cpop = if_else(tag == "Y", poped, NA_real_),
         maxr = if_else(tag == "Y" & poped == 1, order, NA_integer_)) %>%
  # Carry the most recent Y-row values down to the rows below
  fill(ctag, cpop, maxr) %>%
  mutate(
    CHECK = case_when(
      tag == "Y" ~ 2,
      lag(ctag) == "Y" & lag(cpop) == 1 ~ 1,
      lag(ctag) == "Y" & lag(cpop) == 0 ~ 0,
      TRUE ~ NA_real_),
    POS = if_else(tag == "Y", NA_integer_, maxr)
  ) %>%
  # Drop the helper columns
  select(!ctag:maxr)
Output:
name order tag poped CHECK POS
<chr> <int> <chr> <dbl> <dbl> <int>
1 AAA 1 X NA NA NA
2 AAA 2 Y 1 2 NA
3 AAA 3 X NA 1 2
4 AAA 4 X NA 1 2
5 AAA 5 Y 1 2 NA
6 AAA 6 X NA 1 5
7 AAA 7 X NA 1 5
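The question asks for the calculation to run within each name group, and the sample data has only one group. The following is a minimal sketch of how the same fill-down idea might be applied per group. It is a variant of the code above, not the same code: it adds group_by(name) and drops the ctag helper and the lag() calls. The second group "BBB", the object names df2 and res, and the assumption that order restarts at 1 inside each group are all made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical two-group data; "BBB" and its values are invented for illustration.
df2 <- tibble(
  name  = c(rep("AAA", 4), rep("BBB", 3)),
  order = c(1:4, 1:3),                      # assumed: row number within each group
  tag   = c("X", "Y", "X", "X", "X", "Y", "X"),
  poped = c(NA, 1, NA, NA, NA, 0, NA)
)

res <- df2 %>%
  group_by(name) %>%
  mutate(
    # poped of the Y rows only, and the position of Y rows whose poped is 1
    cpop = if_else(tag == "Y", poped, NA_real_),
    maxr = if_else(tag == "Y" & poped == 1, order, NA_integer_)
  ) %>%
  # fill() works within each group, so values are not carried across names
  fill(cpop, maxr) %>%
  mutate(
    CHECK = case_when(
      tag == "Y" ~ 2,
      cpop == 1  ~ 1,
      cpop == 0  ~ 0,
      TRUE       ~ NA_real_
    ),
    POS = if_else(tag == "Y", NA_integer_, maxr)
  ) %>%
  ungroup() %>%
  select(-cpop, -maxr)
```

Because the data is grouped before fill(), the first row of "BBB" does not inherit the Y row from "AAA", so its CHECK and POS stay NA.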