I have a data frame df1:
# Empty 37 x 3 data frame with the three column names
df1 <- setNames(data.frame(matrix(ncol = 3, nrow = 37)),
                c("material", "condition", "pID"))

df1$material <- c(
  "FBZOIKS", "FBZOIKS", "FBZOIKS", "FBZOIKS",
  "VNTYALQ", "VNTYALQ", "VNTYALQ",
  "HMRCJXU", "HMRCJXU", "HMRCJXU", "HMRCJXU", "HMRCJXU",
  "CURHJXM",
  "UXJMRCH", "UXJMRCH",
  "XMRCUJH", "XMRCUJH", "XMRCUJH",
  "FBZOIKS", "FBZOIKS", "FBZOIKS", "FBZOIKS",
  "VNTYALQ", "VNTYALQ", "VNTYALQ", "VNTYALQ",
  "HMRCJXU", "HMRCJXU", "HMRCJXU", "HMRCJXU",
  "CURHJXM", "CURHJXM",
  "UXJMRCH", "UXJMRCH",
  "XMRCUJH", "XMRCUJH", "XMRCUJH"
)

# "false" marks the first row of each block; every other row is empty
df1$condition <- c(
  "false", "", "", "", "", "", "", "", "", "", "", "",
  "false", "", "", "", "", "",
  "false", "", "", "", "", "", "", "", "", "", "", "",
  "false", "", "", "", "", "", ""
)

# 18 rows for participant p1, followed by 19 rows for p2
df1$pID <- rep(c("p1", "p2"), times = c(18, 19))
I need to create two columns, computed per pID: block and Nletters_block.
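For block, I need to find the first "false" in the condition column and assign the value 1 to every row until the next "false" for that pID. From that next "false" onward the value is 2, from the one after that it is 3, and so on.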
For Nletters_block, I need to count the number of unique letters contained in the material column for each participant and block.
I would prefer a solution that uses the dplyr library.
This is the result I would like to obtain:
material  condition  pID  block  Nletters_block
FBZOIKS   false      p1   1      21
FBZOIKS              p1   1      21
FBZOIKS              p1   1      21
FBZOIKS              p1   1      21
VNTYALQ              p1   1      21
VNTYALQ              p1   1      21
VNTYALQ              p1   1      21
HMRCJXU              p1   1      21
HMRCJXU              p1   1      21
HMRCJXU              p1   1      21
HMRCJXU              p1   1      21
HMRCJXU              p1   1      21
CURHJXM   false      p1   2      7
UXJMRCH              p1   2      7
UXJMRCH              p1   2      7
XMRCUJH              p1   2      7
XMRCUJH              p1   2      7
XMRCUJH              p1   2      7
FBZOIKS   false      p2   1      21
FBZOIKS              p2   1      21
FBZOIKS              p2   1      21
FBZOIKS              p2   1      21
VNTYALQ              p2   1      21
VNTYALQ              p2   1      21
VNTYALQ              p2   1      21
VNTYALQ              p2   1      21
HMRCJXU              p2   1      21
HMRCJXU              p2   1      21
HMRCJXU              p2   1      21
HMRCJXU              p2   1      21
CURHJXM   false      p2   2      7
CURHJXM              p2   2      7
UXJMRCH              p2   2      7
UXJMRCH              p2   2      7
XMRCUJH              p2   2      7
XMRCUJH              p2   2      7
XMRCUJH              p2   2      7
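Group by pID first to build block, then by pID and block together. The letters are counted by pasting all material strings of a group into one string and then splitting that string into single characters.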
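As a rough illustration of these two building blocks in isolation, here is a small sketch on toy vectors. The vectors cond and mats are made up for this example and are not part of df1. cumsum() on a logical vector goes up by one at every TRUE, and pasting before splitting gives a single pool of characters to deduplicate.

```r
# Hypothetical toy inputs, for illustration only
cond <- c("false", "", "", "false", "")
mats <- c("CURHJXM", "UXJMRCH")

# TRUE counts as 1, so the running sum increases at every "false"
block <- cumsum(cond == "false")
block
# 1 1 1 2 2

# Collapse into one string, split into single characters, keep distinct ones
letters_pool <- unlist(strsplit(paste(mats, collapse = ""), ""))
n_unique <- length(unique(letters_pool))
n_unique
# 7
```

Both strings in mats are permutations of the same seven letters, which is why the count stays at 7.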
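library(dplyr)  # the .by argument needs dplyr 1.1.0 or later

df1 %>%
  # a new block starts at every "false" within a participant
  mutate(block = cumsum(condition == "false"), .by = pID) %>%
  # pool all materials of a block, split into letters, count the distinct ones
  mutate(Nletters_block = length(unique(unlist(strsplit(
    paste(material, collapse = ""), "")))), .by = c(pID, block))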
   material condition pID block Nletters_block
1   FBZOIKS     false  p1     1             21
2   FBZOIKS            p1     1             21
3   FBZOIKS            p1     1             21
4   FBZOIKS            p1     1             21
5   VNTYALQ            p1     1             21
6   VNTYALQ            p1     1             21
7   VNTYALQ            p1     1             21
8   HMRCJXU            p1     1             21
9   HMRCJXU            p1     1             21
10  HMRCJXU            p1     1             21
11  HMRCJXU            p1     1             21
12  HMRCJXU            p1     1             21
13  CURHJXM     false  p1     2              7
14  UXJMRCH            p1     2              7
15  UXJMRCH            p1     2              7
16  XMRCUJH            p1     2              7
17  XMRCUJH            p1     2              7
18  XMRCUJH            p1     2              7
19  FBZOIKS     false  p2     1             21
20  FBZOIKS            p2     1             21
21  FBZOIKS            p2     1             21
22  FBZOIKS            p2     1             21
23  VNTYALQ            p2     1             21
24  VNTYALQ            p2     1             21
25  VNTYALQ            p2     1             21
26  VNTYALQ            p2     1             21
27  HMRCJXU            p2     1             21
28  HMRCJXU            p2     1             21
29  HMRCJXU            p2     1             21
30  HMRCJXU            p2     1             21
31  CURHJXM     false  p2     2              7
32  CURHJXM            p2     2              7
33  UXJMRCH            p2     2              7
34  UXJMRCH            p2     2              7
35  XMRCUJH            p2     2              7
36  XMRCUJH            p2     2              7
37  XMRCUJH            p2     2              7
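If the installed dplyr is older than 1.1.0, the .by argument is not available. Below is a sketch of the same idea written with group_by(), not a replacement for the code above. The data frame toy is a made-up miniature, not df1. The trimws() step is an added precaution, assuming the real data may carry stray spaces such as " false" or " p1", which would otherwise break both the comparison and the grouping. n_distinct() is swapped in for length(unique(...)).

```r
library(dplyr)

# Hypothetical miniature of df1, with stray spaces to show why trimws() helps
toy <- data.frame(
  material  = c("ABC", "ABD", "XYZ", "XYZ", "ABC"),
  condition = c("false", " ", " false", "", "false"),
  pID       = c("p1", " p1", "p1", "p1", "p2")
)

res <- toy %>%
  # normalise whitespace so grouping and the comparison behave as intended
  mutate(pID = trimws(pID), condition = trimws(condition)) %>%
  # running count of "false" rows within each participant
  group_by(pID) %>%
  mutate(block = cumsum(condition == "false")) %>%
  # distinct letters over all materials of one participant and block
  group_by(pID, block) %>%
  mutate(Nletters_block = n_distinct(unlist(strsplit(material, "")))) %>%
  ungroup()

res$block
# 1 1 2 2 1
res$Nletters_block
# 4 4 3 3 3
```

Splitting each material string directly and unlisting gives the same pool of characters as pasting first, so the paste() step can be dropped in this variant.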