I have the following data frame in R:
Name Weekday Block Count
ABC_1 1 5B 12
ABC_1 1 5B 12
ABC_1 1 5C 10
ABC_1 1 5B 10
DER_1 2 5B 10
DER_1 2 5C 10
DER_1 2 5B 10
DER_1 2 5C 10
I want the following data frame as output:
Name Weekday Block 5B 5C Cont
ABC_1 1 5B,5B,5C,5B 34 10 12,12,10,10
DER_1 2 5B,5C,5B,5C 20 20 10,10,10,10
I am using the following code to do this:
df_new<- df %>%
group_by(Weekday,Name) %>%
mutate(yard_blocks = paste0(Block, collapse = ",")) %>%
as.data.frame()
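However, it does not give me the desired output.
After grouping by 'Name', 'Weekday', and 'Block', add the group frequency as a column ('n'). Then, grouping by 'Name' and 'Weekday', use mutate to paste the contents of 'Block' into a new column 'Block1' (toString does this here), keep the unique rows (distinct), and spread from 'long' to 'wide' format.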
library(dplyr)
library(tidyr)
df %>%
group_by(Name, Weekday, Block) %>%
mutate(n = n()) %>%
group_by(Name, Weekday) %>%
mutate(Block1 = toString(Block)) %>%
distinct %>%
spread(Block, n) %>%
rename(Block = Block1)
# A tibble: 2 x 5
# Groups: Name, Weekday [2]
# Name Weekday Block `5B` `5C`
#* <chr> <int> <chr> <int> <int>
#1 ABC_1 1 5B, 5B, 5C, 5B 3 1
#2 DER_1 2 5B, 5C, 5B, 5C 2 2
Based on the updated dataset and question:
df %>%
group_by(Name, Weekday) %>%
mutate(Block1 = toString(Block), Cont = toString(Count)) %>%
group_by(Block, .add = TRUE) %>% # dplyr >= 1.0.0; older versions use add = TRUE
mutate(Count = sum(Count)) %>%
distinct %>%
spread(Block, Count)
# A tibble: 2 x 6
# Groups: Name, Weekday [2]
# Name Weekday Block1 Cont `5B` `5C`
#* <chr> <int> <chr> <chr> <int> <int>
#1 ABC_1 1 5B, 5B, 5C, 5B 12, 12, 10, 10 34 10
#2 DER_1 2 5B, 5C, 5B, 5C 10, 10, 10, 10 20 20
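For readers on newer package versions, the same reshaping can probably be written with pivot_wider, which superseded spread in tidyr 1.0.0. The following is only a sketch, assuming dplyr >= 1.0.0 (for the .groups argument) and tidyr >= 1.0.0. The data frame is rebuilt from the question's sample so the block runs on its own, and the name result is an illustrative choice, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the question (assumed column types)
df <- data.frame(
  Name    = rep(c("ABC_1", "DER_1"), each = 4),
  Weekday = rep(c(1L, 2L), each = 4),
  Block   = c("5B", "5B", "5C", "5B", "5B", "5C", "5B", "5C"),
  Count   = c(12L, 12L, 10L, 10L, 10L, 10L, 10L, 10L),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(Name, Weekday) %>%
  # Collapse all blocks and counts of each Name/Weekday group into strings
  mutate(Block1 = toString(Block), Cont = toString(Count)) %>%
  # Sum Count per block while keeping the collapsed strings as grouping keys
  group_by(Name, Weekday, Block1, Cont, Block) %>%
  summarise(Count = sum(Count), .groups = "drop") %>%
  # One column per block, filled with the summed counts
  pivot_wider(names_from = Block, values_from = Count)

print(result)
```

Using summarise instead of mutate followed by distinct collapses each group to one row directly, so no duplicate rows have to be removed before widening. With the sample data, the 5B column should hold 34 and 20, and the 5C column 10 and 20.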