Concatenating strings by group in dplyr

Question   Votes: 3   Answers: 1

I have the following data frame in R:

  Name      Weekday      Block     Count
  ABC_1       1           5B         12
  ABC_1       1           5B         12
  ABC_1       1           5C         10
  ABC_1       1           5B         10
  DER_1       2           5B         10 
  DER_1       2           5C         10 
  DER_1       2           5B         10
  DER_1       2           5C         10

I want the following data frame as output:

  Name      Weekday      Block           5B       5C     Cont            
  ABC_1       1           5B,5B,5C,5B    34       10     12,12,10,10
  DER_1       2           5B,5C,5B,5C    20       20     10,10,10,10

I am using the following code to do this:

 df_new<- df %>% 
 group_by(Weekday,Name) %>% 
 mutate(yard_blocks = paste0(Block, collapse = ",")) %>% 
 as.data.frame()

However, it does not give me the desired output.

r
1 Answer
2 votes

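After grouping by 'Name', 'Weekday' and 'Block', get the frequency as a column ('n') with mutate. Then, grouping by 'Name' and 'Weekday', use mutate to paste the contents of 'Block' into a new column 'Block1', keep only the unique rows (distinct), and spread the result from 'long' to 'wide'.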

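library(dplyr)
library(tidyr)
df %>%
  group_by(Name, Weekday, Block) %>%
  mutate(n = n()) %>%                   # frequency of each Block within Name/Weekday
  group_by(Name, Weekday) %>%
  mutate(Block1 = toString(Block)) %>%  # comma-separated Block values per group
  distinct() %>%                        # drop the now-duplicated rows
  spread(Block, n) %>%                  # long to wide: one column per Block
  rename(Block = Block1)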
# A tibble: 2 x 5
# Groups: Name, Weekday [2]
#    Name  Weekday Block           `5B`  `5C`
#* <chr>   <int> <chr>          <int> <int>
#1 ABC_1       1 5B, 5B, 5C, 5B     3     1
#2 DER_1       2 5B, 5C, 5B, 5C     2     2
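
For readers who want to run this end to end, here is a minimal sketch of the same idea. It is not the answer's own code: it rebuilds the question's data inline (without the Count column, which this first answer does not use), and it swaps in `summarise()`, `count()` and `pivot_wider()` for the mutate/distinct/spread sequence. The names `blocks`, `counts` and `result` are illustrative only.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the question (assumption: Count left out here)
df <- tibble(
  Name    = rep(c("ABC_1", "DER_1"), each = 4),
  Weekday = rep(c(1L, 2L), each = 4),
  Block   = c("5B", "5B", "5C", "5B", "5B", "5C", "5B", "5C")
)

# One row per Name/Weekday holding the comma-separated Block values
blocks <- df %>%
  group_by(Name, Weekday) %>%
  summarise(Block = toString(Block), .groups = "drop")

# Frequency of each Block per Name/Weekday, one column per Block
counts <- df %>%
  count(Name, Weekday, Block) %>%
  pivot_wider(names_from = Block, values_from = n,
              values_fill = list(n = 0L))

# Combine the string column with the per-Block frequencies
result <- left_join(blocks, counts, by = c("Name", "Weekday"))
print(result)
```

Because `summarise()` collapses each group to a single row, no `distinct()` step is needed. That is also why `mutate()` alone, as in the question, still returns all eight rows.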

Update

Based on the updated dataset and question:

df %>%
  group_by(Name, Weekday) %>%
  mutate(Block1 = toString(Block), Cont = toString(Count)) %>%
  group_by(Block, .add = TRUE) %>%  # dplyr >= 1.0.0; older versions use add = TRUE
  mutate(Count = sum(Count)) %>%    # total Count per Name/Weekday/Block
  distinct() %>%
  spread(Block, Count)
# A tibble: 2 x 6
# Groups: Name, Weekday [2]
#   Name  Weekday Block1         Cont            `5B`  `5C`
#*  <chr>   <int> <chr>          <chr>          <int> <int>
#1  ABC_1       1 5B, 5B, 5C, 5B 12, 12, 10, 10    34    10
#2  DER_1       2 5B, 5C, 5B, 5C 10, 10, 10, 10    20    20
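
The updated case can be sketched in a self-contained form as well. Again, this is an alternative under stated assumptions rather than the answer's method: the data frame is rebuilt from the question's table, `summarise()` and `pivot_wider()` stand in for mutate/distinct/spread, and `strings`, `totals` and `result` are hypothetical names.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the question's table
df <- tibble(
  Name    = rep(c("ABC_1", "DER_1"), each = 4),
  Weekday = rep(c(1L, 2L), each = 4),
  Block   = c("5B", "5B", "5C", "5B", "5B", "5C", "5B", "5C"),
  Count   = c(12L, 12L, 10L, 10L, 10L, 10L, 10L, 10L)
)

# Comma-separated Block and Count values, one row per Name/Weekday
strings <- df %>%
  group_by(Name, Weekday) %>%
  summarise(Block1 = toString(Block), Cont = toString(Count),
            .groups = "drop")

# Sum of Count per Name/Weekday/Block, one column per Block
totals <- df %>%
  group_by(Name, Weekday, Block) %>%
  summarise(Count = sum(Count), .groups = "drop") %>%
  pivot_wider(names_from = Block, values_from = Count)

# Combine the string columns with the per-Block totals
result <- left_join(strings, totals, by = c("Name", "Weekday"))
print(result)
```

Keeping the string columns and the per-Block totals in separate steps avoids relying on `distinct()` to collapse rows, at the cost of one join.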