I have a data frame like this:
teammember <- c('Member A', 'Member B', 'Member C')
value_a <- c('success', 'fail', NA)
value_b <- c('fail', NA, 'success')
value_c <- c('success', NA, 'fail')
data_df <- data.frame(teammember, value_a, value_b, value_c)
Now I want to count every "success", grouped by team member. My idea was this:
data_df %>%
group_by(teammember) %>%
filter(value_a == "success" | value_b == "success" | value_c == "success") %>%
summarise(sales = length(value_a) , length(value_b) , length(value_c)) %>%
select(teammember, sales)
My result looks like this:
# A tibble: 2 x 2
teammember sales
<fct> <int>
1 Member A 1
2 Member C 1
But it should look like this:
# A tibble: 2 x 2
teammember sales
<fct> <int>
1 Member A 2
2 Member C 1
Can you tell me what the correct solution should look like? :)
Thanks in advance for your help.
Konstantin
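Your filter keeps every row that contains at least one 'success', and summarise then only returns the length of each column within the group, which is the number of rows (1 per team member here). What you want instead is the number of 'success' values in each row. You can try rowSums: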
res <- data.frame(
  teammember = data_df$teammember,
  # compare every value_* cell with 'success' and count the TRUEs per row; NA cells are ignored
  sales = rowSums(data_df[, paste0('value_', letters[1:3])] == 'success', na.rm = TRUE)
)
# teammember sales
# 1 Member A 2
# 2 Member B 0
# 3 Member C 1
You can drop the rows with a zero count using res <- res[res$sales > 0, ].
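To make the rowSums step easier to follow, here is a minimal sketch that splits it into its intermediate pieces. The names value_cols, is_success and sales are only illustrative, and the data frame is rebuilt with stringsAsFactors = FALSE so that the block runs on its own.

```r
# Rebuild the example data (character columns, not factors)
data_df <- data.frame(
  teammember = c('Member A', 'Member B', 'Member C'),
  value_a = c('success', 'fail', NA),
  value_b = c('fail', NA, 'success'),
  value_c = c('success', NA, 'fail'),
  stringsAsFactors = FALSE
)

# Names of the columns to inspect: "value_a" "value_b" "value_c"
value_cols <- paste0('value_', letters[1:3])

# Comparing a data frame with a string gives a logical matrix;
# cells that were NA in the data stay NA in the comparison
is_success <- data_df[, value_cols] == 'success'

# Each TRUE counts as 1; na.rm = TRUE skips the NA cells
sales <- rowSums(is_success, na.rm = TRUE)
unname(sales)
# 2 0 1
```

Without na.rm = TRUE, every row that contains an NA would get an NA count, which is why the argument matters for this data.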
One option is to use filter_at at the start, then gather the 'value_' columns into 'long' format, filter the 'val' column for the "success" string, and get the count:
library(dplyr)
library(tidyr)
data_df %>%
filter_at(vars(matches("value")), any_vars(. %in% 'success')) %>%
gather(var, val, value_a:value_c, na.rm = TRUE) %>%
filter(val =='success') %>%
count(teammember)
# A tibble: 2 x 2
# teammember n
# <fctr> <int>
#1 Member A 2
#2 Member C 1
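In more recent tidyr releases, gather has been superseded by pivot_longer, so the same reshaping idea can be sketched as below. This assumes tidyr 1.0.0 or later and dplyr 0.8.0 or later (for the name argument of count); the result name res_long is only illustrative. The initial filter_at step is left out here because the later filter on val already drops team members without any success.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data (character columns, not factors)
data_df <- data.frame(
  teammember = c('Member A', 'Member B', 'Member C'),
  value_a = c('success', 'fail', NA),
  value_b = c('fail', NA, 'success'),
  value_c = c('success', NA, 'fail'),
  stringsAsFactors = FALSE
)

res_long <- data_df %>%
  # one row per (teammember, value_* column) pair, NA cells dropped
  pivot_longer(starts_with("value_"),
               names_to = "var", values_to = "val",
               values_drop_na = TRUE) %>%
  # keep only the successes
  filter(val == "success") %>%
  # count the remaining rows per team member and call the column 'sales'
  count(teammember, name = "sales")

res_long
```

For the example data this should give one row for Member A with sales 2 and one row for Member C with sales 1.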
Another option is to nest the data and then use map to get the counts:
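library(purrr)
data_df %>%
  # one row per team member, with the value_* columns packed into a list column 'data'
  nest(data = -teammember) %>%
  # map_int returns an integer vector instead of a list, so the filter below compares plain numbers
  transmute(teammember, sales = map_int(data, ~ sum(unlist(.x) == "success", na.rm = TRUE))) %>%
  filter(sales != 0)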
# teammember sales
#1 Member A 2
#2 Member C 1
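If you would rather stay in a single dplyr pipeline, the row-wise counting idea with rowSums can also be sketched with across. This is only one possible variant and assumes dplyr 1.0.0 or later; the result name res_across is illustrative.

```r
library(dplyr)

# Rebuild the example data (character columns, not factors)
data_df <- data.frame(
  teammember = c('Member A', 'Member B', 'Member C'),
  value_a = c('success', 'fail', NA),
  value_b = c('fail', NA, 'success'),
  value_c = c('success', NA, 'fail'),
  stringsAsFactors = FALSE
)

res_across <- data_df %>%
  # across() selects the value_* columns; comparing them with "success"
  # gives a logical matrix whose TRUEs are summed per row
  mutate(sales = rowSums(across(starts_with("value_")) == "success", na.rm = TRUE)) %>%
  # drop team members without any success
  filter(sales > 0) %>%
  select(teammember, sales)

res_across
```

This keeps one row per team member throughout, so no reshaping or nesting is needed; the trade-off is that sales comes back as a double rather than an integer.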