R: index, filter, then match on multiple conditions. Easy in Excel, not so easy in R


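I would like to extract the "untreated control" value for each variable and Weed_type. In Excel it is easy to look up the value in the filtered rows and return the "untreated control" result for that weed type. I then want to calculate the percentage control, i.e. (untreated - predicted value) / untreated * 100. Since I have a large number of weed types, are there any suggestions for streamlining this process?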


df <- data.frame(Weed_type = c("weed1", "weed1", "weed1", "weed2", "weed2", "weed2", "weed3", "weed3", "weed3"),
                 Treatment = c("untreated control", "Treatment1", "Treatment2", "untreated control", "Treatment1", "Treatment2", "untreated control", "Treatment1", "Treatment2"),
                 predicted.value = c(23.3, 0.4, 0,  .9, .15, .01, 87, 12,2)
)
df

I am new to R and used the following long-winded workaround:

Extract the untreated value for each weed type:

library(dplyr)  # provides %>%, filter(), pull(), mutate() and case_when()

weed1_c <- df %>% filter(Weed_type == 'weed1' & Treatment == "untreated control")  %>% pull(predicted.value)
weed2_c <- df %>% filter(Weed_type == 'weed2' & Treatment == "untreated control")  %>% pull(predicted.value)
weed3_c <- df %>% filter(Weed_type == 'weed3' & Treatment == "untreated control")  %>% pull(predicted.value)

Add these values back to the data frame:

df <- df %>% 
      mutate(untreated = case_when(
        Weed_type=="weed1" ~   weed1_c,
        Weed_type=="weed2" ~   weed2_c,
        Weed_type=="weed3" ~   weed3_c,
      ))

Calculate the percentage control:

df$percentage_control <- (df$untreated  - df$predicted.value) / df$untreated *100
df  

I am hoping there is a simpler way to get this result!

r dplyr filter match mutate
1 Answer

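Here are two possible options for getting the result you want without storing the untreated value of each weed type in a separate variable. The first uses a grouped mutate(), the second a join: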

library(dplyr, warn.conflicts = FALSE)

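# Option 1: grouped mutate. Within each Weed_type, pick the control row's
# value; the single value is recycled to every row of that group.
df |> 
  mutate(
    untreated = predicted.value[Treatment == "untreated control"],
    .by = Weed_type
  ) |> 
  mutate(
    percentage_control = (untreated  - predicted.value) / untreated * 100
  )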
#>   Weed_type         Treatment predicted.value untreated percentage_control
#> 1     weed1 untreated control           23.30      23.3            0.00000
#> 2     weed1        Treatment1            0.40      23.3           98.28326
#> 3     weed1        Treatment2            0.00      23.3          100.00000
#> 4     weed2 untreated control            0.90       0.9            0.00000
#> 5     weed2        Treatment1            0.15       0.9           83.33333
#> 6     weed2        Treatment2            0.01       0.9           98.88889
#> 7     weed3 untreated control           87.00      87.0            0.00000
#> 8     weed3        Treatment1           12.00      87.0           86.20690
#> 9     weed3        Treatment2            2.00      87.0           97.70115
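
Note that the `.by` argument is only available in dplyr 1.1.0 and later. On an older version, a roughly equivalent sketch uses `group_by()` and `ungroup()` instead. The `control_counts` check is an addition of mine, not part of the approach above: it assumes every weed type should have exactly one "untreated control" row and stops early if that is not the case, because a missing or duplicated control row would otherwise make the grouped lookup fail.

```r
library(dplyr, warn.conflicts = FALSE)

# Same example data as in the question
df <- data.frame(
  Weed_type = rep(c("weed1", "weed2", "weed3"), each = 3),
  Treatment = rep(c("untreated control", "Treatment1", "Treatment2"), times = 3),
  predicted.value = c(23.3, 0.4, 0, 0.9, 0.15, 0.01, 87, 12, 2)
)

# Assumption: exactly one control row per weed type. Count them and stop
# early if any weed type has zero or several.
control_counts <- df |>
  group_by(Weed_type) |>
  summarise(n_control = sum(Treatment == "untreated control"))
stopifnot(all(control_counts$n_control == 1))

# group_by()/ungroup() play the role of .by for dplyr versions before 1.1.0
result <- df |>
  group_by(Weed_type) |>
  mutate(untreated = predicted.value[Treatment == "untreated control"]) |>
  ungroup() |>
  mutate(percentage_control = (untreated - predicted.value) / untreated * 100)

result
```

Unlike `.by`, `group_by()` leaves the data grouped until `ungroup()` is called, which is why the explicit `ungroup()` comes before the final `mutate()`.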


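# Option 2: build a lookup table of control values and join it back to df.
# Because df is passed explicitly as x, the piped lookup table becomes y.
df |> 
  filter(
    Treatment == "untreated control"
  ) |> 
  select(Weed_type, untreated = predicted.value) |> 
  right_join(x = df, by = "Weed_type") |> 
  mutate(
    percentage_control = (untreated  - predicted.value) / untreated * 100
  )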
#>   Weed_type         Treatment predicted.value untreated percentage_control
#> 1     weed1 untreated control           23.30      23.3            0.00000
#> 2     weed1        Treatment1            0.40      23.3           98.28326
#> 3     weed1        Treatment2            0.00      23.3          100.00000
#> 4     weed2 untreated control            0.90       0.9            0.00000
#> 5     weed2        Treatment1            0.15       0.9           83.33333
#> 6     weed2        Treatment2            0.01       0.9           98.88889
#> 7     weed3 untreated control           87.00      87.0            0.00000
#> 8     weed3        Treatment1           12.00      87.0           86.20690
#> 9     weed3        Treatment2            2.00      87.0           97.70115
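
For readers coming from Excel, the closest analogue to an INDEX/MATCH lookup is probably base R's `match()`. The following is a minimal sketch of that idea without dplyr; the names `control` and `idx` are introduced here purely for illustration.

```r
# Same example data as in the question
df <- data.frame(
  Weed_type = rep(c("weed1", "weed2", "weed3"), each = 3),
  Treatment = rep(c("untreated control", "Treatment1", "Treatment2"), times = 3),
  predicted.value = c(23.3, 0.4, 0, 0.9, 0.15, 0.01, 87, 12, 2)
)

# Lookup table: one row per weed type holding its untreated control value
control <- df[df$Treatment == "untreated control", c("Weed_type", "predicted.value")]

# For every row of df, find the position of its weed type in the lookup table
idx <- match(df$Weed_type, control$Weed_type)

# Use those positions to pull the matching control value, then compute the
# percentage control
df$untreated <- control$predicted.value[idx]
df$percentage_control <- (df$untreated - df$predicted.value) / df$untreated * 100

df
```

A weed type with no control row gets `NA` from `match()`, so its `untreated` and `percentage_control` values become `NA` rather than raising an error. If a weed type had several control rows, `match()` would silently use the first one.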