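This is my first question on Stack Overflow. I have done some research but have not got where I want to be yet. I am trying to compute two different means per subject in R, where the two means are distinguished by a variable. I created this example dataset: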
airquality <- data.frame(subject = c("CityA", "CityA","CityA","CityA", "CityA", "CityA", "CityA", "CityA",
"CityB","CityB","CityB", "CityB", "CityB", "CityB", "CityB", "CityB",
"CityC", "CityC", "CityC", "CityC", "CityC", "CityC", "CityC", "CityC"),
code = c("AA_high", "BB_high", "AA_low", "BB_low", "AB_high", "AB_low", "BA_high", "BA_low",
"AA_high", "BB_high", "AA_low", "BB_low", "AB_high", "AB_low", "BA_high", "BA_low",
"AA_high", "BB_high", "AA_low", "BB_low", "AB_high", "AB_low", "BA_high", "BA_low"),
measure = c("2", "6", "5", "3", "5", "5", "4", "6",
"7", "8", "6", "7", "4", "12", "9", "7",
"8", "12", "11", "9", "15", "11", "10", "16"))
print(airquality)
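The goal is to have two means per subject. The first mean should be the mean of measure over all rows whose code starts with "AA" or "BB". The second mean per subject should be the mean of measure over all rows whose code starts with "AB" or "BA".
I started with this: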
df1 <- filter(airquality, substr(code, 1, 2) == "AA" | substr(code, 1,2) == "BB")
df1 <- df1 %>%
group_by(subject) %>%
mutate(mean_AABB = mean(measure))
View(df1)
df2 <- filter(airquality, substr(code, 1, 2) == "AB" | substr(code, 1,2) == "BA")
df2 <- df2 %>%
group_by(subject) %>%
mutate(mean_ABBA = mean(measure))
View(df2)
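But this does not seem to work, and I need the means as columns in the data frame so that I can do further calculations with them. So the next step would be to add both kinds of means to the data frame.
I would really appreciate your help! Many thanks, Michael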
How about:
library(dplyr)
airquality |>
group_by(subject, first_two_equal = substr(code, 1, 1) == substr(code, 2, 2)) |>
mutate(measure = as.numeric(measure), ## !
mean = mean(measure)
)
(Note that measure in the example data is of mode "character", so you need to convert it to "numeric".)
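If the two means are wanted as separate columns on every row, as the question describes, one possible variation is sketched below. It is only a sketch under a few assumptions: the data frame is rebuilt with rep() under the hypothetical name aq (so the built-in airquality dataset is not masked), the column names mean_AABB and mean_ABBA are taken from the attempt in the question, the helper columns is_AABB and is_ABBA are made up for illustration, and the native pipe needs R 4.1 or later.

```r
library(dplyr)

# Same values as the example data in the question, built with rep();
# `aq` is a hypothetical name. `measure` is character on purpose.
aq <- data.frame(
  subject = rep(c("CityA", "CityB", "CityC"), each = 8),
  code    = rep(c("AA_high", "BB_high", "AA_low", "BB_low",
                  "AB_high", "AB_low", "BA_high", "BA_low"), times = 3),
  measure = c("2", "6", "5", "3", "5", "5", "4", "6",
              "7", "8", "6", "7", "4", "12", "9", "7",
              "8", "12", "11", "9", "15", "11", "10", "16")
)

result <- aq |>
  mutate(
    measure = as.numeric(measure),                      # character -> numeric
    is_AABB = substr(code, 1, 2) %in% c("AA", "BB"),    # illustrative helper
    is_ABBA = substr(code, 1, 2) %in% c("AB", "BA")     # illustrative helper
  ) |>
  group_by(subject) |>
  mutate(
    mean_AABB = mean(measure[is_AABB]),  # mean over AA/BB rows of this subject
    mean_ABBA = mean(measure[is_ABBA])   # mean over AB/BA rows of this subject
  ) |>
  ungroup() |>
  select(-is_AABB, -is_ABBA)             # drop the helper columns again

print(result)
```

Each row keeps its original values and gains both means for its subject, so later calculations can use mean_AABB and mean_ABBA side by side. With the example data this should give 4 and 5 for CityA, 7 and 8 for CityB, and 10 and 13 for CityC.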