I have a data frame in R containing the following columns:
Here is a sample of my data:
df <- data.frame(
season = c("2015/2016", "2015/2016"),
stage = c(1, 1),
home_team_api_id = c(1, 2),
away_team_api_id = c(2, 1),
home_team_goal = c(3, 2),
away_team_goal = c(1, 3),
match_api_id = c(101, 102)
)
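I want to convert this data frame to long format, where each match has two rows: one for the home team and one for the away team. The converted data frame should contain the following columns:
Expected output: for the sample input above, the desired output would look like this: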
season stage match_api_id team_api_id opponent_team_api_id goals goals_conceded is_home
1 2015/2016 1 101 1 2 3 1 TRUE
2 2015/2016 1 101 2 1 1 3 FALSE
3 2015/2016 1 102 2 1 2 3 TRUE
4 2015/2016 1 102 1 2 3 2 FALSE
Here is what I have tried so far:
library(dplyr)
library(tidyr)

df_long <- df %>%
  pivot_longer(cols = c(home_team_api_id, away_team_api_id),
               names_to = "team_type",
               values_to = "team_api_id") %>%
  mutate(
    is_home = ifelse(team_type == "home_team_api_id", TRUE, FALSE),
    goals = ifelse(is_home, home_team_goal, away_team_goal),
    goals_conceded = ifelse(is_home, away_team_goal, home_team_goal)
  ) %>%
  select(match_api_id, season, stage, team_api_id, goals, goals_conceded, is_home)
# Append opponent_team_api_id based on match_api_id
df_long <- df_long %>%
  left_join(df %>%
              select(match_api_id, home_team_api_id, away_team_api_id),
            by = "match_api_id") %>%
  mutate(
    opponent_team_api_id = ifelse(is_home, away_team_api_id, home_team_api_id)
  ) %>%
  select(-home_team_api_id, -away_team_api_id)
This is my result:
match_api_id season stage team_api_id goals goals_conceded is_home
1 2015/2016 1 1 1 3 TRUE
2 2015/2016 1 2 1 3 FALSE
3 2015/2016 1 2 2 3 TRUE
4 2015/2016 1 1 3 2 FALSE
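What makes this difficult, and sets it apart from this question, is that I want to apply pivot_longer to two things at once: I want both the goals and the team IDs in long form.
How can I achieve this transformation in R? Any help would be greatly appreciated!
Thanks!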
I would use pivot_longer to go to 4 rows per match, then pivot_wider to come back to 2 rows per match.
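library(dplyr)
library(tidyr)

# Stack the four team columns, split each name into side (home/away) and variable, then spread the variables back out
main <- df |>
  pivot_longer(cols = c(home_team_api_id, away_team_api_id, home_team_goal, away_team_goal)) |>
  separate_wider_delim(name, delim = "_team_", names = c("is_home", "var")) |>
  pivot_wider(names_from = var, values_from = value)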
The result will be:
season stage match_api_id is_home api_id goal
<chr> <dbl> <dbl> <chr> <dbl> <dbl>
1 2015/2016 1 101 home 1 3
2 2015/2016 1 101 away 2 1
3 2015/2016 1 102 home 2 2
4 2015/2016 1 102 away 1 3
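As a possible shortcut, the same intermediate table can probably be built in a single pivot_longer call by using the ".value" sentinel in names_to. This is a minimal sketch rather than the approach above; the regular expression in names_pattern and the object name long are my own choices, and it assumes every pivoted column is named like home_team_* or away_team_*.

```r
library(tidyr)

df <- data.frame(
  season = c("2015/2016", "2015/2016"),
  stage = c(1, 1),
  home_team_api_id = c(1, 2),
  away_team_api_id = c(2, 1),
  home_team_goal = c(3, 2),
  away_team_goal = c(1, 3),
  match_api_id = c(101, 102)
)

# The first capture group (home/away) goes into the is_home column;
# ".value" tells pivot_longer that the second group (api_id, goal)
# names the output columns, so both variables are lengthened together.
long <- df |>
  pivot_longer(
    cols = c(home_team_api_id, away_team_api_id, home_team_goal, away_team_goal),
    names_to = c("is_home", ".value"),
    names_pattern = "(home|away)_team_(.*)"
  )
```

This skips the separate_wider_delim and pivot_wider steps, at the cost of a slightly more cryptic pivot_longer call.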
If it is absolutely necessary to keep the opponent's information (which seems redundant), make a flipped copy of the dataset and join it back.
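# Copy of main seen from the other side: flip home/away and prefix the value columns
sub <- main |>
  mutate(is_home = ifelse(is_home == "home", "away", "home")) |>
  rename_with(~ paste0("opponent_", .x), api_id:goal)

# Each row picks up the other team's row from the same match
complete <- left_join(main, sub, by = c("season", "stage", "match_api_id", "is_home")) |>
  mutate(is_home = is_home == "home")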
The result will be:
season stage match_api_id is_home api_id goal opponent_api_id opponent_goal
<chr> <dbl> <dbl> <lgl> <dbl> <dbl> <dbl> <dbl>
1 2015/2016 1 101 TRUE 1 3 2 1
2 2015/2016 1 101 FALSE 2 1 1 3
3 2015/2016 1 102 TRUE 2 2 1 3
4 2015/2016 1 102 FALSE 1 3 2 2
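If the exact column names from the question are wanted (team_api_id, opponent_team_api_id, goals, goals_conceded), another option that avoids pivoting altogether is to build a home view and an away view of each match and stack them. This is only a sketch of that alternative, not the pivot approach above; the object names home, away and result are made up for illustration.

```r
library(dplyr)

df <- data.frame(
  season = c("2015/2016", "2015/2016"),
  stage = c(1, 1),
  home_team_api_id = c(1, 2),
  away_team_api_id = c(2, 1),
  home_team_goal = c(3, 2),
  away_team_goal = c(1, 3),
  match_api_id = c(101, 102)
)

# One row per match from the home team's point of view
home <- df |>
  transmute(
    season, stage, match_api_id,
    team_api_id = home_team_api_id,
    opponent_team_api_id = away_team_api_id,
    goals = home_team_goal,
    goals_conceded = away_team_goal,
    is_home = TRUE
  )

# The same matches from the away team's point of view
away <- df |>
  transmute(
    season, stage, match_api_id,
    team_api_id = away_team_api_id,
    opponent_team_api_id = home_team_api_id,
    goals = away_team_goal,
    goals_conceded = home_team_goal,
    is_home = FALSE
  )

# Stack both views and sort so each match's home row comes first
result <- bind_rows(home, away) |>
  arrange(match_api_id, desc(is_home))
```

With only two sides per match this is arguably easier to read than a pivot, though it repeats the column mapping once per side.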