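I am working through Joseph Adler's book "Baseball Hacks". The dataset I am using comes from: https://www.seanlahman.com/baseball-archive/statistics/
The problem I want to solve: I need to correct the dataset for players who played for more than one team in a given year.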
setwd('C:/.../baseballdatabank-2023.1/core')
batting <- read.csv("Batting.csv")
# teams is used below but was never loaded; Teams.csv sits in the same core folder
teams <- read.csv("Teams.csv")
# note a lot of NA values because players may not have even had a single AB
# to make this data useful, reduce it to those players who have enough
# batting experience to qualify for a batting title
# to qualify, a player needs at least 3.1 plate appearances for each team game
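# named team_games rather than t, so base R's t() (transpose) is not masked
team_games <- subset(teams, select=c(teamID, yearID, G))
names(team_games) <- c("teamID", "yearID", "teamG")
b_and_t <- merge(batting, team_games, by=c("yearID", "teamID"))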
b_and_t$AVG <- b_and_t$H / b_and_t$AB
b_and_t$qualify <- (b_and_t$AB + ifelse(is.na(b_and_t$BB), 0, b_and_t$BB)
+ ifelse(is.na(b_and_t$HBP), 0, b_and_t$HBP)
+ ifelse(is.na(b_and_t$SF), 0, b_and_t$SF)) > 3.1 * b_and_t$teamG
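How do I aggregate the data for players who were on more than one team in a year? That is, how do I combine the statistics for a single player-year?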
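To make the goal concrete, here is a minimal sketch of the kind of result I am after, not a definitive solution. It uses base R's aggregate() to sum the counting stats per playerID and yearID. The toy data frames are made up for illustration, the names stat_cols, totals, games and season are hypothetical, and taking the largest teamG among a player's teams as the qualifying threshold is an assumption on my part, not something from the book.

```r
# Toy data, made up for illustration: player "a" played for two teams in 2000.
batting <- data.frame(
  playerID = c("a", "a", "b"),
  yearID   = c(2000, 2000, 2000),
  teamID   = c("X", "Y", "X"),
  AB  = c(300, 250, 100),
  H   = c(90, 80, 20),
  BB  = c(30, 20, NA),
  HBP = c(2, NA, 1),
  SF  = c(3, 2, NA)
)
teams <- data.frame(teamID = c("X", "Y"), yearID = c(2000, 2000), G = c(162, 162))

# Attach each stint's team games, as in the code above.
team_games <- teams[, c("teamID", "yearID", "G")]
names(team_games) <- c("teamID", "yearID", "teamG")
b_and_t <- merge(batting, team_games, by = c("yearID", "teamID"))

# Treat missing counting stats as 0 so the sums do not become NA.
stat_cols <- c("AB", "H", "BB", "HBP", "SF")
for (col in stat_cols) {
  b_and_t[[col]][is.na(b_and_t[[col]])] <- 0
}

# One row per player-year: sum the counting stats across all stints.
totals <- aggregate(cbind(AB, H, BB, HBP, SF) ~ playerID + yearID,
                    data = b_and_t, FUN = sum)

# Assumption: use the largest teamG among the player's teams that year.
games <- aggregate(teamG ~ playerID + yearID, data = b_and_t, FUN = max)

season <- merge(totals, games, by = c("playerID", "yearID"))

# Recompute rate stats from the summed totals, never by averaging stint averages.
season$AVG <- season$H / season$AB
season$qualify <- (season$AB + season$BB + season$HBP + season$SF) >
  3.1 * season$teamG
print(season)
```

The important design point is that AVG is recomputed from the summed H and AB. Averaging the per-team averages would weight a short stint the same as a long one.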