Here is the sample data:
stfips <- c("39","39","39")
year <- c("2023", "2023","2023")
industry_code <- c(112, 113, 114)
first_quarter_establishments <- c(987,654,321)
county <- data.frame(stfips, year, industry_code, first_quarter_establishments)
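The task at hand is to create a new column named period with the value "01". It is "01" because it represents the first quarter. If the name of the fourth column contained the word "second", the period would be "02", and so on. Below is what I got from ChatGPT, followed by the error it produces. Any idea how to create this period column based on the wording of the column name?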
first_columns <- grepl("first", names(county), ignore.case = TRUE)
county$period <- ifelse(first_columns, "01", "")
Error in `$<-.data.frame`(`*tmp*`, period, value = c("", "", "", "01")) :
replacement has 4 rows, data has 3
Desired end result:
stfips year industry_code first_quarter_establishments period
    39 2023           112                          987     01
    39 2023           113                          654     01
    39 2023           114                          321     01
I'm sure there are more elegant solutions, but in base R you can combine match() with gsub() to identify the quarter from the data frame's names():
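# Lookup table mapping the quarter word to its number
quarters <- c("first" = 1, "second" = 2,
              "third" = 3, "fourth" = 4)
# Strip everything from the first underscore onward, leaving the leading word
# of the fourth column's name, then look that word up in the table
county$quarter <- quarters[match(gsub("(.+?)(\\_.*)", "\\1", names(county[4])),
                                 names(quarters))]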
Output:
#   stfips year industry_code first_quarter_establishments quarter
# 1     39 2023           112                          987       1
# 2     39 2023           113                          654       1
# 3     39 2023           114                          321       1
If you change the column to the second quarter:
second_quarter_establishments <- c(987,654,321)
county <- data.frame(stfips, year, industry_code, second_quarter_establishments)
county$quarter <- quarters[match(gsub("(.+?)(\\_.*)", "\\1", names(county[4])),
                                 names(quarters))]
#   stfips year industry_code second_quarter_establishments quarter
# 1     39 2023           112                           987       2
# 2     39 2023           113                           654       2
# 3     39 2023           114                           321       2
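The question asks for a zero-padded character column named period ("01", "02", ...) rather than a numeric quarter, so here is one possible sketch of that variant. It assumes exactly one of the words first, second, third or fourth appears in the name of the fourth column; the helper name quarter_code is hypothetical and not part of the original post. It also sidesteps the original error, which arose because grepl() applied to names(county) returns one value per column (four) while a new column needs one value per row (three).

```r
# Sample data from the question
county <- data.frame(
  stfips = c("39", "39", "39"),
  year = c("2023", "2023", "2023"),
  industry_code = c(112, 113, 114),
  first_quarter_establishments = c(987, 654, 321)
)

# Hypothetical helper: map a single column name to a zero-padded quarter code
quarter_code <- function(col_name) {
  words <- c("first", "second", "third", "fourth")
  # Test each quarter word against the one column name (case-insensitive)
  found <- vapply(words, grepl, logical(1), x = col_name, ignore.case = TRUE)
  hit <- unname(which(found))
  if (length(hit) != 1) {
    stop("could not identify a unique quarter in: ", col_name)
  }
  # Format the position in `words` as a two-digit string
  sprintf("%02d", hit)
}

# The length-one result is recycled across all rows of the data frame
county$period <- quarter_code(names(county)[4])
```

Because the helper returns a single string, the assignment recycles it to every row, so the row count of the data frame never has to match the number of columns.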