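Question
I have a large data frame (388 x 729), and I am trying to calculate the frequency of new daffodil bulbs (numeric column) per month (factor column) per year (numeric column) over 14 years.
I have already managed to calculate the number of days per month per year on which the experiment took place, from 2012 to 2024.
I would now like to add a column to the data frame I created earlier with the R code below, showing the frequency of new daffodil bulbs per month per year.
Is there a way to do both calculations at the same time?
Many thanks in advance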
Data frame structure:
$ Year : num 2012 2012 2012 2012 2012 ...
$ Month : Factor w/ 18 levels "April","April ",..: 9 8 8 8 8 8 8 8 8 1 ...
$ Daffodil Bulbs : num 0 3 0 3 2 1 0 0 0 0 ...
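The 18 levels in Month (rather than 12) appear to come from entries that differ only by stray whitespace, such as "April" and "April ". The following is a minimal sketch of that effect in base R; the vector month_raw is made up for illustration and is not taken from the real data.

```r
# Hypothetical raw values: the same months typed with and without stray spaces
month_raw <- factor(c("April", "April ", "May", " May", "April"))
nlevels(month_raw)  # 4 levels, although only 2 distinct months are present

# Trim the whitespace, then rebuild the factor in calendar order
month_clean <- factor(trimws(as.character(month_raw)),
                      levels = month.name, ordered = TRUE)
nlevels(droplevels(month_clean))  # 2 levels once the duplicates are merged
```

Any value that still does not match an entry of month.name after trimming (a misspelling, for example) would become NA, so checking anyNA(month_clean) on the real column is a reasonable sanity test.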
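R code used to calculate the number of days per month per year on which the experiment took place (2012 to 2024):
Frequency_Days_Experiment <- MyDf %>%
  # trimws() merges levels such as "April " into "April" before ordering the months
  mutate(Month = factor(trimws(Month), levels = month.name, ordered = TRUE)) %>%
  group_by(Year, Month) %>%
  count()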
Dummy data:
tibble(
  Month = sample(month.name, 120, replace = TRUE),
  Year = sample(2012:2024, 120, replace = TRUE),
  Number_Daffodils = sample(1:5, 120, replace = TRUE)
)
Output of the R code:
# A tibble: 96 × 3
# Groups:   Year, Month [96]
    Year Month        n
   <dbl> <ord>    <int>
 1  2012 January      1
 2  2012 February     8
 3  2012 April       18
 4  2012 May         21
 5  2012 June        27
 6  2012 July        12
 7  2012 October     12
 8  2012 November     4
 9  2012 December     3
10  2013 February     2
# ℹ 86 more rows
Desired output:
# A tibble: 96 × 4
# Groups:   Year, Month, Frequency_Daffodils [96]
    Year Month        n     n
   <dbl> <ord>    <int> <int>
 1  2012 January      1    25
 2  2012 February     8     5
 3  2012 April       18    13
 4  2012 May         21    45
 5  2012 June        27     9
 6  2012 July        12    78
 7  2012 October     12    12
 8  2012 November     4    62
 9  2012 December     3     1
10  2013 February     2     8
# ℹ 86 more rows
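Here is the solution provided by Limey:
New_Bulbs_Frequency <- MyDf %>%
  mutate(Month = factor(trimws(Month), levels = month.name, ordered = TRUE)) %>%
  group_by(Year, Month) %>%
  # N = days with the experiment; New_Daffodils = total bulbs in that month.
  # Number_Daffodils is the bulb column in the dummy data; substitute your own column name.
  summarise(N = n(), New_Daffodils = sum(Number_Daffodils))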
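To check this approach end to end, here is a minimal sketch that runs the same pipeline on the dummy data from the question. It assumes the dummy tibble is assigned to MyDf and that the bulb counts are in Number_Daffodils; the set.seed() call and the .groups = "drop" argument are additions for reproducibility and an ungrouped result, not part of the original answer.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed so the dummy data is reproducible

# Dummy data from the question, assigned to MyDf for illustration
MyDf <- tibble(
  Month = sample(month.name, 120, replace = TRUE),
  Year = sample(2012:2024, 120, replace = TRUE),
  Number_Daffodils = sample(1:5, 120, replace = TRUE)
)

New_Bulbs_Frequency <- MyDf %>%
  mutate(Month = factor(trimws(Month), levels = month.name, ordered = TRUE)) %>%
  group_by(Year, Month) %>%
  summarise(
    N = n(),                                # rows (experiment days) per Year/Month
    New_Daffodils = sum(Number_Daffodils),  # total bulbs per Year/Month
    .groups = "drop"                        # return an ungrouped tibble
  )

New_Bulbs_Frequency
```

Both columns are computed in a single summarise() call, so no join with the earlier count() result is needed. If the real bulb column contains missing values, sum() returns NA for that group unless na.rm = TRUE is passed.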