How can I use lubridate and month() to set the time components when the month column is an ordered factor that they cannot read in R?


Question

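I used dplyr to count the number of observations per month for each year. To make sure the months sort correctly from January to December, I converted the month column to an ordered factor.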

I would like to use the lubridate package and its month() function to set the year and month components correctly for a time series analysis.

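lubridate's month() cannot handle an ordered factor (see the R code and error message below). I tried to unorder the column with x <- factor(x, ordered = FALSE), but I lost everything in the data frame except 'Month'.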

I also tried converting the 'Month' column to a plain factor, but got the following output:

 Bulbs$Month <- as.factor(Bulbs$Month)    

 #Error 

 Error in `$<-.data.frame`(`*tmp*`, Month, value = integer(0)) : 
  replacement has 0 rows, data has 96

Does anyone know how to convert an ordered factor back to a regular factor without losing the order of the levels?

Any help would be much appreciated.

Structure of the data frame after the dplyr calculation

'data.frame':   96 obs. of  4 variables:
 $ Year          : num  2012 2012 2012 2012 2012 ...
 $ Month           : Ord.factor w/ 12 levels "January"<"February"<..: 1 2 4 5 6 7 10 11 12 2 ...
 $ Number_Daffodils     : num  1 8 18 21 27 12 12 4 3 2 ...
 $ Frequency_New_Bulbs : num  7 59 144 193 NA NA 143 22 14 26 ..

R code

library(dplyr)
library(lubridate)

Bulbs<-MyDf %>% mutate(Month = factor(trimws(Month), levels = month.name, ordered = TRUE)) %>% 
                                group_by(Year, Month) %>% 
                                summarise(N = n(), Frequency_New_Bulbs = sum(Number_Daffodils))

#Set the components for the time series analysis

Bulbs <- janitor::clean_names(Bulbs)
Bulbs$Year <- lubridate::ymd(paste(Bulbs$year, Bulbs$month, "01", sep = "-"))
Bulbs$month = lubridate::month(Bulbs$month)

#When I run the line **Bulbs$month = lubridate::month(Bulbs$month)** I get this error message. 

Error in as.POSIXlt.character(as.character(x), ...) : 
  character string is not in a standard unambiguous format
In addition: Warning message:
tz(): Don't know how to compute timezone for object of class ordered/factor; returning "UTC". 

Dummy data frame

tibble(
       Month = sample(month.name, 120, replace = TRUE),
       Year = sample(2012:2024, 120, replace = TRUE),
       Number_Daffodils = sample(1:5, 120, replace = TRUE)
      ) 

Desired output

 year    month Number_Daffodils Frequency_New_Bulbs       date n_month
1 2015  January             36                   31 2015-01-01       1
2 2015 February             28                   28 2015-02-01       2
3 2015    March             39                   31 2015-03-01       3
4 2015    April             46                   30 2015-04-01       4
5 2015      May              5                    6 2015-05-01       5
6 2015     June              0                    0 2015-06-01       6    
r dplyr time-series tidyverse factors
1 Answer

If the levels of your Month factor are correct, you can convert it to an integer or use it directly with lubridate::make_date():

library(dplyr)

Bulbs |> 
  janitor::clean_names() |> 
  mutate(date = lubridate::make_date(year = year, month = month),
         m = as.integer(month))
#> # A tibble: 86 × 6
#> # Groups:   year [13]
#>     year month         n frequency_new_bulbs date           m
#>    <int> <ord>     <int>               <int> <date>     <int>
#>  1  2012 January       1                   2 2012-01-01     1
#>  2  2012 February      4                   9 2012-02-01     2
#>  3  2012 April         1                   4 2012-04-01     4
#>  4  2012 May           3                  10 2012-05-01     5
#>  5  2012 June          1                   2 2012-06-01     6
#>  6  2012 July          1                   2 2012-07-01     7
#>  7  2012 August        2                   6 2012-08-01     8
#>  8  2012 September     1                   2 2012-09-01     9
#>  9  2012 October       1                   3 2012-10-01    10
#> 10  2012 November      2                   9 2012-11-01    11
#> # ℹ 76 more rows
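
For completeness, here is a minimal end-to-end sketch built on the dummy data frame from the question. It assumes the month names match month.name exactly after trimws(). The seed, the .groups = "drop" argument, and the column names n_month, date and month_unordered are choices made for this illustration and are not part of the original code. As the error message in the question shows, lubridate::month() tried to parse the factor as a date-time, so the sketch takes the month number from the factor's level codes instead.

```r
library(dplyr)
library(lubridate)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Dummy data in the shape given in the question
MyDf <- tibble(
  Month = sample(month.name, 120, replace = TRUE),
  Year = sample(2012:2024, 120, replace = TRUE),
  Number_Daffodils = sample(1:5, 120, replace = TRUE)
)

Bulbs <- MyDf |>
  mutate(Month = factor(trimws(Month), levels = month.name, ordered = TRUE)) |>
  group_by(Year, Month) |>
  summarise(N = n(),
            Frequency_New_Bulbs = sum(Number_Daffodils),
            .groups = "drop") |>   # assumption: ungroup after summarising
  janitor::clean_names() |>
  mutate(
    # Level codes of the factor: January = 1, ..., December = 12
    n_month = as.integer(month),
    # First day of each month; make_date() defaults to day = 1
    date = make_date(year = year, month = n_month),
    # Plain (unordered) factor; passing levels explicitly keeps their order
    # and keeps levels that happen to be unused
    month_unordered = factor(month, levels = levels(month), ordered = FALSE)
  )

head(Bulbs)
```

Because the whole pipeline stays inside mutate(), the other columns of the data frame are left untouched. Once date exists, lubridate::month(Bulbs$date) also returns the month number, since it is then applied to a Date rather than to a factor.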