How to manage ordered factors with the lubridate functions ymd() and month() to set time components in R, since they cannot read them

Question

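I used dplyr to count the number of observations per month for each year. To make sure the months sort correctly from January to December, I converted Month to an ordered factor.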

I want to use the lubridate functions ymd() and month() to set the year and month components correctly for time series analysis.

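These lubridate functions cannot handle ordered factors (see the R code and error message below). I tried to un-order the column with

x <- factor(x, ordered = FALSE)

but I lost all the information in the data frame except Month.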

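I also tried converting the Month column to a plain factor, but I got the following output:

 Bulbs$Month <- as.factor(Bulbs$Month)

Error in `$<-.data.frame`(`*tmp*`, Month, value = integer(0)) : replacement has 0 rows, data has 96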

Does anyone know how to convert an ordered factor back to a normal factor without losing the order of the levels?

Structure of the data frame after the dplyr calculation:

'data.frame':   96 obs. of  4 variables:
 $ Year          : num  2012 2012 2012 2012 2012 ...
 $ Month           : Ord.factor w/ 12 levels "January"<"February"<..: 1 2 4 5 6 7 10 11 12 2 ...
 $ Number_Daffodils     : num  1 8 18 21 27 12 12 4 3 2 ...
 $ Frequency_New_Bulbs : num  7 59 144 193 NA NA 143 22 14 26 ..

R code:

library(dplyr)
library(lubridate)

Bulbs <- MyDf %>% mutate(Month = factor(trimws(Month), levels = month.name, ordered = TRUE)) %>% 
                                group_by(Year, Month) %>% 
                                summarise(N = n(), Frequency_New_Bulbs = sum(Number_Daffodils))

#Set the components for the time series analysis

Bulbs <- janitor::clean_names(Bulbs)
Bulbs$Year <- lubridate::ymd(paste(Bulbs$year, Bulbs$month, "01", sep = "-"))
Bulbs$month = lubridate::month(Bulbs$month)

# Running the line Bulbs$month = lubridate::month(Bulbs$month) gives this error message:

Error in as.POSIXlt.character(as.character(x), ...) : 
  character string is not in a standard unambiguous format
In addition: Warning message:
tz(): Don't know how to compute timezone for object of class ordered/factor; returning "UTC".
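
The error can be reproduced without lubridate. This is a minimal sketch, assuming (as the as.POSIXlt.character call in the message suggests) that month() first tries to convert its argument to a date-time. A month name on its own is not a parseable date, so the same conversion fails in base R. The variable names are illustrative only.

```r
# A month name stored as an ordered factor, as in the question
m <- factor("January", levels = month.name, ordered = TRUE)

# Converting it to a date-time fails: "January" alone is not a date
msg <- tryCatch(
  {
    as.POSIXlt(m)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# The integer codes of the factor are already the month numbers
as.integer(m)
```

Because the levels follow month.name, the factor already carries the month number, so no date parsing is needed to recover it.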

Dummy data frame

tibble(
       Month = sample(month.name, 120, replace = TRUE),
       Year = sample(2012:2024, 120, replace = TRUE),
       Number_Daffodils = sample(1:5, 120, replace = TRUE)
      )

Desired output

 year    month Number_Daffodils Frequency_New_Bulbs       date n_month
1 2015  January             36                   31 2015-01-01       1
2 2015 February             28                   28 2015-02-01       2
3 2015    March             39                   31 2015-03-01       3
4 2015    April             46                   30 2015-04-01       4
5 2015      May              5                    6 2015-05-01       5
6 2015     June              0                    0 2015-06-01       6

Answer 1

If the levels of your Month factor are correct, you can convert it to an integer or use it directly with lubridate::make_date():

library(dplyr)

Bulbs |> 
  janitor::clean_names() |> 
  mutate(date = lubridate::make_date(year = year, month = month),
         m = as.integer(month))
#> # A tibble: 86 × 6
#> # Groups:   year [13]
#>     year month         n frequency_new_bulbs date           m
#>    <int> <ord>     <int>               <int> <date>     <int>
#>  1  2012 January       1                   2 2012-01-01     1
#>  2  2012 February      4                   9 2012-02-01     2
#>  3  2012 April         1                   4 2012-04-01     4
#>  4  2012 May           3                  10 2012-05-01     5
#>  5  2012 June          1                   2 2012-06-01     6
#>  6  2012 July          1                   2 2012-07-01     7
#>  7  2012 August        2                   6 2012-08-01     8
#>  8  2012 September     1                   2 2012-09-01     9
#>  9  2012 October       1                   3 2012-10-01    10
#> 10  2012 November      2                   9 2012-11-01    11
#> # ℹ 76 more rows
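
The question also asks how to turn the ordered factor back into a plain factor without losing the level order. Here is a minimal base R sketch, using a small made-up vector rather than the asker's data. Passing the existing levels explicitly to factor() keeps all twelve levels in calendar order, and as.integer() returns the month number either way, so un-ordering is not required for the approach above.

```r
# Made-up example values; levels follow the calendar via month.name
month_ord <- factor(c("March", "January", "February"),
                    levels = month.name, ordered = TRUE)

# Drop the ordering but keep every level, in the same order
month_plain <- factor(month_ord, levels = levels(month_ord), ordered = FALSE)

is.ordered(month_plain)                      # FALSE
identical(levels(month_plain), month.name)   # TRUE

# The integer codes are the month numbers for both versions
as.integer(month_ord)    # 3 1 2
as.integer(month_plain)  # 3 1 2

# Inside a data frame, assign to the column, not to the whole object
df <- data.frame(year = c(2012, 2012, 2012), month = month_ord)
df$month <- factor(df$month, levels = levels(df$month), ordered = FALSE)
str(df)
```

Assigning the result to the data frame's own name instead of to the column (for example Bulbs <- factor(...)) would replace the whole data frame with a single factor. That is one possible explanation for the lost columns described in the question, though the question does not show the exact call.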

Answer 2

Without lubridate:

df |>
  mutate(
    n_month = match(Month, month.name), 
    date    = as.Date(sprintf("%d-%d-01", Year, n_month))
  )

   Month      Year Number_Daffodils n_month date
   <chr>     <int>            <int>   <int> <date>
 1 June       2018                1       6 2018-06-01
 2 June       2023                1       6 2023-06-01
 3 October    2023                5      10 2023-10-01
 4 March      2022                2       3 2022-03-01
 5 March      2017                5       3 2017-03-01
 6 March      2020                1       3 2020-03-01
 7 May        2018                1       5 2018-05-01
 8 December   2021                4      12 2021-12-01
 9 March      2015                4       3 2015-03-01
10 September  2015                2       9 2015-09-01
# ℹ 110 more rows
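
The same idea carries over when the month column is already an ordered factor, because match() compares factors by their labels. The following is a short base R sketch on made-up rows (the data frame bulbs and its values are hypothetical) that builds the date and n_month columns shown in the desired output.

```r
# Hypothetical rows with the month stored as an ordered factor
bulbs <- data.frame(
  year  = c(2015, 2015, 2015),
  month = factor(c("January", "February", "March"),
                 levels = month.name, ordered = TRUE)
)

# match() looks up each label in month.name, giving the month number
bulbs$n_month <- match(bulbs$month, month.name)

# Build a first-of-month date; %02d pads the month to two digits
bulbs$date <- as.Date(sprintf("%d-%02d-01", bulbs$year, bulbs$n_month))

print(bulbs)
```

The month column keeps its ordered levels throughout, so sorting and plotting by month still follow the calendar.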
