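I have a dataset with a character datetime column in the CEST/CET time zone (Central European local time). The UTC offset is given at the end of each value as +01:00 or +02:00. I want to convert the column to POSIXct so that I can later convert it to UTC, but on the long clock-change day in October, the extra hour from 2 am to 3 am is read incorrectly because the offset appears to be ignored.
My goal is for
reprex$datetime_CEST_CET_converted[4]
to return "2023-10-29 02:00:00 CET" rather than "2023-10-29 02:00:00 CEST".
Reprex: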
library(dplyr)
library(lubridate)
source <- data.frame(
  datetime_CEST_CET_character = c(
    "2023-10-29 00:00+02:00", "2023-10-29 01:00+02:00", "2023-10-29 02:00+02:00",
    "2023-10-29 02:00+01:00", "2023-10-29 03:00+01:00", "2023-10-29 04:00+01:00"
  )
)
reprex <- source %>%
  mutate(
    datetime_CEST_CET_converted = as.POSIXct(datetime_CEST_CET_character, tz = "Europe/Paris"),
    datetime_UTC = with_tz(datetime_CEST_CET_converted, tzone = "UTC")
  )
reprex$datetime_CEST_CET_converted[3]
reprex$datetime_CEST_CET_converted[4]
reprex$datetime_CEST_CET_converted[5] - hours(1)
After removing the colon from the time zone offset, I tried adding
format="%Y-%m-%d %H:%M+%z"
to as.POSIXct(), but the result is NA:
source_without_colon_in_timezone <- data.frame(
  datetime_CEST_CET_character = c(
    "2023-10-29 00:00+0200", "2023-10-29 01:00+0200", "2023-10-29 02:00+0200",
    "2023-10-29 02:00+0100", "2023-10-29 03:00+0100", "2023-10-29 04:00+0100"
  )
)
reprex_without_colon_in_timezone <- source_without_colon_in_timezone %>%
  mutate(
    datetime_CEST_CET_converted = as.POSIXct(datetime_CEST_CET_character, format="%Y-%m-%d %H:%M+%z", tz = "Europe/Paris"),
    datetime_UTC = with_tz(datetime_CEST_CET_converted, tzone = "UTC")
  )
reprex_without_colon_in_timezone$datetime_CEST_CET_converted[3]
reprex_without_colon_in_timezone$datetime_CEST_CET_converted[4]
reprex_without_colon_in_timezone$datetime_CEST_CET_converted[5] - hours(1)
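Write a function to do the conversion.
Below, the function convert_CEST_CET_UTC uses the base R pipe, so it does not depend on magrittr's pipe. It first splits each input string at the plus sign, separating the date-time from the time zone correction. Once these pieces have been piped to the appropriate lubridate functions, they are genuine date-time and period objects, so arithmetic on them works. The result of that arithmetic is the return value.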
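To make the splitting step concrete, here is a minimal sketch on a single value. The names x, parts, local_part and offset_part are illustrative only and do not appear in the original. Note that splitting at the plus sign assumes the offset is always positive, which holds for CET/CEST but not for time zones west of UTC.

```r
library(lubridate)

x <- "2023-10-29 02:00+01:00"

# Split at the literal plus sign: date-time on the left, offset on the right
parts <- strsplit(x, "+", fixed = TRUE)[[1]]

local_part  <- ymd_hm(parts[1])  # POSIXct; lubridate labels it UTC by default
offset_part <- hm(parts[2])      # Period of hours and minutes

class(local_part)
class(offset_part)
```

Because one piece is a POSIXct and the other a Period, lubridate can combine them with ordinary arithmetic operators.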
source <- data.frame(
  datetime_CEST_CET_character = c(
    "2023-10-29 00:00+02:00", "2023-10-29 01:00+02:00", "2023-10-29 02:00+02:00",
    "2023-10-29 02:00+01:00", "2023-10-29 03:00+01:00", "2023-10-29 04:00+01:00"
  )
)
convert_CEST_CET_UTC <- function(x) {
  # Split each string at the plus sign: local date-time, then UTC offset
  s <- x |> strsplit("\\+")
  sapply(s, `[[`, 2L) |> lubridate::hm() -> tz
  sapply(s, `[[`, 1L) |> lubridate::ymd_hm() -> hm
  # A +hh:mm offset means local time is ahead of UTC, so subtract it
  hm - tz
}
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
source %>%
  mutate(datetime_UTC = convert_CEST_CET_UTC(datetime_CEST_CET_character))
#> datetime_CEST_CET_character datetime_UTC
#> 1 2023-10-29 00:00+02:00 2023-10-28 22:00:00
#> 2 2023-10-29 01:00+02:00 2023-10-28 23:00:00
#> 3 2023-10-29 02:00+02:00 2023-10-29 00:00:00
#> 4 2023-10-29 02:00+01:00 2023-10-29 01:00:00
#> 5 2023-10-29 03:00+01:00 2023-10-29 02:00:00
#> 6 2023-10-29 04:00+01:00 2023-10-29 03:00:00
Created on 2024-10-01 with reprex v2.1.0
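As a possible alternative, base R's %z specifier can read the offset directly. The NA in the question most likely comes from the literal + in the format string: %z expects to consume the sign itself, so after the literal + has matched there is no sign left for it. The sketch below assumes the offset always sits at the end of the string; the names x, x_no_colon, utc and paris are illustrative.

```r
library(lubridate)

x <- c("2023-10-29 02:00+02:00", "2023-10-29 02:00+01:00")

# Drop the colon inside the trailing offset: "+02:00" -> "+0200"
x_no_colon <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", x)

# No literal "+" in the format: %z reads the sign and the four digits
utc <- as.POSIXct(x_no_colon, format = "%Y-%m-%d %H:%M%z", tz = "UTC")

# Same instants, displayed in Central European local time
paris <- with_tz(utc, "Europe/Paris")

format(utc, usetz = TRUE)
#> "2023-10-29 00:00:00 UTC" "2023-10-29 01:00:00 UTC"
format(paris, usetz = TRUE)
#> "2023-10-29 02:00:00 CEST" "2023-10-29 02:00:00 CET"
```

Parsing to UTC first and only then switching the display zone avoids asking R to guess which of the two 02:00 local times is meant.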