How to split a column into two columns to separate two values in a cell, when some cells do not contain both values, using dplyr in R


Issue

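I would like to split a column called 'Date' in my data frame into two columns called 'Date' and 'Survey_Number'.

The 'Date' column contains either a date, e.g. '08/04/2012', or a date and a survey number, e.g. '24/04/2024 [S2]' (see below).

I can split the cells using dplyr (code below); however, some cells do not contain both values to split, and I have found that the date and the survey number end up altered. Also, if only one survey was undertaken on a given day, the survey number associated with that date in the new 'Survey_Number' column should be zero.

I have tried the separate function with dplyr and the str_split_fixed function from stringr.

Can anyone help?

Thanks,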

R code

# Split the cell 'Date' to create two columns called 'Date' and 'Survey_Number'

library(dplyr)
library(tidyr)

# Split the Date column into Date and Survey_Number

Df <- Df %>% separate(Date, c('Date', 'Survey_Number'))
Df

Altered dates and survey numbers

  Df$Survey_Number
      [1] "01" "02" "02" "02" "02" "02" "02" "02" "02" "03" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
     [28] "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" 

  Df$Date
  [1] "08" "06" "07" "12" "13" "14" "15" "20" "22" "23" "08" "15" "15" "16" "16" "20" "21" "21" "23" "24" "24" "27" "28" "29" "29" "29" "30"
 [28] "02" "03" "03" "04" "04" "05" "05" "05" "08" "09" "10" "13" "23" "26" "26" "27" 

Expected output

[image: expected output]

Data frame

structure(list(Survey_No = 1:20, Date = structure(c(133L, 99L, 
114L, 235L, 256L, 282L, 301L, 424L, 475L, 499L, 139L, 305L, 306L, 
326L, 327L, 430L, 453L, 454L, 502L, 534L), levels = c("01/03/2016 [S1]", 
"01/03/2016 [S2]", "01/04/2017 [S1]", "01/04/2017 [S2]", "01/05/2014", 
"01/06/2012 [S1]", "01/06/2012 [S2]", "01/06/2012 [S3]", "01/06/2012?", 
"01/06/2015 [S1]", "01/06/2015 [S2]", "01/06/2015 [S3]", "01/06/2015 [S4]", 
"01/06/2024", "01/07/2012 [S1]", "01/07/2012 [S2]", "01/07/2024", 
"01/08/2024", "01/09/2021 [S1]", "01/09/2021 [S2]", "01/12/2012 [S1]", 
"01/12/2015", "01/12/2017", "01/12/2023 [S1]", "02/02/2021 [S1]", 
"02/02/2021 [S2]", "02/03/2014", "02/03/2016", "02/05/2012", 
"02/05/2016 [S1]", "02/05/2016 [S2]", "02/05/2016 [S3]", "02/05/2022 [S1]", 
"02/05/2022 [S2]", "02/05/2022 [S3]", "02/06/2012 [S1]", "02/06/2012 [S2]", 
"02/07/2012", "02/07/2014", "02/07/2024 [S1]", "02/07/2024 [S2]", 
"02/09/2023", "02/11/2012", "02/12/2015", "02/12/2018", "03/01/2023", 
"03/05/2012 [S1]", "03/05/2012 [S2]", "03/05/2013", "03/06/2012 [S1]", 
"03/06/2012 [S2]", "03/06/2012 [S3]", "03/06/2015 [S1]", "03/06/2015 [S2]", 
"03/07/2024", "03/09/2016 [S1]", "03/09/2016 [S2]", "03/09/2020", 
"03/11/2012 [S2]", "03/11/2015", "03/12/2014", "03/12/2023", 
"04/04/2017", "04/05/2012 [S1]", "04/05/2012 [S2]", "04/05/2013", 
"04/06/2015", "04/06/2022", "04/06/2023", "04/07/2012", "04/07/2023", 
"04/09/2019 [S1]", "04/09/2019 [S2]", "04/10/2014", "04/11/2013", 
"04/12/2023 [S1]", "04/12/2023 [S2]", "05/02/2018", "05/03/2020", 
"05/04/2017", "05/04/2021 [S1]", "05/04/2021 [S2]", "05/05/2012 [S1]", 
"05/05/2012 [S2]", "05/05/2012 [S3]", "05/06/2015", "05/06/2018 [S1]", 
"05/06/2018 [S2]", "05/06/2018 [S3]", "05/06/2018 [S4]", "05/06/2018 [S5]", 
"05/06/2022", "05/07/2024", "05/09/2019", "05/09/2020", "05/10/2023 [S1]", 
"05/12/2023 [S1]", "05/12/2023 [S2]", "06/02/2012", "06/02/2017", 
"06/03/2020", "06/04/2017 [S1]", "06/04/2017 [S2]", "06/05/2024 [S1]", 
"06/05/2024 [S2]", "06/05/2024 [S3]", "06/06/2015", "06/06/2018 [S1]", 
"06/06/2018 [S2]", "06/06/2022 [S1]", "06/06/2022 [S2]", "06/06/2022 [S3]", 
"06/07/2024", "07/02/2012", "07/02/2018", "07/02/2021", "07/03/2014", 
"07/05/2014", "07/05/2024 [S1]", "07/05/2024 [S2]", "07/06/2012", 
"07/06/2016 [S1]", "07/06/2016 [S2]", "07/06/2022 [S1]", "07/06/2022 [S2]", 
"07/06/2022 [S3]", "07/07/2024 [S1]", "07/07/2024 [S2]", "07/09/2013", 
"07/11/2013", "07/12/2023 [S1]", "07/12/2023 [S2]", "08/01/2012", 
"08/01/2022", "08/02/2021", "08/02/2024", "08/03/2013", "08/03/2022", 
"08/04/2012", "08/05/2012", "08/05/2015 [S1]", "08/05/2015 [S2]", 
"08/05/2015 [S3]", "08/05/2017", "08/05/2018 [S1]", "08/05/2018 [S2]", 
"08/05/2023", "08/05/2024 [S1]", "08/05/2024 [S2]", "08/06/2013", 
"08/06/2014", "08/06/2015", "08/06/2016", "08/06/2022", "08/07/2012", 
"08/09/2020", "08/10/2012 [S1]", "08/10/2012 [S2]", "08/10/2015", 
"08/11/2021", "09/02/2016", "09/02/2021", "09/02/2024", "09/03/2020", 
"09/05/2012", "09/05/2015 [S1]", "09/05/2015 [S2]", "09/05/2016 [S1]", 
"09/05/2016 [S2]", "09/05/2016 [S3]", "09/05/2016 [S4]", "09/05/2017", 
"09/05/2018 [S1]", "09/05/2018 [S2]", "09/05/2018 [S3]", "09/05/2022", 
"09/05/2024", "09/06/2012", "09/06/2015", "09/06/2022", "09/07/2012", 
"09/07/2013", "09/07/2024 [S1]", "09/07/2024 [S2]", "09/11/2012", 
"09/12/2016", "09/12/2018", "1/12/2013 [S1]", "1/12/2013 [S2]", 
"10/03/2020", "10/03/2021", "10/05/2012", "10/05/2013", "10/05/2016", 
"10/05/2017 [S1]", "10/05/2017 [S2]", "10/05/2024", "10/06/2014", 
"10/06/2015", "10/06/2022 [S1]", "10/06/2022 [S2]", "10/07/2024 [S1]", 
"10/07/2024 [S2]", "10/07/2024 [S3]", "10/11/2014", "10/11/2018", 
"10/11/2021", "10/12/2014", "10/12/2021 [S1]", "10/12/2021 [S2]", 
"11.01/2023 [S2]", "11/01/2021", "11/01/2023 [S1]", "11/03/2013 [S1]", 
"11/03/2013 [S2]", "11/03/2020", "11/03/2021", "11/03/2023", 
"11/05/2016", "11/06/2013", "11/06/2015 [S1]", "11/06/2015 [S2]", 
"11/06/2022 [S1]", "11/06/2022 [S2]", "11/07/2012", "11/07/2024 [S1]", 
"11/07/2024 [S2]", "11/07/2024 [S3]", "11/09/2020", "11/11/2014 [S1]", 
"11/11/2014 [S2]", "11/11/2014 [S3]", "11/11/2018", "11/12/2016", 
"12/02/2012", "12/03/2013", "12/04/2017", "12/05/2023", "12/06/2015", 
"12/06/2016", "12/06/2021 [S1]", "12/06/2021 [S2]", "12/06/2021 [S3]", 
"12/06/2021 [S4]", "12/06/2022 [S1]", "12/06/2022 [S2]", "12/09/2020", 
"12/10/2016 [S1]", "12/10/2016 [S2]", "12/11/2019", "12/11/2021", 
"12/12/2014 [S1]", "12/12/2014 [S2]", "12/12/2016", "12/12/2020", 
"13/02/2012", "13/02/2020", "13/02/2022", "13/04/2017", "13/04/2018 [S1]", 
"13/04/2018 [S2]", "13/05/2012", "13/05/2017 [S1]", "13/05/2017 [S2]", 
"13/05/2017 [S3]", "13/05/2018 [S1]", "13/05/2018 [S2]", "13/05/2019 [S1]", 
"13/05/2019 [S2]", "13/06/2016 [S1]", "13/06/2016 [S2]", "13/06/2016 [S3]", 
"13/06/2021", "13/06/2022 [S1]", "13/06/2022 [S2]", "13/10/2014", 
"13/10/2015", "13/10/2016 [S1]", "13/10/2016 [S2]", "13/10/2016 [S3]", 
"13/11/2013", "14/02/2012", "14/02/2015", "14/04/2019 [S1]", 
"14/04/2019 [S2]", "14/05/2014", "14/05/2019", "14/05/2022 [S1]", 
"14/05/2022 [S2]", "14/05/2022 [S3]", "14/05/2022 [S4]", "14/06/2016", 
"14/06/2022 [S1]", "14/06/2022 [S2]", "14/07/2012", "14/09/2013", 
"14/11/2017", "14/11/2021 [S1]", "15/01/2023 [S1]", "15/01/2023 [S2]", 
"15/02/2012", "15/02/2016", "15/02/2020", "15/03/2021", "15/04/2012 [S1]", 
"15/04/2012 [S2]", "15/04/2016", "15/04/2018", "15/04/2019 [S1]", 
"15/04/2019 [S2]", "15/05/2015 [S2]", "15/05/2016 [S1]", "15/05/2016 [S2]", 
"15/05/2016 [S3]", "15/06/2012", "15/06/2022", "15/10/2012", 
"15/11/2015", "15/11/2017", "15/11/2018", "15/12/2013", "15/12/2018 [S1]", 
"15/12/2018 [S2]", "16/02/2016", "16/02/2024", "16/04/2012 [S1]", 
"16/04/2012 [S2]", "16/04/2014 [S1]", "16/04/2014 [S2]", "16/04/2018 [S1]", 
"16/04/2018 [S2]", "16/04/2018 [S3]", "16/04/2018 [S4]", "16/04/2018 [S5]", 
"16/04/2022", "16/05/2016 [S1]", "16/05/2016 [S2]", "16/05/2023 [S1]", 
"16/05/2023 [S2]", "16/06/2019", "16/07/2012", "16/11/2015", 
"16/11/2017", "16/12/2014", "16/12/2018 [S1]", "16/12/2018 [S2]", 
"16/12/2018 [S3]", "17/01/2023", "17/02/2023 [S1]", "17/02/2023 [S2]", 
"17/02/2024 ", "17/03/2017", "17/04/2016 [S1]", "17/04/2016 [S2]", 
"17/04/2016 [S3]", "17/05/2017 [S1]", "17/05/2017 [S2]", "17/06/2012 [S11]", 
"17/06/2012 [S2]", "17/06/2012 [S3]", "17/06/2012 [S4]", "17/06/2022", 
"17/06/2023 [S1]", "17/06/2023 [S2]", "17/06/2023 [S3]", "17/06/2023 [S4]", 
"17/06/2024", "17/07/2012", "17/10/2012 [S1]", "17/11/2015", 
"17/12/2012", "17/12/2013", "17/12/2014 [S1]", "17/12/2014 [S2]", 
"17/14/2018 [S1]", "17/14/2018 [S2]", "17/14/2018 [S3]", "18/02/2016 [S1]", 
"18/02/2016 [S2]", "18/02/2024", "18/03/2019 [S1]", "18/03/2019 [S2]", 
"18/03/2019 [S3]", "18/04/2017 [S1]", "18/04/2017 [S2]", "18/04/2018 [S1]", 
"18/04/2018 [S2]", "18/05/2017 [S1]", "18/05/2017 [S2]", "18/05/2017 [S3]", 
"18/05/2017 [S4]", "18/06.2016 [S1]", "18/06.2016 [S2]", "18/06.2016 [S3]", 
"18/06.2016 [S4]", "18/06.2016 [S5]", "18/06/2012", "18/06/2024 [S1]", 
"18/10/2012 [S1]", "18/10/2012 [S2]", "18/11/2014", "18/11/2015 [S1]", 
"18/11/2015 [S2]", "18/12/2014 [S1]", "18/12/2017 [S1]", "18/12/2017 [S2]", 
"19/02/2024", "19/04/2014 [S1]", "19/04/2014 [S2]", "19/05/2015 [S1]", 
"19/05/2015 [S2]", "19/06/2014", "19/06/2016", "19/06/2022", 
"19/06/2024", "19/09/2013", "19/09/2022", "19/11/2013 [S1]", 
"19/11/2013 [S2]", "19/11/2015", "19/11/2023", "19/12/2016", 
"19/12/2017", "20/02/2012", "20/02/2019", "20/02/2020 [S1]", 
"20/02/2020 [S2]", "20/02/2024", "20/03/2021", "20/04/2012", 
"20/04/2014 [S1]", "20/04/2014 [S2]", "20/04/2018 [S1]", "20/04/2018 [S2]", 
"20/04/2018 [S3]", "20/05/2023", "20/06/2013", "20/06/2016", 
"20/06/2019", "20/06/2022", "20/06/2024 [S1]", "20/06/2024 [S2]", 
"20/09/2020", "20/10/2023", "20/11/2018 [S1]", "20/11/2023", 
"20/12/2016", "21/01/2023", "21/02/202", "21/02/2021", "21/02/2024 [S1[", 
"21/02/2024 [S2]", "21/04/2012 [S1]", "21/04/2012 [S2]", "21/04/2016", 
"21/04/2017 [S1]", "21/04/2017 [S2]", "21/05/2014", "21/05/2016 [S1]", 
"21/05/2016 [S2]", "21/05/2016 [S3]", "21/05/2023", "21/06/2012", 
"21/06/2013", "21/06/2024 [S1]", "21/11/2015", "21/11/2016 [S1]", 
"21/11/2016 [S2]", "21/11/2018", "21/11/2023 [S1]", "21/11/2023 [S2]", 
"21/12/2016 [S1]", "21/12/2016 [S2]", "21/12/2016 [S3]", "22/02/2012", 
"22/02/2013", "22/02/2022 [S2]", "22/02/2022 [S3]", "22/02/2024", 
"22/04/2014", "22/04/2017", "22/05/2016 [S1]", "22/05/2016 [S2]", 
"22/05/2023 [S1]", "22/05/2023 [S2]", "22/05/2023 [S3]", "22/06/2012 [S1]", 
"22/06/2012 [S2]", "22/06/2012 [S3]", "22/06/2013", "22/06/2024", 
"22/11/2013", "22/11/2015", "22/11/2016 [S1]", "22/11/2016 [S2]", 
"22/11/2019", "22/11/2021", "22/11/2023", "23/03/2012", "23/03/2014", 
"23/03/2023", "23/04/2012", "23/04/2014 [S1]", "23/04/2014 [S2]", 
"23/04/2018", "23/05/2012", "23/06/2012 [S1]", "23/06/2012 [S3]", 
"23/06/2013", "23/06/2014 {S1]", "23/06/2014 {S2]", "23/06/2014 {S3]", 
"23/06/2022", "23/07/2012 [S1]", "23/08/2020", "23/08/2021", 
"23/10/2012 [S1]", "23/10/2012 [S2]", "23/11/2013 [S1]", "23/11/2013 [S2]", 
"23/11/2015", "23/11/2023 [S1]", "23/11/2023 [S2]", "24/02/2013", 
"24/02/2016", "24/02/2019", "24/03/2014 [S1]", "24/03/2014 [S2]", 
"24/03/2023 [S1]", "24/03/2023 [S2]", "24/04/2016 [S1]", "24/04/2016 [S2]", 
"24/04/2018", "24/04/2024 [S1]", "24/06/2013", "24/06/2014", 
"24/06/2022", "24/06/2024 [S1]", "24/06/2024 [S2]", "24/06/2024 [S3]", 
"24/08/2020", "24/08/2021", "24/11/2015", "24/11/2023", "25/03/2014", 
"25/04/2014 [S1]", "25/04/2014 [S2]", "25/04/2014 [S3]", "25/04/2018", 
"25/05/2018 [S2]", "25/05/2021 [S1]", "25/05/2021 [S2]", "25/06/2012", 
"25/06/2024 [S1]", "25/06/2024 [S2]", "25/08/2021 [S1]", "25/08/2021 [S2]", 
"25/10/2013", "25/10/2014", "25/10/2023", "25/11/2015", "26/02/2013", 
"26/02/2021", "26/02/2024 [S1]", "26/03/2013", "26/04/2016 [S1]", 
"26/04/2016 [S2]", "26/04/2016 [S3]", "26/04/2016 [S4]", "26/05/2012 [S1]", 
"26/05/2012 [S2]", "26/05/2014 [S1]", "26/05/2014 [S2]", "26/05/2016 [S1]", 
"26/05/2016 [S2]", "26/05/2016 [S3]", "26/05/2018 [S1]", "26/06/2012 [S1]", 
"26/06/2012 [S2]", "26/06/2016 [S1]", "26/06/2016 [S2]", "26/06/2016 [S3]", 
"26/06/2016 [S4]", "26/06/2016 [S5]", "26/06/2024", "26/10/2023 [S1]", 
"26/10/2023 [S2]", "26/11/2014", "26/11/2015", "26/11/2016 [S1]", 
"26/11/2016 [S2]", "26/11/2018 [S1]", "26/11/2018 [S2]", "27/01/2021", 
"27/02/2015", "27/02/2016", "27/02/2020", "27/02/2021", "27/02/2024 [S1]", 
"27/02/2024 [S2]", "27/03/2014 [S1]", "27/03/2014 [S2]", "27/03/2018 [S1]", 
"27/03/2018 [S3]", "27/04/2012", "27/05/2012 [S1]", "27/06/2024", 
"27/10/2012 [S1]", "27/10/2012 [S2]", "27/10/2014", "27/10/2023 [S1]", 
"27/10/2023 [S2]", "27/11/2013 [S1]", "27/11/2013 [S2]", "27/11/2016", 
"27/11/2017 [S1]", "27/11/2017 [S2]", "27/11/2018", "27/11/2023", 
"27/12/2013", "28/02/2015", "28/02/2016 [S1]", "28/02/2016 [S2]", 
"28/02/2016 [S3]", "28/02/2020 [S1]", "28/03/2014 [S1]", "28/03/2014 [S2]", 
"28/03/2019 [S1]", "28/03/2019 [S2]", "28/03/2019 [S3]", "28/04/2012", 
"28/04/2017", "28/05/2014", "28/05/2021", "28/10/2014", "28/11/2013", 
"28/11/2015", "28/11/2016", "28/11/2017", "29/02/2016", "29/02/2024", 
"29/04/2012 [S1]", "29/04/2012 [S2]", "29/04/2012 [S3]", "29/04/2017 [S1]", 
"29/04/2017 [S2]", "29/04/2021", "29/05/2012", "29/05/2013", 
"29/06/2016", "29/07/2019 [S1]", "29/08/2013", "29/09/2016", 
"29/09/2022 [S1]", "29/09/2022 [S2]", "29/10/2012", "29/10/2014 [S1]", 
"29/10/2014 [S2]", "29/11/2015", "29/11/2017", "29/11/2018 [S1]", 
"29/11/2018 [S2]", "29/11/2021", "29/11/2023", "3/05/2014 [S1]", 
"3/05/2014 [S2]", "3/7/2014 [S1]", "3/7/2014 [S2]", "30/03/2014 [S1]", 
"30/04/2012", "30/04/2016 [S1]", "30/04/2016 [S2]", "30/04/2017 [S1]", 
"30/04/2017 [S2]", "30/05/2012 [S1]", "30/05/2012 [S2]", "30/05/2013", 
"30/05/2015", "30/05/2018 [S1]", "30/05/2018 [S2]", "30/05/2023", 
"30/06/2012", "30/06/2024", "30/08/2020", "30/08/2021", "30/09/2022", 
"30/11/2015", "30/11/2021", "30/11/2023", "31/03/2014", "31/03/2017 [S1]", 
"31/03/2017 [S2]", "31/05/2012", "31/05/2015", "31/05/2023 [S1]", 
"31/05/2023 [S2]", "31/10/2021", "31/10/2023", "4/12/2014 [S1]", 
"4/12/2014 [S2]", "5/12/2014 [S1]", "5/12/2014 [S2]", "8/11/2014 [S1]", 
"8/11/2014 [S2]", "9/11/2013 [S1]", "9/11/2013 [S2]", "9/12/2014 [S1]", 
"9/12/2014 [S2]", "9/12/2014 [S3]"), class = "factor")), row.names = c(NA, 
20L), class = "data.frame")
r dataframe variables dplyr split
1 Answer

You can use gsub() to do this. It searches the text for exactly the pattern you give it and replaces every match.

Here is an example:

df <- data.frame(Survey_No=1:20,Date=as.factor(c('08/01/2012','06/02/2012','07/02/2012',
          '12/02/2012','13/02/2012','14/02/2012','15/02/2012','20/02/2012','22/02/2012',
          '23/03/2012','08/04/2012','15/04/2012 [S1]','15/04/2012 [S2]','16/04/2012 [S1]',
          '16/04/2012 [S2]','20/04/2012','21/04/2012 [S1]','21/04/2012 [S2]','23/04/2012',
          '24/04/2024 [S1]')))

df$Date <- as.character(df$Date) # Later steps will treat date as a character, convert it from a factor
df$date_only <- gsub('\\s.*','',df$Date)
df$survnum_only <- ifelse(grepl('\\s',df$Date),gsub('.*\\s','',df$Date),NA)

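df$date_only <- gsub('\\s.*','',df$Date) matches a whitespace character (\\s) and any characters that follow it (.*) and replaces them with nothing, giving you a column that contains only the date.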
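As a quick check of the date pattern, here is a minimal sketch on a small character vector. The vector x and the parsed variable are illustrative names, not part of the original answer, and the conversion with as.Date() assumes the dates are in day/month/year order.

```r
# Illustrative values in the same shape as the Date column
# (the third one has a trailing space, as one entry in the question's data does)
x <- c("08/04/2012", "15/04/2012 [S1]", "17/02/2024 ")

# Remove the first whitespace character and everything after it
date_only <- gsub("\\s.*", "", x)

# Optional: convert the cleaned strings to Date objects (assumes dd/mm/yyyy)
parsed <- as.Date(date_only, format = "%d/%m/%Y")

print(date_only)
print(parsed)
```

Entries that do not fit the assumed format, such as 17/14/2018 or 11.01/2023 in the question's data, should come back as NA from as.Date() and would need to be corrected by hand.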

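df$survnum_only <- ifelse(grepl('\\s',df$Date),gsub('.*\\s','',df$Date),NA) first checks whether df$Date contains whitespace (grepl('\\s',df$Date)). If it does not, survnum_only is set to NA. In rows where df$Date does contain whitespace, gsub('.*\\s','',df$Date) removes every character before the whitespace along with the whitespace itself (.*\\s), leaving only the survey number.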
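The survey-number step leaves values such as "[S1]" and depends on whitespace being present. A possible variation, sketched below, looks for the S followed by digits instead, which also copes with entries like '17/02/2024 ' (trailing space) and '23/06/2014 {S2]' (mismatched bracket) that appear in the question's data. The variable names here are illustrative, and treating a missing survey number as 0 is an assumption based on the question's wording.

```r
# Illustrative values, including two irregular entries from the question's data
x <- c("08/04/2012", "15/04/2012 [S1]", "17/02/2024 ", "23/06/2014 {S2]")

# TRUE where an "S" followed by at least one digit is present
has_survey <- grepl("S[0-9]+", x)

# Capture just the digits after "S"; NA where no survey number exists
survey_num <- as.integer(ifelse(has_survey, sub(".*S([0-9]+).*", "\\1", x), NA))

# If zero is preferred over NA for days with a single survey
survey_num_zero <- ifelse(is.na(survey_num), 0L, survey_num)

print(survey_num)
print(survey_num_zero)
```

Keeping NA rather than 0 may be the safer default, since it distinguishes "no survey number recorded" from a real value.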

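Before doing any of this, the code converts the Date variable from a factor (its data type in the original example) to a character variable, so that functions expecting character input handle it correctly.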
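Since the question used tidyr's separate(), the same split can also be sketched with that function. By default separate() splits on any run of non-alphanumeric characters, so the slashes inside the date are treated as separators, which would explain the altered output shown in the question. Passing an explicit sep and fill = "right" avoids that. The small data frame below is made up for illustration.

```r
library(tidyr)

# Illustrative data frame in the same shape as the question's
Df <- data.frame(
  Survey_No = 1:3,
  Date = factor(c("08/04/2012", "15/04/2012 [S1]", "15/04/2012 [S2]"))
)

# Work with character values rather than a factor
Df$Date <- as.character(Df$Date)

# Split on the space only; rows without a survey number get NA on the right
Df <- separate(Df, Date, into = c("Date", "Survey_Number"),
               sep = " ", fill = "right")

print(Df)
```

This sketch assumes a single space between the date and the survey number; entries with trailing spaces or other irregularities would still need cleaning first.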
