Insert new rows indicating the time gap between rows


I am working with speech transcripts:

  Utterance                       Starttime_ms Endtime_ms
  <chr>                                  <dbl>      <dbl>
1 on this                                  210        780
2 okay                                    3403       3728
3 cool thanks everyone um                 4221       5880
4 so yes in terms of our projects         5910      11960
5 let's have a look so the               11980      13740
6 LGBTQ plus                             13813      16110

and would like to insert a new row after each Utterance indicating the time gap between it and the next Utterance. The desired output looks something like this:

  Utterance                       Starttime_ms Endtime_ms
  <chr>                                  <dbl>      <dbl>
1 on this                                  210        780
  NA                                       780       3403
2 okay                                    3403       3728
  NA                                      3728       4221
3 cool thanks everyone um                 4221       5880
  NA                                      5880       5910
4 so yes in terms of our projects         5910      11960
  NA                                     11960      11980
5 let's have a look so the               11980      13740
  NA                                     13740      13813
6 LGBTQ plus                             13813      16110

I know how to do this with data.table:

library(data.table)
unq <- c(0, sort(unique(setDT(df)[, c(Starttime_ms, Endtime_ms)])))
df <- df[.(unq[-length(unq)], unq[-1]), on=c("Starttime_ms", "Endtime_ms")]

But I am looking for a dplyr solution.

Data:

df <- structure(list(
  Utterance = c("on this", "okay", "cool thanks everyone um",
                "so yes in terms of our projects",
                "let's have a look so the", "LGBTQ plus"),
  Starttime_ms = c(210, 3403, 4221, 5910, 11980, 13813),
  Endtime_ms = c(780, 3728, 5880, 11960, 13740, 16110)
), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
Tags: r, dplyr

1 answer
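library(dplyr)

df |>
  # Turn each row into the gap that precedes it: the gap starts at the
  # previous row's end time and ends at this row's start time.
  mutate(Utterance = NA_character_,
         local({
           data.frame(Starttime_ms = lag(Endtime_ms), Endtime_ms = Starttime_ms)
         })) |>
  filter(!is.na(Starttime_ms)) |>  # the first row has no preceding gap
  bind_rows(df) |>                 # add the original utterances back
  arrange(Starttime_ms)            # interleave gaps and utterances by time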

Output

   Utterance                       Starttime_ms Endtime_ms
   <chr>                                  <dbl>      <dbl>
 1 on this                                  210        780
 2 NA                                       780       3403
 3 okay                                    3403       3728
 4 NA                                      3728       4221
 5 cool thanks everyone um                 4221       5880
 6 NA                                      5880       5910
 7 so yes in terms of our projects         5910      11960
 8 NA                                     11960      11980
 9 let's have a look so the               11980      13740
10 NA                                     13740      13813
11 LGBTQ plus                             13813      16110
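
For comparison, here is a sketch of the same idea written with lead() instead of lag(): each utterance produces the gap that follows it rather than the gap that precedes it. The temporary column names gap_start and gap_end and the object names gaps and result are made up for this illustration; they are not part of the question or of dplyr. It assumes the rows are already sorted by Starttime_ms and that utterances do not overlap.

```r
library(dplyr)

# Same sample data as in the question
df <- tibble(
  Utterance = c("on this", "okay", "cool thanks everyone um",
                "so yes in terms of our projects",
                "let's have a look so the", "LGBTQ plus"),
  Starttime_ms = c(210, 3403, 4221, 5910, 11980, 13813),
  Endtime_ms = c(780, 3728, 5880, 11960, 13740, 16110)
)

# One gap row per utterance: it starts where the utterance ends
# and ends where the next utterance starts.
# gap_start / gap_end are hypothetical helper names, used so the
# original columns are not overwritten while they are still needed.
gaps <- df |>
  transmute(
    Utterance = NA_character_,
    gap_start = Endtime_ms,
    gap_end = lead(Starttime_ms)
  ) |>
  filter(!is.na(gap_end)) |>  # the last utterance has no following gap
  rename(Starttime_ms = gap_start, Endtime_ms = gap_end)

# Combine utterances and gaps, then order them chronologically
result <- bind_rows(df, gaps) |>
  arrange(Starttime_ms)

print(result)
```

Using temporary names avoids the ordering pitfall in mutate(), where overwriting Starttime_ms first would change the value seen by a later expression in the same call. If the gap duration is also wanted, adding mutate(Gap_ms = Endtime_ms - Starttime_ms) to gaps should be enough (Gap_ms is again a made-up name).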