Stuck on a calculation in R
Suppose I have the following data frame:
Name | Date | Count
Bob | 2019-03-03 | 253
Bob | 2019-03-03 | 253
Bob | 2019-03-02 | 252
Bob | 2019-03-01 | 251
Tim | 2019-03-04 | 257
Tim | 2019-03-04 | 257
Tim | 2019-03-04 | 256
Tim | 2019-03-03 | 254
My goal is to engineer a column with the absolute change, like so:
Name | Date | Count | Change
Bob | 2019-03-03 | 253 | 0
Bob | 2019-03-03 | 253 | 1
Bob | 2019-03-02 | 252 | 1
Bob | 2019-03-01 | 251 | 0
Tim | 2019-03-04 | 257 | 0
Tim | 2019-03-04 | 257 | 1
Tim | 2019-03-04 | 256 | 2
Tim | 2019-03-03 | 254 | 0
Obviously I can do
df %>% group_by(Name) %>% arrange(desc(Date)) %>% arrange(desc(Count))
but after that I'm lost. Would I somehow mutate(Change = Count)?
We can group_by Name and use lead from dplyr to subtract the next row's value from the current row.
library(dplyr)
df %>%
  group_by(Name) %>%
  mutate(Change = Count - lead(Count, default = last(Count)))
# Name Date Count Change
# <chr> <chr> <dbl> <dbl>
#1 Bob 2019-03-03 253 0
#2 Bob 2019-03-03 253 1
#3 Bob 2019-03-02 252 1
#4 Bob 2019-03-01 251 0
#5 Tim 2019-03-04 257 0
#6 Tim 2019-03-04 257 1
#7 Tim 2019-03-04 256 2
#8 Tim 2019-03-03 254 0
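An equivalent dplyr variant, in case you would rather keep lead()'s default NA and clean it up afterwards (a sketch of the same idea, not part of the answer above):
library(dplyr)
df %>%
  group_by(Name) %>%
  # lead() leaves NA in the last row of each group; coalesce() turns that into 0
  mutate(Change = coalesce(Count - lead(Count), 0)) %>%
  ungroup()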
A base R approach using ave:
with(df, ave(Count, Name, FUN = function(x) c(x[-length(x)] - x[-1], 0)))
#[1] 0 1 1 0 0 1 2 0
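The call above only prints the vector; to attach it as a column, assign it back to the data frame (still base R):
df$Change <- with(df, ave(Count, Name, FUN = function(x) c(x[-length(x)] - x[-1], 0)))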
A solution using dplyr together with diff from base R:
library(dplyr)
library(tidyr)
df2 <- df %>%
  group_by(Name) %>%
  mutate(Change = c(-diff(Count), 0)) %>%
  ungroup()
df2
# # A tibble: 8 x 4
# Name Date Count Change
# <chr> <chr> <int> <dbl>
# 1 Bob 2019-03-03 253 0
# 2 Bob 2019-03-03 253 1
# 3 Bob 2019-03-02 252 1
# 4 Bob 2019-03-01 251 0
# 5 Tim 2019-03-04 257 0
# 6 Tim 2019-03-04 257 1
# 7 Tim 2019-03-04 256 2
# 8 Tim 2019-03-03 254 0
Data
df <- read.table(text = "Name|Date|Count
Bob|'2019-03-03'|253
Bob|'2019-03-03'|253
Bob|'2019-03-02'|252
Bob|'2019-03-01'|251
Tim|'2019-03-04'|257
Tim|'2019-03-04'|257
Tim|'2019-03-04'|256
Tim|'2019-03-03'|254",
header = TRUE, stringsAsFactors = FALSE, sep = "|")
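Note that with this read.table() call, Date comes in as character; if you need proper date ordering rather than string ordering, you could convert it first (optional, an assumption about your downstream use):
df$Date <- as.Date(df$Date)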
Using data.table:
library(data.table)
setDT(df)[, Change := Count - shift(Count, fill = last(Count), type = 'lead'), Name][]
# Name Date Count Change
#1: Bob 2019-03-03 253 0
#2: Bob 2019-03-03 253 1
#3: Bob 2019-03-02 252 1
#4: Bob 2019-03-01 251 0
#5: Tim 2019-03-04 257 0
#6: Tim 2019-03-04 257 1
#7: Tim 2019-03-04 256 2
#8: Tim 2019-03-03 254 0
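If you would rather not compute a per-group fill value, a data.table variant (a sketch, not from the answer above) is to leave shift()'s fill as NA and zero it out with nafill():
library(data.table)
# shift() puts NA in the last row of each group; nafill() replaces it with 0
setDT(df)[, Change := nafill(Count - shift(Count, type = 'lead'), fill = 0), Name][]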