Suppose the data are:
data <- structure(list(country = c("Poland", "Poland", "Poland", "Poland",
"Poland", "Poland", "Portugal", "Portugal", "Portugal", "Portugal",
"Portugal", "Portugal", "Spain", "Spain", "Spain", "Spain", "Spain",
"Spain"), Code = c("POL", "POL", "POL", "POL", "POL", "POL",
"PRT", "PRT", "PRT", "PRT", "PRT", "PRT", "ESP", "ESP", "ESP",
"ESP", "ESP", "ESP"), year = c(1950, 1951, 1952, 1953, 1954,
1955, 1950, 1951, 1952, 1953, 1954, 1955, 1950, 1951, 1952, 1953,
1954, 1955), IV = c(3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1)), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L), class = "data.frame")
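How can I take the difference of the IV values within each panel (country)? In other words, I want to subtract the 1950 IV value from the 1951 value, then 1952-1951, 1953-1952, 1954-1953, and 1955-1954, and do the same for every country. In the resulting data set (let's call it "newdata"), each year's IV value should show its difference from the previous year's IV value; 1950 should simply be empty.
Does anyone have a suggestion? I hope my question isn't confusing.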
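Perhaps use diff inside ave: ave applies the function to IV separately for each Code, and c(NA, diff(x)) pads the first year of each country with NA.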
> transform(data, dif=ave(IV, Code, FUN=\(x) c(NA, diff(x))))
country Code year IV dif
1 Poland POL 1950 3 NA
2 Poland POL 1951 3 0
3 Poland POL 1952 1 -2
4 Poland POL 1953 3 2
5 Poland POL 1954 3 0
6 Poland POL 1955 1 -2
7 Portugal PRT 1950 1 NA
8 Portugal PRT 1951 1 0
9 Portugal PRT 1952 1 0
10 Portugal PRT 1953 1 0
11 Portugal PRT 1954 1 0
12 Portugal PRT 1955 1 0
13 Spain ESP 1950 1 NA
14 Spain ESP 1951 1 0
15 Spain ESP 1952 3 2
16 Spain ESP 1953 1 -2
17 Spain ESP 1954 3 2
18 Spain ESP 1955 1 -2
Data (the question's IV column, shuffled so that the differences are not all zero):
set.seed(42)
data$IV <- sample(data$IV)
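The question asked for the result to be stored as "newdata", and diff works on row order rather than on the year values, so a slightly more defensive version might look like the sketch below. It assumes the rows may not arrive sorted, rebuilds the data in compact form with the shuffled IV values shown in the output above typed in by hand (so it does not depend on the random seed), and uses function(x) instead of \(x) so it also runs on R versions before 4.1.

```r
# Rebuild the example data in compact form; IV uses the shuffled values
# printed in the output above, entered by hand.
data <- data.frame(
  country = rep(c("Poland", "Portugal", "Spain"), each = 6),
  Code    = rep(c("POL", "PRT", "ESP"), each = 6),
  year    = rep(1950:1955, times = 3),
  IV      = c(3, 3, 1, 3, 3, 1,
              1, 1, 1, 1, 1, 1,
              1, 1, 3, 1, 3, 1)
)

# Assumption: input rows may be unsorted. diff() follows row order,
# so sort by country code and year before differencing.
data <- data[order(data$Code, data$year), ]

# First difference of IV within each country.
# The first year of each country has no predecessor, so it gets NA.
newdata <- transform(
  data,
  dif = ave(IV, Code, FUN = function(x) c(NA, diff(x)))
)

print(newdata)
```

Sorting by Code reorders the countries (ESP, POL, PRT), but the differences within each country are the same as in the output above.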