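I have a dataset of reservoir volumes (9 columns, one per reservoir), and I want to use lookup (or a similar function) in R to convert the volumes to levels using a rating table (my rating table has two columns, volume and level). If I use lookup, it does not interpolate between the rating table values and returns a series of NA values.
Note: the rating table does not have the same number of rows as the dataset.
My code is similar to this: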
storagelevel1 <- lookup(storagevolume[,1], storagerating)
This code looks up the level in storagerating (two columns, volumes and levels) using the volumes from storagevolume, but it does not interpolate within storagerating.
How can I make it interpolate, or at least find the closest match in the lookup table? Thanks.
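The word "level" caused some confusion. Level here means the physically measured water level (depth). Each level corresponds to a specific volume, for example: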
level volume
0 0
1 30
2 35
3 37
4 38
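Because of the geometry of the reservoir, these relationships are usually not linear and jump somewhat irregularly. The question is how to find the volume when the measured level is, say, 3.5 (37.5 in this example). The approx function does linear interpolation between the table rows.
My solution: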
cap <- data.frame(level = 0:4, volume = c(0, 20, 33, 35, 36))  # rating table
res <- data.frame(level = c(0.5, 1.2, 2.7, 3.2))               # measured levels
# approx() is vectorised over xout, so no lapply() is needed
vol <- approx(cap$level, cap$volume, xout = res$level)
plot(cap$level, cap$volume, type = "b")
points(vol$x, vol$y, col = "red", pch = 19)                    # interpolated points
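For the direction asked in the question (volume to level), the same call should work with the two columns swapped. A minimal sketch using the example table from the top of this answer; the names `rating`, `vol_at` and `level_at` are made up for illustration, and it assumes volume increases strictly with level so that the inverse is well defined:

```r
# Rating table from the example above (level vs. volume)
rating <- data.frame(level = 0:4, volume = c(0, 30, 35, 37, 38))

# Level -> volume: linear interpolation between table rows
vol_at <- approx(rating$level, rating$volume, xout = 3.5)$y
vol_at
# [1] 37.5

# Volume -> level: swap x and y (valid because volume is strictly increasing)
level_at <- approx(rating$volume, rating$level, xout = c(15, 36, 37.5))$y
level_at
# [1] 0.5 2.5 3.5
```

Volumes outside the range of the table come back as NA unless `rule = 2` is passed, which clamps them to the end values instead.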
As 42 commented, it is hard to help if we do not know what you are doing. That said, does this code give you any insight?
storagerating <- data.frame(volumes = c(10, 100, 1000, 10000),
levels = c("A","B","C","D"))
# volumes levels
#1 10 A
#2 100 B
#3 1000 C
#4 10000 D
z <- 888 # a random volume
storagerating$levels[which.min(abs(z - storagerating$volumes))] # closest rating
#[1] C
#Levels: A B C D
Edit: vectorized solutions
z <- round(runif(300, 1, 10000)) # random volumes
# OPTION 1: sapply
z_levels1 <- sapply(z, function(x) storagerating$levels[which.min(abs(x - storagerating$volumes))])
z_levels1
# OPTION 2: for loop
z_levels2 <- integer(length(z))   # store row indices, so this also works when levels is character
for(i in seq_along(z)){
  z_levels2[i] <- which.min(abs(z[i] - storagerating$volumes))
}
storagerating$levels[z_levels2]
# OPTION 3: make a function with sapply
lookup <- function(x, volumes){
sapply(x, function(x) which.min(abs(x - volumes)))
}
storagerating$levels[lookup(z, storagerating$volumes)]
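A different technique from the options above, shown only as a sketch: when the table volumes are sorted, the nearest row can also be found without any loop by cutting at the midpoints between neighbouring volumes with `findInterval`. The names `volumes`, `labels`, `mids` and `z_demo` are illustrative. Note that a value exactly on a midpoint goes to the upper row here, whereas `which.min` picks the lower one.

```r
volumes <- c(10, 100, 1000, 10000)   # must be sorted in increasing order
labels  <- c("A", "B", "C", "D")

# Midpoints between neighbouring table volumes: 55, 550, 5500
mids <- (head(volumes, -1) + tail(volumes, -1)) / 2

z_demo <- c(5, 56, 888, 5499, 20000)

# findInterval() counts how many midpoints lie at or below each value;
# adding 1 turns that count into the row index of the nearest volume
nearest <- labels[findInterval(z_demo, mids) + 1]
nearest
# [1] "A" "B" "C" "C" "D"
```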
Edit 2: interpolation
storagerating <- data.frame(volumes = seq.int(100,400,100),
levels = c(1:4))
storagerating # given this
# volumes levels
#1 100 1
#2 200 2
#3 300 3
#4 400 4
mod <- lm(levels ~ volumes, data = storagerating) # linear model example
df_new <- data.frame(volumes = z) # new volumes to estimate; the output below matches z <- seq(1, 399, 2), not the random z above
levels_new <- predict(mod, newdata = df_new) # must be data.frame with same var name
storagerating_new <- cbind(df_new, levels_new)
head(storagerating_new); tail(storagerating_new)
# volumes levels_new
#1 1 0.01
#2 3 0.03
#3 5 0.05
#4 7 0.07
#5 9 0.09
#6 11 0.11
# volumes levels_new
#195 389 3.89
#196 391 3.91
#197 393 3.93
#198 395 3.95
#199 397 3.97
#200 399 3.99
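A straight-line fit reproduces the table only when the rating really is linear, as in this toy table. For a curved rating table, piecewise-linear interpolation with `approx` stays on the table points. Below is a sketch that applies it to every column of a volume data frame. The data are made up: three columns stand in for the nine reservoirs, and a single shared rating table is assumed, whereas in practice each reservoir probably has its own. The helper `to_level` is likewise hypothetical.

```r
# Hypothetical rating table (volumes must be increasing)
storagerating <- data.frame(volumes = c(0, 30, 35, 37, 38),
                            levels  = 0:4)

# Hypothetical volume data: one column per reservoir
storagevolume <- data.frame(res1 = c(15, 36, 38),
                            res2 = c(0, 32.5, 37.5),
                            res3 = c(30, 35, 50))   # 50 lies outside the table

# Interpolate one vector of volumes against the rating table.
# rule = 1 (the default) gives NA outside the table range;
# rule = 2 would clamp to the nearest end value instead.
to_level <- function(v) {
  approx(storagerating$volumes, storagerating$levels, xout = v, rule = 1)$y
}

# Apply to every column; the result has the same shape as storagevolume
storagelevel <- as.data.frame(lapply(storagevolume, to_level))
storagelevel
```

The number of rows in the rating table does not need to match the number of rows in the data, since each volume is interpolated independently.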