How do I set na.rm = TRUE when using a function from an R package (EcoSimR)?


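For each month and site, I am trying to estimate niche overlap (RA4 model) with a specific package (EcoSimR), as described in this example. However, I found that whenever all columns of a given row are 0, the loop breaks with an error saying that na.rm is FALSE. I'm not sure how or where I can set na.rm = TRUE in the returned list or in the loop. Any suggestions?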

#install.packages("EcoSimR")
library(EcoSimR)
library(truncnorm)  # provides rtruncnorm()

set.seed(111)
month <- rep(c("J","J","J","F"), each = 4)
site <- rep(c("1","2","3","1"), each = 4)
species <- rep(c("A","B","C","D"), times = 4)
q1 <- rtruncnorm(n=16, a=0, b=10, mean=0, sd=1)
q2 <- rtruncnorm(n=16, a=0, b=10, mean=1, sd=1)
q3 <- rtruncnorm(n=16, a=0, b=10, mean=0, sd=1)
q4 <- rtruncnorm(n=16, a=0, b=10, mean=1, sd=1)
q5 <- rtruncnorm(n=16, a=0, b=10, mean=1, sd=1)

df <- data.frame(month, site, species,q1,q2,q3,q4,q5)
df[1,c(4:8)] <- 0
df[6,c(4:8)] <- 0
df[15,c(4:8)] <- 0

get_eco_sim_result <- function(spd, algo= "ra4", metric = "pianka", nReps=500) {
  model = niche_null_model(speciesData = spd,
                           algo = algo,metric =metric, nReps = nReps, suppressProg = TRUE
  )
  return(list(
    Obs = model$Obs,
    Sim = mean(model$Sim, na.rm = TRUE),
    lower_1tailp = quantile(model$Sim,0.05),
    SES = (model$Obs - mean(model$Sim))/sd(model$Sim)
  ))
}

output <- do.call(
  rbind, lapply(split(df, list(month,site), drop=T), \(d) {
    data.frame(get_eco_sim_result(d[,-c(1,2,3)], nReps=5000))
  })
)

output 

When the data contain rows of zeros, it throws this error:

Error: Error in quantile.default(model$Sim, 0.05) : 
  missing values and NaN's not allowed if 'na.rm' is FALSE 

1 Answer

The problem is the rows of 0s that you inserted. First, let's generate your data.

library(EcoSimR)
#> Loading required package: MASS
library(truncnorm)
set.seed(111)
month <- rep(c("J","J","J","F"), each = 4)
site <- rep(c("1","2","3","1"), each = 4)
species <- rep(c("A","B","C","D"), times = 4)
q1 <- rtruncnorm(n=16, a=0, b=10, mean=0, sd=1)
q2 <- rtruncnorm(n=16, a=0, b=10, mean=1, sd=1)
q3 <- rtruncnorm(n=16, a=0, b=10, mean=0, sd=1)
q4 <- rtruncnorm(n=16, a=0, b=10, mean=1, sd=1)
q5 <- rtruncnorm(n=16, a=0, b=10, mean=1, sd=1)

df <- data.frame(month, site, species,q1,q2,q3,q4,q5)
df[1,c(4:8)] <- 0
df[6,c(4:8)] <- 0
df[15,c(4:8)] <- 0

The niche_null_model() function ultimately calls the function you specify in metric. In your case, that is pianka(), whose code is as follows:

pianka
#> function (m = matrix(rpois(80, 1), nrow = 10)) 
#> {
#>     m <- m/rowSums(m)
#>     pairwise <- cbind(t(combn(nrow(m), 2)), 0)
#>     for (i in 1:nrow(pairwise)) pairwise[i, 3] <- sum(m[pairwise[i, 
#>         1], ] * m[pairwise[i, 2], ])/sqrt(sum(m[pairwise[i, 1], 
#>         ]^2) * sum(m[pairwise[i, 2], ]^2))
#>     return(mean(pairwise[, 3]))
#> }
#> <bytecode: 0x7fcb17961700>
#> <environment: namespace:EcoSimR>

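Note that the first line of code normalizes the matrix m so that each row sums to 1. When you have a row of all 0s, the row sum is also 0, so dividing each 0 by the row sum produces a row of NaNs: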

s <- split(df, list(month,site), drop=T)
m <- s[[1]][,-(1:3)]
m/rowSums(m)
#>            q1         q2        q3        q4        q5
#> 13 0.03100336 0.19049927 0.1672990 0.1953235 0.4158749
#> 14 0.58478135 0.06378461 0.1619580 0.1256290 0.0638470
#> 15        NaN        NaN       NaN       NaN       NaN
#> 16 0.04609515 0.38693966 0.1419413 0.2213337 0.2036902
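
The same effect can be reproduced without EcoSimR. The following is a minimal base R sketch on a made-up 3 x 3 matrix (the values are illustrative only, not taken from the question's data):

```r
# Toy resource-use matrix (made-up values): species in rows, resources in columns.
m <- matrix(c(2, 1, 1,
              0, 0, 0,
              1, 3, 0),
            nrow = 3, byrow = TRUE)

row_totals <- rowSums(m)        # the second row sums to 0
normalized <- m / row_totals    # row 2 becomes 0/0, which R evaluates to NaN

normalized
is.nan(normalized[2, ])         # TRUE for every entry in the all-zero row
```

Rows with a positive total are rescaled to sum to 1, while the all-zero row cannot be rescaled at all.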

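Because the code of pianka() performs a series of pairwise comparisons, any comparison that involves one of the NaN rows evaluates to NaN. Here are the pairwise comparisons that pianka() generates for the first subset data frame you pass in: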

s <- split(df, list(month,site), drop=T)
m <- s[[1]][,-(1:3)]
m <- m/rowSums(m)
pairwise <- cbind(t(combn(nrow(m), 2)), 0)
for (i in 1:nrow(pairwise)) pairwise[i, 3] <- sum(
  m[pairwise[i, 1], ] * 
  m[pairwise[i, 2], ])/
  sqrt(sum(m[pairwise[i, 1], ]^2) * 
  sum(m[pairwise[i, 2], ]^2))

pairwise
#>      [,1] [,2]      [,3]
#> [1,]    1    2 0.3295038
#> [2,]    1    3       NaN
#> [3,]    1    4 0.8422317
#> [4,]    2    3       NaN
#> [5,]    2    4 0.3598199
#> [6,]    3    4       NaN

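Note that three of the pairwise values are NaN. The pianka() function returns the mean of column 3 of the pairwise matrix. Because that column contains NaN values, the mean evaluates to NaN. Notice that the result for the fourth subset data frame is not NaN, because its matrix has no row of zeros.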

pianka(s[[4]][,-(1:3)])
#> [1] 0.7784348
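
Since subsets without an all-zero row give a finite value, one possible workaround is to drop such rows before the data reach the null model. The sketch below is only an illustration: drop_zero_rows() is a hypothetical helper, not part of EcoSimR, the example values are made up, and it assumes non-negative data. Whether removing species with no recorded use is appropriate for the analysis is a separate judgment call.

```r
# Hypothetical helper (not part of EcoSimR): keep only rows with a non-zero total.
# Assumes all values are non-negative, as in the question's data.
drop_zero_rows <- function(spd) {
  spd[rowSums(spd) > 0, , drop = FALSE]
}

# Made-up example data with one all-zero row.
spd <- data.frame(q1 = c(0.5, 0, 1.2),
                  q2 = c(1.0, 0, 0.3),
                  q3 = c(0.2, 0, 0.9))

clean <- drop_zero_rows(spd)
normalized <- clean / rowSums(clean)   # same normalization step as in pianka()

nrow(clean)                            # the all-zero row is gone
any(is.nan(as.matrix(normalized)))     # no NaN left after normalization
```

In the question's loop this could be applied as get_eco_sim_result(drop_zero_rows(d[, -c(1, 2, 3)]), nReps = 5000). A group still needs at least two remaining rows, because pianka() builds its pairs with combn(nrow(m), 2).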

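So the problem is that by including a row of all zeros, you cause every simulated value to be NaN. That remains true regardless of whether na.rm=TRUE is specified as an argument: removing the NaNs leaves nothing to summarize.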
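
To see why na.rm = TRUE does not rescue the summary statistics, here is a small base R sketch in which a vector of NaNs stands in for model$Sim (the vector length is arbitrary and chosen only for illustration):

```r
# Stand-in for model$Sim when every simulated overlap value is NaN.
sim <- rep(NaN, 5)

sim_mean <- mean(sim, na.rm = TRUE)            # all values are dropped, so the mean is NaN
sim_q05  <- quantile(sim, 0.05, na.rm = TRUE)  # no error any more, but the result is NA

sim_mean
sim_q05
```

Adding na.rm = TRUE to quantile() would make the error message disappear, but the returned values would still be unusable.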
    
