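I am running the procD.lm function from the R (4.0.0) package Geomorph (3.3.0) to test model fit, where specimen body shape data are related to any number of variables (e.g. specimen sex, specimen species, habitat type, year the specimen was collected). I also have the R packages RRPP (0.6.0), Stereomorph (1.6.3), and phyloseq (1.32.0) loaded. I have constructed a geomorph.data.frame to hold my data (named gdfNEW).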
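When the variable in question contains categorical data (e.g. habitat), the function works perfectly, and I am able to produce an ANOVA table showing goodness of fit for each variable (coords, log(Csize), and Habitat).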
fitNEW <- procD.lm(coords ~ log(Csize) * Habitat, data = gdfNEW, SS.type = "II")
anova(fitNEW)
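However, when the variable in question contains numerical data (e.g. year collected, month collected, number of predators present, water depth of the habitat), I get an error message.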
fitNEW <- procD.lm(coords ~ log(Csize) * Year, data = gdfNEW, SS.type = "II")
Error: Independent variables are missing from either the data frame or global environment,
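One difference between the data for these two classes of variables is that "categorical" data appear in the geomorph.data.frame as lists of characters, while "numerical" data appear as lists of integers. To resolve this discrepancy, I used the as.character function to change each list of integers to a list of characters, and then built a new geomorph.data.frame in which all data are presented as lists of characters.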
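When this did not change the error message, I used the as.factor function to change all of the data to factors with X levels (where X is different for each variable), and then constructed another new geomorph.data.frame. This did not change the error message either.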
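For reference, the two conversions described above can be sketched in base R roughly as follows. The vector year_int is a hypothetical stand-in for an integer variable such as Year; it is not taken from gdfNEW, and the sketch does not involve procD.lm itself.

```r
# Hypothetical integer vector standing in for a numeric covariate such as Year
year_int <- c(2000L, 2000L, 1945L, 1945L, 1999L)

# First attempt: convert the integers to character strings
year_chr <- as.character(year_int)

# Second attempt: convert the integers to a factor
year_fac <- as.factor(year_int)

# Inspect the resulting types
class(year_int)    # "integer"
class(year_chr)    # "character"
class(year_fac)    # "factor"
levels(year_fac)   # "1945" "1999" "2000"
```

Each converted vector would then replace the original component when the new geomorph.data.frame is built.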
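Some of my variables do have NA values, which are reflected in the geomorph.data.frames wherever data are missing for a given specimen. However, this does not seem to be the problem. All of the lists are the same length.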
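A minimal sketch of how the length and NA checks mentioned above might be done, assuming the components can be accessed like elements of a named list (as the $ access in the dput() calls further down suggests). The object gdf_example, its contents, and the helper n_obs are all hypothetical and serve only to illustrate the check.

```r
# Hypothetical stand-in for a geomorph.data.frame: a named list whose
# components all describe the same 5 specimens
gdf_example <- list(
  coords  = array(rnorm(3 * 2 * 5), dim = c(3, 2, 5)),  # landmarks x dims x specimens
  Csize   = c(10.2, 11.5, 9.8, 12.1, 10.9),
  Habitat = c("Sinkhole", "Sinkhole", "Marsh", "Marsh", NA),
  Year    = c(2000L, 2000L, 1945L, NA, 1999L)
)

# Number of specimens in a component: the last dimension of an array
# (an assumption that fits a landmarks x dims x specimens layout),
# otherwise the plain vector length
n_obs <- function(v) {
  if (is.array(v)) dim(v)[length(dim(v))] else length(v)
}

obs_per_component <- sapply(gdf_example, n_obs)
na_per_component  <- sapply(gdf_example, function(v) sum(is.na(v)))

obs_per_component   # specimens covered by each component
na_per_component    # count of NA entries in each component

# TRUE if every component covers the same number of specimens
length(unique(obs_per_component)) == 1
```

Running the same two sapply() calls on the real data frame would show at a glance which components carry NA values and whether any component differs in length.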
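I ran the same script last year (with fewer variables) and did not have this problem. I have checked and confirmed that it is not an issue with updates to R or to the package versions. I must be doing something differently this time, but I cannot figure out what.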
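Below, I provide incomplete samples of the data for one "categorical" and one "numerical" variable. The first two are taken from the original geomorph.data.frame "gdfNEW", in which "Habitat" is a list of characters and "Year" is a list of integers. The next two are taken from the second geomorph.data.frame "gdfNEW.b", in which both "Habitat" and "Year" are lists of characters. The final two are taken from the third geomorph.data.frame "gdfNEW.c", in which "Habitat" is a factor with 8 levels and "Year" is a factor with 34 levels.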
Thank you for any help. Please let me know if there is any other information I can provide.
dput(gdfNEW$Habitat)
c("Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole",
"Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole",
"Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole",
"Marine", "Marine", "Marsh", "Marsh", "Marsh", "Marsh", "Marsh",
"Marsh", "Marsh", "Marsh", "Marsh", "Marsh", "Marsh", "Marsh",
"Marsh", "Marsh", "Marsh", "Marsh", "Marsh", "Marsh", "Marsh",
"Marsh", "Marsh", "Marsh", "Marsh", "Marsh", "Marsh", "Marsh")
dput(gdfNEW$Year)
c(2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L,
2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L,
2000L, 2000L, 2000L, 2000L, 1945L, 1945L, 1945L, 1945L, 1945L,
1945L, 1945L, 1945L, 1945L, 1945L, 1945L, 1945L, 1945L, 1945L,
1945L, 1945L, 1945L, 1945L, 1945L, 1945L, 1945L, 1945L, 1945L,
1945L, 1945L, 1945L, 1999L, 1999L, 1999L, 1999L, 1999L, 1999L)
dput(gdfNEW.b$Habitat)
c("Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole",
"Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole",
"Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole",
"Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Sinkhole", "Creek",
"Creek", "Creek", "Creek", "Creek", "Creek", "Creek", "Creek",
"Creek", "Creek", "Creek", "Creek", "Creek", "Creek", "Creek")
dput(gdfNEW.b$Year)
c("2000", "2000", "2000", "2000", "2000", "2000", "2000", "2000",
"2000", "2000", "2000", "2000", "2000", "2000", "2000", "2000",
"2000", "2000", "2000", "2000", "2000", "2000", "1945", "1945",
"1945", "1945", "1945", "1945", "1945", "1945", "1945", "1945",
"1945", "1945", "1945", "1945", "1945", "1945", "1945", "1945",
"1945", "1945", "1945", "1945", "1945", "1945", "1945", "1945")
dput(gdfNEW.c$Habitat)
structure(c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("Cienega",
"Creek", "HotMarsh", "Lake", "Marine", "Marsh", "River", "Sinkhole"
), class = "factor")
dput(gdfNEW.c$Year)
structure(c(33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L,
33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 12L,
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L,
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 32L,
32L, 32L, 32L, 32L, 32L, 32L, 32L, 32L, 32L, 32L, 32L, 32L, 32L), .Label =
c("1926", "1927", "1930", "1931", "1932", "1935",
"1938", "1939", "1940", "1941", "1942", "1945", "1947", "1950",
"1951", "1955", "1958", "1959", "1960", "1961", "1966", "1967",
"1968", "1972", "1975", "1976", "1987", "1989", "1990", "1996",
"1997", "1999", "2000", "2003"), class = "factor")