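I'm trying to write this formula in R, where i = each value of category (category can be 1, 2, 3 or 4).
This is my attempt at the code, but R prints this error message:
Error in lm(category ~ (year * state * district) + year + state + district +  :
  formal argument "data" matched by multiple actual arguments
I'm trying to create the summation, which is why I added multiple arguments after data. Is there another way to write the summation that avoids the error message? I searched online but couldn't find anything similar; I guess it's rare to add a summation to a regression. Thanks in advance for your help.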
ID <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,
17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,
33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48)
year <- c(1980,1980,1980,1980,1980,1980,1980,1980,1980,1980,1980,1980,1980,1980,1980,1980,
1981,1981,1981,1981,1981,1981,1981,1981,1981,1981,1981,1981,1981,1981,1981,1981,
1982,1982,1982,1982,1982,1982,1982,1982,1982,1982,1982,1982,1982,1982,1982,1982)
state <- c("NY","NY","NY","NY","NY","NY","NY","NY","CA","CA","CA","CA","CA","CA","CA","CA",
"NY","NY","NY","NY","NY","NY","NY","NY","CA","CA","CA","CA","CA","CA","CA","CA",
"NY","NY","NY","NY","NY","NY","NY","NY","CA","CA","CA","CA","CA","CA","CA","CA")
district <- c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2,
1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2,
1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2)
quantity <- c(100,200,45,87,65,32,94,52,67,72,14,53,28,94,12,41,
10,20,45,87,65,32,8,52,67,1,14,53,28,94,12,41,
1000,2000,45,87,9,32,94,5,6,7,1,5,2,9,1,4)
category <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,
1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,
1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
df <- data.frame(ID,year,state,district,quantity,category)
df$year <- as.factor(df$year)
df$state <- as.factor(df$state)
df$district <- as.factor(df$district)
df$category <- as.factor(df$category)
print(df)
# force regression baseline values
# relevel() returns a new factor, so the result has to be assigned back
df$year <- relevel(df$year, ref = '1981')
df$district <- relevel(df$district, ref = '2')
# r1 is when y = 1
r1 <- lm( category ~ (year*state*district) +
quantity + district + state + year,
data = subset(df, year == 1980)
+
(year*state*district) +
quantity + district + state + year,
data = subset(df, year == 1981)
+
(year*state*district) +
quantity + district + state + year,
data = subset(df, year == 1980)
)
summary(r1)
# r2 is when y = 2
r2 <- lm( category ~ (year*state*district) +
year + state + district + quantity,
data = subset(df, year == 1980)
+
(year*state*district) +
year + state + district + quantity,
data = subset(df, year == 1981)
+
(year*state*district) +
year + state + district + quantity,
data = subset(df, year == 1980)
)
summary(r2)
r3 and r4 then follow the same pattern.
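On the error message itself: lm() has a single data argument. In the calls above, the comma after the last predictor ends the first data = ... argument, so the next data = subset(...) is read as a second data argument rather than as another summand. A minimal sketch that reproduces the same kind of failure in isolation, using a made-up data frame d (hypothetical, not the data above):

```r
# Hypothetical toy data, only to reproduce the error in isolation.
d <- data.frame(y = c(1, 2, 3, 4), x = c(2, 4, 5, 9))

# Passing `data` twice fails while R matches the arguments,
# before any model is fitted.
msg <- tryCatch(
  {
    lm(y ~ x, data = d, data = d)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

So the message says nothing about the statistics; it only reports that one named argument was supplied more than once.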
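There are several problems here:
The summation line near the beginning needs a coefficient multiplying each term.
What does year[i, y] mean? It is not defined.
Linear regression is not appropriate for a categorical response. Assuming the categories are unordered, we could use multinomial logistic regression.
An interaction generally requires that all of its lower-order interactions be included as well.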
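On the last point, the * operator in an R formula already takes care of this. As a small sketch (it only inspects the formula, so no data is needed; the names full and redundant are introduced here for illustration), terms() can be used to list what year * state * district expands to, and to check that re-adding the main effects afterwards adds nothing new:

```r
# `a * b * c` expands to all main effects plus all lower-order interactions.
full <- attr(terms(category ~ year * state * district), "term.labels")
print(full)

# Re-adding the main effects changes nothing: duplicate terms are dropped.
redundant <- attr(
  terms(category ~ (year * state * district) + quantity + district + state + year),
  "term.labels"
)

# The only term not already produced by the three-way product.
print(setdiff(redundant, full))
```

In other words, the + district + state + year part of the original formula is harmless but unnecessary; only quantity contributes an extra term.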
Perhaps you want this:
library(nnet)
fm <- multinom(category ~ year/(district * state) + district + state + quantity, df)
summary(fm)
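To see which terms that formula generates, one option is to expand it with terms(). In R's formula syntax a/b is shorthand for a + b %in% a, so, roughly speaking, the district and state effects inside the parentheses are allowed to differ by year. A short sketch (the names f and labs are introduced here for illustration):

```r
# Expand the model formula into its individual terms (no data needed).
f <- category ~ year/(district * state) + district + state + quantity
labs <- attr(terms(f), "term.labels")
print(labs)
```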
fm is an object of class "multinom", which has the following methods:
methods(class = "multinom")
## [1] add1 anova coef confint drop1 extractAIC
## [7] logLik model.frame predict print summary vcov
## see '?methods' for accessing help and source code
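As a rough illustration of a few of these methods, the sketch below refits the model and calls coef() and predict(). It assumes the nnet package is installed, rebuilds the question's data in a compact form with rep() (the ID column is left out because the model does not use it), and passes trace = FALSE only to silence the iteration log. The names probs and pred are introduced here for illustration.

```r
library(nnet)  # assumed to be installed; it is one of R's recommended packages

# Compact rebuild of the question's data (same values, ID column omitted).
df <- data.frame(
  year     = factor(rep(c(1980, 1981, 1982), each = 16)),
  state    = factor(rep(rep(c("NY", "CA"), each = 8), times = 3)),
  district = factor(rep(rep(c(1, 2), each = 4), times = 6)),
  quantity = c(100, 200, 45, 87, 65, 32, 94, 52, 67, 72, 14, 53, 28, 94, 12, 41,
               10, 20, 45, 87, 65, 32, 8, 52, 67, 1, 14, 53, 28, 94, 12, 41,
               1000, 2000, 45, 87, 9, 32, 94, 5, 6, 7, 1, 5, 2, 9, 1, 4),
  category = factor(rep(1:4, times = 12))
)

fm <- multinom(category ~ year/(district * state) + district + state + quantity,
               data = df, trace = FALSE)

# One row of coefficients per non-baseline category (baseline is category 1).
print(coef(fm))

# Fitted probabilities: one row per observation, one column per category.
probs <- predict(fm, type = "probs")
print(head(round(probs, 3)))

# Most probable category per observation, cross-tabulated against the truth.
pred <- predict(fm, type = "class")
print(table(observed = df$category, predicted = pred))
```

With only 48 rows and a perfectly balanced design, the fit itself is not expected to be informative; the point is just how the methods are called.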
For help interpreting the output, see https://stats.oarc.ucla.edu/r/dae/multinomial-logistic-regression/