I am trying to run multiple linear regressions on a nested data frame. Here is a sample of my data:
df <- read.csv(text = 'Subcat,Date,COMM1,COMM2,UOM,AUC_TYPE,WINNING_PRICE
1,2017-03-07,40750,41400,"MT","English",35000
1,2017-03-15,40750,40000,"MT","English",35600
2,2017-10-16,41000,40500,"METER","Yankee",56440
2,2017-11-06,41010,40510,"METER","Yankee",52000
2,2019-01-26,50010,50510,"METER","English",50000
3,2017-03-07,40750,41400,"MT","English",56900
3,2018-05-26,50010,50510,"MT","English",47000
3,2019-01-21,40750,40200,"MT","English",56000
3,2019-01-21,40750,40200,"MT","English",55900
4,2017-11-08,37500,39000,"LTR","Dynamic Sealbid",67000
4,2017-11-08,37500,39000,"LTR","Dynamic Sealbid",65900')
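The factor/character variables were converted to dummy variables, and the data was then nested by subcategory:
df2 <- df[, -2] %>% group_by(Subcat) %>% nest()
The output is a nested data frame with the columns Subcat and data. I am trying to run a regression model that predicts the winning price for each subcategory, using the following code: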
df2 <- df[, -2] %>%
  group_by(Subcat) %>%
  nest() %>%
  mutate(fit = map(data, ~ lm(WINNING_PRICE ~ ., data = .)),
         results = map(fit, augment)) %>%
  unnest()
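This produces the following error output:
Error: Input must be list of vectors
In addition: Warning message:
`cols` is now required.
Please use `cols = c(data, fit, results)`
Also, the data frame df2 is not displayed in the console.
I have already consulted the question 'Running multiple simple linear regressions from a nested dataframe/tibble'.
Thanks in advance!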
I think this should work:
model_fn <- function(df1) {
  lm(WINNING_PRICE ~ AUC_TYPE, data = df1)
}
fitted_bestel <- df2 %>%
  mutate(fit = map(data, model_fn))
The error comes from the two dots you are using (one standing for all covariates, the other standing for the data).
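To make the named-function approach concrete, here is a minimal sketch of the whole pipeline from nesting through to the augmented results. It rests on several assumptions that are not in the original post: the tiny data set is made up for illustration, the single numeric predictor COMM1 is used in place of AUC_TYPE so that every group can be fitted, and only the results column is passed to `unnest()` through `cols`, as the warning quoted in the question asks for an explicit `cols` argument. Whether this resolves the error on the asker's real data has not been verified.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Made-up illustration data (an assumption, not the asker's data)
df <- data.frame(
  Subcat = rep(1:2, each = 4),
  COMM1 = c(1, 2, 3, 4, 2, 4, 6, 8),
  WINNING_PRICE = c(10, 12, 15, 17, 30, 33, 41, 44)
)

# Named model function, so no dot is needed inside map()
model_fn <- function(d) {
  lm(WINNING_PRICE ~ COMM1, data = d)
}

res <- df %>%
  group_by(Subcat) %>%
  nest() %>%
  mutate(fit = map(data, model_fn),        # one lm object per subcategory
         results = map(fit, augment)) %>%  # fitted values and residuals per row
  ungroup() %>%
  select(Subcat, results) %>%              # keep only the column to be unnested
  unnest(cols = results)

print(res)
```

The model objects in fit are dropped before unnesting because they are not data frames or vectors; if they are needed later, the nested tibble can be stored before the `select()` step.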