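Addressing list objects in a for loop

Question  Votes: 0  Answers: 2

I have a data frame with a column that contains lm fits. When I address a specific row of this column with [[2]], I get the summary output of that linear model. This works perfectly, but since the column has 959 rows, I want to write a for loop that runs an anova on each of these regressions. How do I specify that I want to address all objects in that list in a for loop?

To give you a good understanding, here is an MWE:

Data frame: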

structure(list(Week = 7:17, Category = c("2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2"), Brand = c("3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3"), Display = c(0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0), Sales = c(0, 0, 0, 0, 13.440948, 40.097397, 
32.01384, 382.169189, 2830.748779, 4524.460938, 1053.590576), 
    Price = c(0, 0, 0, 0, 5.949999, 5.95, 5.950003, 4.87759, 
    3.787015, 3.205987, 4.898724), Distribution = c(0, 0, 0, 
    0, 1.394019, 1.386989, 1.621416, 8.209759, 8.552915, 9.692097, 
    9.445554), Advertising = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0), lnSales = c(11.4945151554497, 11.633214247508, 11.5862944141137, 
    11.5412559646132, 11.4811122484454, 11.4775106999991, 11.6333660772506, 
    11.4859819773102, 11.5232680456161, 11.5572670584292, 11.5303686934256
    ), IntrayearCycles = c(4.15446534315765, 3.62757053512638, 
    2.92387946552647, 2.14946414386239, 1.40455011205262, 0.768856938870769, 
    0.291497141953598, -0.0131078404184544, -0.162984144025091, 
    -0.200882782749248, -0.182877633924882), `Competitor Advertising` = c(10584.87063, 
    224846.3243, 90657.72553, 0, 0, 0, 2396.54212, 0, 0, 0, 40343.49444
    ), `Competitor Display` = c(0.385629, 2.108133, 2.515806, 
    4.918288, 3.81749, 3.035847, 2.463194, 3.242594, 1.850399, 
    1.751096, 1.337943), `Competitor Prices` = c(5.30989, 5.372752, 
    5.3717245, 5.3295525, 5.298393, 5.319466, 5.1958415, 5.2941095, 
    5.296757, 5.294059, 5.273578), ZeroSales = c(1, 1, 1, 1, 
    0, 0, 0, 0, 0, 0, 0)), .Names = c("Week", "Category", "Brand", 
"Display", "Sales", "Price", "Distribution", "Advertising", "lnSales", 
"IntrayearCycles", "Competitor Advertising", "Competitor Display", 
"Competitor Prices", "ZeroSales"), row.names = 1255:1265, class = "data.frame")

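Then I apply a for loop to estimate an error correction model (using the ECM package), which produces a linear model output. This for loop is used to estimate 959 separate regressions.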

library(dplyr)  # select(), group_by(), do() and %>%
library(ecm)    # ecm()

f <- function(.) {
  xeq <- as.data.frame(select(., lnPrice, lnAdvertising, lnDisplay, IntrayearCycles, lnCompetitorPrices, lnCompADV, lnCompDISP, ADVxDISP, ADVxCYC, DISPxCYC, ADVxDISPxCYC))
  xtr <- as.data.frame(select(., lnPrice, lnAdvertising, lnDisplay, IntrayearCycles, lnCompetitorPrices, lnCompADV, lnCompDISP, ADVxDISP,  ADVxCYC, DISPxCYC, ADVxDISPxCYC))
  print(xeq)
  print(xtr)
  summary(ecm(.$lnSales, xeq, xtr, includeIntercept = TRUE))
}


Models <- DatasetThesisSynergyClean %>% 
  group_by(Category, Brand) %>% 
  do(Model = f(.))

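To view the summary of a specific model (here model 2), you can address it like this:

Models$Model[[2]]

From this summary output I want to extract specific values. But first I want to extract the residual sum of squares (RSS) to do an anova. For a single list object I do this as follows: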

anova_output_Unitmodels <- anova(Models$Model[[2]])
RSS_Unit <- anova_output_Unitmodels$`Sum Sq`[nrow(anova_output_Unitmodels)] #saving the RSS

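Now I want to loop over all list objects, from object [[1]] to [[959]]. Each RSS value has to be saved, because at the end I need to sum all of these RSS values.

In addition, once this works, I need to extract all coefficients, t-values and p-values for all variables from all models. For that I also need to address a specific object in the list and put $coefficients after it, but I could not manage that either.

Here is how I implemented @Roman Lustrik's answer.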

extractRSS <- function(x) {
  an <- anova(x)
  RSS_Unit <- an$`Sum Sq`[nrow(an)]
  return(RSS_Unit)
}

sapply(Model, FUN = extractRSS)

I also tried doing this for one specific model, but that gave me an error:

SapplyRSS <- sapply(Models$Model, FUN = extractRSS)

I had another idea and thought about looping over it in a different way. It does not work well, but it is a start:

If you do this

RSS2<- sum(Models$Model[[2]]$residuals^2) 

then I want to replicate it in a for loop:

 for(i in residuals.lm){ 
  AllRSS<- as.matrix(c(1:949))
  AllRSS <- as.data.frame(AllRSS)
  SumRSS <- sum(Models$Model[[i]]$residuals^2)
  SumRSS <- as.data.frame(SumRSS)
  TotalRSS <- cbind(SumRSS, AllRSS)}

TotalRSS <- SumRSS[NULL,]

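It starts with specifying i in the for statement, and I do not know whether that is correct. In the end it leaves me with an empty data frame, or with a data frame that has the same value for every brand.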

r for-loop
2 Answers
1
vote

@MichaelChirico probably had something like this in mind.

extractRSS <- function(x) {
  an <- anova(x)
  RSS_Unit <- an$`Sum Sq`[nrow(an)]
  return(RSS_Unit)
}

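sapply(Models$Model, FUN = extractRSS)

sapply will iterate over each Models$Model[[i]] object and extract its RSS. You can modify this function to return other information as well. The result may be coerced to a simpler object (a vector or matrix); you can prevent that with sapply(..., simplify = FALSE).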
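As a rough illustration of the approach above, here is a minimal sketch on a built-in dataset. The list `models`, the helper `extractCoefs` and the use of `mtcars` are assumptions made for the example; they stand in for Models$Model and assume that each list element is a fitted lm object that anova() accepts.

```r
# Toy stand-in for Models$Model: one lm per cylinder group of mtcars
models <- lapply(split(mtcars, mtcars$cyl),
                 function(d) lm(mpg ~ wt + hp, data = d))

# The RSS is the last entry of the "Sum Sq" column of the anova table
extractRSS <- function(x) {
  an <- anova(x)
  an$`Sum Sq`[nrow(an)]
}

rss <- sapply(models, FUN = extractRSS)  # named numeric vector, one RSS per model
total_rss <- sum(rss)                    # sum of the RSS values over all models

# Returning more than one value per model (hypothetical helper): the
# coefficient table with estimate, standard error, t value and p value
extractCoefs <- function(x) coef(summary(x))

# simplify = FALSE keeps the result as a list with one matrix per model
coef_tables <- sapply(models, FUN = extractCoefs, simplify = FALSE)
coef_tables[["4"]]
```

The list keeps the group names, so a single model's table can be addressed by name or by position with [[ ]].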


0
votes

Another approach is to export all list objects as separate objects into the global environment. You do this with:

names(Models$Model) <- paste0("C", Models$Category, "B", Models$Brand)
list2env(Models$Model, .GlobalEnv)

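Then I wrote a for loop that addresses these objects and repeatedly appends the values from each iteration to an initially empty data frame. It looks like this: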

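# Start from an empty data frame so that rbind() has something to append to
TotalRSS2 <- data.frame(RSS = numeric(0), DF = numeric(0))
for (X in c("0", "1", "3")) {
  ModelX <- get(paste0("C", X, "B2"))  # look up the exported model by its name
  RSS <- sum(ModelX$residuals^2)       # residual sum of squares
  DF <- ModelX$df[2]                   # second element of $df: residual degrees of freedom
  RSSDF <- data.frame(RSS = RSS, DF = DF)
  TotalRSS2 <- rbind(TotalRSS2, RSSDF)
}

TotalRSS2 has to exist before the loop starts. Creating it as an empty data frame first means the loop does not have to be run twice.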
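The same result can probably be reached without copying every model into the global environment, by indexing the list directly inside the loop. The following is only a sketch: `models`, `result` and the `mtcars` data are assumptions for illustration, and the list elements are summary() outputs of lm fits, mirroring what f() returns in the question.

```r
# Toy stand-in for Models$Model, named like paste0("C", Category, "B", Brand)
models <- lapply(split(mtcars, mtcars$cyl),
                 function(d) summary(lm(mpg ~ wt, data = d)))
names(models) <- paste0("C", names(models), "B2")

# Preallocate one row per model instead of growing the data frame with rbind()
result <- data.frame(Model = names(models), RSS = NA_real_, DF = NA_integer_)

for (i in seq_along(models)) {
  m <- models[[i]]                     # address the i-th list element directly
  result$RSS[i] <- sum(m$residuals^2)  # residual sum of squares
  result$DF[i]  <- m$df[2]             # residual degrees of freedom
}

sum(result$RSS)                        # total RSS across all models
```

Looping over seq_along(models) gives i the values 1, 2, 3 and so on, which is what [[i]] expects, and it scales to all 959 models without listing their names by hand.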
