Hide some coefficients in a regression summary while still returning the call, R-squared and other summary output

Question    Votes: 1    Answers: 1

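My question is similar to this one, but I am interested in returning all of the other output, not just the coefficients. Here is some example code to make my question clearer.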

data=as.data.frame(matrix(rnorm(50*50),50,50))
summary(lm(data[,1]~.-data[,1],data=data))

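I want to output only the first 5 coefficients. I know I can do this with summary(lm(data[,1]~.-data[,1],data=data))$coeff[1:5,], but that would get rid of all the other output that I want. I also know I can get each piece of output separately; I am just wondering whether there is a simple way to write a one-liner that removes the variables I don't want to report.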

r output regression summary dummy-variable
1 Answer

4 votes

You can choose the coefficients you want by slightly modifying the print.summary.lm function, which is the internal function R uses to print the results of summary for lm objects.

First, get the code for the function as follows:

getAnywhere(print.summary.lm)

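Then we need to figure out where the coefficient table is extracted and subset it to the rows we want. We'll add a new my.rows argument to the function and then subset to those rows at the point where the coefficient table is extracted. The code for the modified function is at the end of this answer.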
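To make the subsetting step concrete, here is a minimal sketch of what happens to the coefficient table on its own, outside any print function. It assumes the mtcars model used later in this answer; the object names s, tab and tab_sub are only for illustration.

```r
# Fit a small model and take its summary
m1 <- lm(mpg ~ wt + hp + cyl + vs + am, data = mtcars)
s  <- summary(m1)

# The coefficient table is a plain numeric matrix, one row per term
tab <- s$coefficients
dim(tab)

# Keep rows 2 to 4; drop = FALSE keeps a matrix even if only one row is selected
tab_sub <- tab[2:4, , drop = FALSE]
rownames(tab_sub)

# printCoefmat() is the helper that print.summary.lm uses to format the table
printCoefmat(tab_sub)
```

The rest of the summary object (call, residuals, sigma, R-squared, F-statistic) is untouched by this step, which is why the other output can still be printed.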

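Now compare the standard summary with our new summary. First, I'll create a model using real data. (The model you provided isn't properly specified. It looks like what you wanted is lm(V1 ~ ., data=data), but even then there are no residual degrees of freedom, so I figured I'd just go with a real data set.)

m1 = lm(mpg ~ wt + hp + cyl + vs + am, data=mtcars)

The standard summary:

summary(m1)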

Call:
  lm(formula = mpg ~ wt + hp + cyl + vs + am, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.6729 -1.6583 -0.4297  1.3307  5.4688 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 33.24160    5.48527   6.060 2.11e-06 ***
wt          -2.54332    0.93506  -2.720   0.0115 *  
hp          -0.02589    0.01387  -1.866   0.0733 .  
cyl         -0.40179    0.79364  -0.506   0.6169    
vs           1.17067    1.81283   0.646   0.5241    
am           1.97575    1.64825   1.199   0.2415    
---
  Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.537 on 26 degrees of freedom
Multiple R-squared:  0.8514,  Adjusted R-squared:  0.8228 
F-statistic:  29.8 on 5 and 26 DF,  p-value: 5.571e-10
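
The new summary with only our selected coefficients:

Note that we need to call summary on the model first, because my.summary.lm expects a summary object, not the model object itself.

my.summary.lm(summary(m1), my.rows=2:4)

You might prefer to select coefficients by name rather than by index. The following call selects the same three rows and prints the same output:

my.summary.lm(summary(m1), my.rows=grep("wt|hp|cyl", names(coef(m1))))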

Call:
  lm(formula = mpg ~ wt + hp + cyl + vs + am, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.6729 -1.6583 -0.4297  1.3307  5.4688 

Coefficients:
    Estimate Std. Error t value Pr(>|t|)  
wt  -2.54332    0.93506  -2.720   0.0115 *
hp  -0.02589    0.01387  -1.866   0.0733 .
cyl -0.40179    0.79364  -0.506   0.6169  
---
  Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.537 on 26 degrees of freedom
Multiple R-squared:  0.8514,  Adjusted R-squared:  0.8228 
F-statistic:  29.8 on 5 and 26 DF,  p-value: 5.571e-10
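
One caveat when selecting by name with grep: a pattern such as "wt|hp|cyl" matches any coefficient name that contains one of those strings. As a small sketch (not part of the original approach), exact names can be looked up with match instead; the vector name wanted is hypothetical.

```r
m1 <- lm(mpg ~ wt + hp + cyl + vs + am, data = mtcars)

# Exact-name lookup: returns the position of each wanted name
wanted <- c("wt", "hp", "cyl")
rows_by_name <- match(wanted, names(coef(m1)))

# Pattern lookup, as used above
rows_by_grep <- grep("wt|hp|cyl", names(coef(m1)))

rows_by_name
identical(rows_by_name, rows_by_grep)
```

For this model both approaches give the same rows, so either vector could be passed as my.rows.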

Here's the function. I made only two changes to the original function, both of which are marked with inline comments. The first change is the additional my.rows argument. The second is the line that begins coefs <-:

my.summary.lm = function (x, digits = max(3L, getOption("digits") - 3L),
                          symbolic.cor = x$symbolic.cor,
                          signif.stars = getOption("show.signif.stars"),
                          my.rows, ...)   # NOTE NEW my.rows ARGUMENT
{
  cat("\nCall:\n", paste(deparse(x$call), sep = "\n", collapse = "\n"),
      "\n\n", sep = "")
  resid <- x$residuals
  df <- x$df
  rdf <- df[2L]
  cat(if (!is.null(x$weights) && diff(range(x$weights)))
    "Weighted ", "Residuals:\n", sep = "")
  if (rdf > 5L) {
    nam <- c("Min", "1Q", "Median", "3Q", "Max")
    rq <- if (length(dim(resid)) == 2L) {
      structure(apply(t(resid), 1L, quantile),
                dimnames = list(nam, dimnames(resid)[[2L]]))
    } else {
      zz <- zapsmall(quantile(resid), digits + 1L)
      structure(zz, names = nam)
    }
    print(rq, digits = digits, ...)
  } else if (rdf > 0L) {
    print(resid, digits = digits, ...)
  } else {
    cat("ALL", df[1L], "residuals are 0: no residual degrees of freedom!")
    cat("\n")
  }
  if (length(x$aliased) == 0L) {
    cat("\nNo Coefficients\n")
  } else {
    if (nsingular <- df[3L] - df[1L]) {
      cat("\nCoefficients: (", nsingular, " not defined because of singularities)\n",
          sep = "")
    } else {
      cat("\nCoefficients:\n")
    }
    # SUBSET my.rows (drop = FALSE keeps a matrix when a single row is selected)
    coefs <- x$coefficients[my.rows, , drop = FALSE]
    if (!is.null(aliased <- x$aliased) && any(aliased)) {
      cn <- names(aliased)
      coefs <- matrix(NA, length(aliased), 4, dimnames = list(cn, colnames(coefs)))
      coefs[!aliased, ] <- x$coefficients
    }
    printCoefmat(coefs, digits = digits, signif.stars = signif.stars,
                 na.print = "NA", ...)
  }
  cat("\nResidual standard error:", format(signif(x$sigma, digits)),
      "on", rdf, "degrees of freedom")
  cat("\n")
  if (nzchar(mess <- naprint(x$na.action)))
    cat(" (", mess, ")\n", sep = "")
  if (!is.null(x$fstatistic)) {
    cat("Multiple R-squared: ", formatC(x$r.squared, digits = digits))
    cat(",\tAdjusted R-squared: ", formatC(x$adj.r.squared, digits = digits),
        "\nF-statistic:", formatC(x$fstatistic[1L], digits = digits), "on",
        x$fstatistic[2L], "and", x$fstatistic[3L], "DF, p-value:",
        format.pval(pf(x$fstatistic[1L], x$fstatistic[2L], x$fstatistic[3L],
                       lower.tail = FALSE), digits = digits))
    cat("\n")
  }
  correl <- x$correlation
  if (!is.null(correl)) {
    p <- NCOL(correl)
    if (p > 1L) {
      cat("\nCorrelation of Coefficients:\n")
      if (is.logical(symbolic.cor) && symbolic.cor) {
        print(symnum(correl, abbr.colnames = NULL))
      } else {
        correl <- format(round(correl, 2), nsmall = 2, digits = digits)
        correl[!lower.tri(correl)] <- ""
        print(correl[-1, -p, drop = FALSE], quote = FALSE)
      }
    }
  }
  cat("\n")
  invisible(x)
}
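
If copying the whole print method feels heavy, a lighter sketch is to subset the coefficient table inside the summary object itself and let R's own print method do the rest. This is an alternative, not what the function above does. It assumes the model has no aliased (NA) coefficients, because the stock print method rebuilds the table from the full set of terms in that case. The helper name trim_summary is made up for illustration.

```r
# Hypothetical helper: return a summary.lm object whose coefficient
# table has been cut down to the requested rows
trim_summary <- function(model, rows) {
  s <- summary(model)
  s$coefficients <- s$coefficients[rows, , drop = FALSE]
  s
}

m1 <- lm(mpg ~ wt + hp + cyl + vs + am, data = mtcars)
s_trim <- trim_summary(m1, 2:4)

# Printed by the ordinary print.summary.lm method: call, residuals,
# the three selected rows, residual standard error, R-squared, F-statistic
print(s_trim)
```

The trade-off is that s_trim no longer holds the full table, so it is best used for printing only and not passed on to code that expects every coefficient.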

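UPDATE: As I mentioned in a comment, you can roll your own summary function and set it up to work with whatever types of model summary objects you use regularly. In this case, we'll include summary.lm and summary.plm objects, which are the types of objects created when you run summary on lm and plm model objects, respectively.

First, we need lm and plm model objects to work with:

# lm object
m1 = lm(mpg ~ wt + hp + cyl + vs + am, data=mtcars)

# plm object
library(plm)

# Example from plm help
data("Produc", package = "plm")
zz <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
          data = Produc, index = c("state","year"))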
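
Before writing a custom summary function, it helps to see which pieces a summary object actually carries. The sketch below inspects only the lm side, so that it runs without the plm package; the same calls can be applied to a summary.plm object.

```r
m1 <- lm(mpg ~ wt + hp + cyl + vs + am, data = mtcars)
s  <- summary(m1)

# Class used for method dispatch, and the names of the stored components
class(s)
names(s)

# Compact overview of each component, one level deep
str(s, max.level = 1)

# Individual pieces can be pulled out with $
s$sigma       # residual standard error
s$df          # degrees of freedom vector
s$r.squared   # multiple R-squared
```

Anything that shows up in names(s) can be printed by a custom function in whatever layout you like.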

Now we need a function to output the summary data we want. To create the code below, I just looked at what was in the list objects created by summary(m1) and summary(zz) (run str(summary(m1)) and str(summary(zz)) to see these) so that I would know where to get the various elements of the summary object (e.g., call and residuals). In some cases, I also directly copied sections of code from the print.summary.lm and print.summary.plm functions.

The function below doesn't print everything that's included in the native summary functions, but it should be enough to show you how to add whichever elements you want to your output.

# Summary function that allows selection of which coefficients to include
# in the coefficient table
# Works with summary.lm and summary.plm objects
my.summary = function(x, rows, digits=3) {

  # Print a few summary elements that are common to both lm and plm model summary objects
  cat("Call\n")
  print(x$call)
  cat("\nResiduals\n")
  print(summary(x$residuals))
  cat("\n")
  print(coef(x)[rows,])

  # Print elements unique to lm model summary objects
  if("summary.lm" %in% class(x)) {
    cat("\nResidual standard error:", round(x$sigma,3), "on", x$df[2], "degrees of freedom")
    cat(paste(c("\nF-statistic:", " on"," and"), round(x$fstatistic,2), collapse=""),
        "DF, p-value:",
        format.pval(pf(x$fstatistic[1L], x$fstatistic[2L], x$fstatistic[3L],
                       lower.tail = FALSE), digits=digits))

  # Print elements unique to plm model summary objects
  } else if ("summary.plm" %in% class(x)) {
    cat(paste("\nResidual Sum of Squares: ", signif(deviance(x), digits), "\n", sep = ""))
    fstat <- x$fstatistic
    if (names(fstat$statistic) == "F") {
      cat(paste("F-statistic: ", signif(fstat$statistic), " on ", fstat$parameter["df1"],
                " and ", fstat$parameter["df2"], " DF, p-value: ",
                format.pval(fstat$p.value, digits = digits), "\n", sep = ""))
    } else {
      cat(paste("Chisq: ", signif(fstat$statistic), " on ", fstat$parameter,
                " DF, p-value: ", format.pval(fstat$p.value, digits = digits), "\n", sep = ""))
    }
  }
}

Now run the function on an lm model and a plm model:

my.summary(summary(m1), 2:4)

Call
lm(formula = mpg ~ wt + hp + cyl + vs + am, data = mtcars)

Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-3.6730 -1.6580 -0.4297  0.0000  1.3310  5.4690 

       Estimate Std. Error    t value   Pr(>|t|)
wt  -2.54331718 0.93506164 -2.7199460 0.01148231
hp  -0.02588661 0.01387176 -1.8661377 0.07334148
cyl -0.40178727 0.79364098 -0.5062582 0.61694148

Residual standard error: 2.537 on 26 degrees of freedom
F-statistic: 29.8 on 5 and 26 DF, p-value: 5.57e-10
my.summary(summary(zz), 2:3)
Call
plm(formula = log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp, 
    data = Produc, index = c("state", "year"))

Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
-0.120500 -0.023740 -0.002041  0.000000  0.018140  0.174700 

          Estimate Std. Error  t-value      Pr(>|t|)
log(pc)  0.2920069 0.02511967 11.62463  7.075069e-29
log(emp) 0.7681595 0.03009174 25.52725 2.021455e-104

Residual Sum of Squares: 1.11
F-statistic: 3064.81 on 4 and 764 DF, p-value: <2e-16

I guess if you really want to go all the way, you can take advantage of object orientation and write your own generic function with methods for each type of model you want to include.
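
A minimal sketch of that object-oriented route, using R's S3 system. The generic name my_summary and its methods are hypothetical, and only a method for summary.lm objects is shown so the example runs in base R.

```r
# Generic: dispatches on the class of the summary object passed in
my_summary <- function(x, rows, ...) UseMethod("my_summary")

# Fallback for classes that have no dedicated method
my_summary.default <- function(x, rows, ...) {
  stop("no my_summary method for class: ", paste(class(x), collapse = "/"))
}

# Method for objects of class "summary.lm"
my_summary.summary.lm <- function(x, rows, digits = 3, ...) {
  cat("Call\n")
  print(x$call)
  tab <- x$coefficients[rows, , drop = FALSE]
  cat("\nCoefficients\n")
  printCoefmat(tab, digits = digits)
  cat("\nResidual standard error:", signif(x$sigma, digits),
      "on", x$df[2], "degrees of freedom\n")
  cat("Multiple R-squared:", signif(x$r.squared, digits), "\n")
  invisible(tab)  # return the selected rows invisibly for further use
}

m1  <- lm(mpg ~ wt + hp + cyl + vs + am, data = mtcars)
res <- my_summary(summary(m1), rows = 2:4)
```

A method for plm summaries would follow the same pattern under the name my_summary.summary.plm, with the plm-specific elements printed in place of the lm ones.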
