我正在尝试找到曲线的切向量。曲线的方程是一个问题,我有不同的点,基于这些点我正在寻找函数的近似值,它描述曲线并拟合点。
当我绘制数据时,它看起来像这样:
应用多项式回归后(基于本文:https://www.statology.org/curve-fitting-in-r/)我得到以下结果:
fit <- lm(cl2[,3] ~ poly(cl2[,2], 3))
summary(fit)
Call:
lm(formula = cl2[, 3] ~ poly(cl2[, 2], 3))
Residuals:
Min 1Q Median 3Q Max
-0.31834 -0.10187 0.02132 0.09577 0.27393
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -109.89121 0.03789 -2900.217 < 2e-16 ***
poly(cl2[, 2], 3)1 7.33365 0.16516 44.403 < 2e-16 ***
poly(cl2[, 2], 3)2 -4.43572 0.16516 -26.857 4.25e-14 ***
poly(cl2[, 2], 3)3 1.14772 0.16516 6.949 4.66e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1652 on 15 degrees of freedom
Multiple R-squared: 0.9946, Adjusted R-squared: 0.9935
F-statistic: 913.7 on 3 and 15 DF, p-value: < 2.2e-16
当我拟合曲线时,结果看起来不错:
lines(cl2[,2], predict(fit, data.frame(cl2[,2:3])))
根据系数 s,我假设曲线方程为:
1.14x**3-4.43x**2+7.33*x-109
当我计算 y 估计值时,我得到了非常奇怪的数字:
y_实际:
[1] -108.4569 -108.1504 -108.0895 -108.0728 -108.0461 -108.1777 -108.2751 -108.4619 -108.6918 [10] -108.9750 -109.3552 -109.7625 -110.3328 -110.9580 -111.4312 -112.0062 -112.7337 -113.5880 [19]-114.3681
y_预测:
[1] -8935267 -8980331 -9044297 -9115821 -9166614 -9270340 -9355643 -9456574 -9533497 -9602089 [11]-9631113 -9670175 -9715100 -9754453 -9798813 -9851816 -9880888 -9926067 -9940310
这里出了什么问题?
我尝试将 poly 函数的原始变量设置为 TRUE,得到了不同的系数,但问题仍然存在。
数据采用
dput
格式
y_actual <-
c(-108.4569, -108.1504, -108.0895, -108.0728, -108.0461, -108.1777,
-108.2751, -108.4619, -108.6918, -108.975, -109.3552, -109.7625,
-110.3328, -110.958, -111.4312, -112.0062, -112.7337, -113.588,
-114.3681)
y_predicted <-
c(-8935267, -8980331, -9044297, -9115821, -9166614, -9270340,
-9355643, -9456574, -9533497, -9602089, -9631113, -9670175, -9715100,
-9754453, -9798813, -9851816, -9880888, -9926067, -9940310)
这是您的代码示例。因为没有提供 x 值,所以我估算了 x 值。我上面的评论是:“使用
raw=TRUE
表示非正交多项式”。
蓝线是预测函数的预测值。红线使用的是拟合系数。请注意,绿线使用相同的系数,但四舍五入为 5 位有效数字,并产生略有不同的曲线。
这是高阶方程的问题,微小的差异乘以就会变成很大的差异。一个混乱的系统。
y_actual <-
c(-108.4569, -108.1504, -108.0895, -108.0728, -108.0461, -108.1777,
-108.2751, -108.4619, -108.6918, -108.975, -109.3552, -109.7625,
-110.3328, -110.958, -111.4312, -112.0062, -112.7337, -113.588,
-114.3681)
#approximating the x values
x<- rev(seq(-204.5, -197, length.out=19))
fit <- lm(y_actual ~ poly(x, 3, raw=TRUE))
summary(fit)
# Call:
# lm(formula = y_actual ~ poly(x, 3, raw = TRUE))
#
# Residuals:
# Min 1Q Median 3Q Max
# -0.123971 -0.030200 -0.000881 0.033129 0.075555
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -1.924e+04 1.086e+04 -1.771 0.0968 .
# poly(x, 3, raw = TRUE)1 -2.537e+02 1.623e+02 -1.563 0.1390
# poly(x, 3, raw = TRUE)2 -1.099e+00 8.087e-01 -1.359 0.1942
# poly(x, 3, raw = TRUE)3 -1.546e-03 1.343e-03 -1.151 0.2678
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.05382 on 15 degrees of freedom
# Multiple R-squared: 0.9994, Adjusted R-squared: 0.9993
# F-statistic: 8647 on 3 and 15 DF, p-value: < 2.2e-16
y_predicted <- predict(fit, data.frame(x))
#plotting the prediction
plot(x, y_actual)
lines(x, y_predicted, col="blue", lwd=3)
#plotting the prediction with the coefficients
y_pred_2 <- fit$coefficients[4]*x**3 + fit$coefficients[3]*x**2 + fit$coefficients[2]*x + fit$coefficients[1]
lines(x, y_pred_2+0.001, col="red")
#round the coefficents
y_pred_3 <- -1.5455E-3*x**3 - 1.09927*x**2 - 253.68*x - 19238
lines(x, y_pred_3, col="green")