The following code produces different results depending on the order in which I pass the inputs. Why is that? I would expect the least-squares optimization to reach the same result regardless of the order in which the inputs are passed to the error-generating function. What exactly causes this difference?
import numpy as np
from scipy.optimize import least_squares
def dummy_func(Y, T):
    # Define the function to optimize
    def func(X):
        tau = X[0]
        C1 = X[1]
        C2 = X[2]
        C3 = X[3]
        C4 = X[4]
        L0 = X[5]
        S1 = X[6]
        exp_tt1 = np.exp(-T / tau)
        f1 = (1 - exp_tt1) / (T / tau)
        f2 = 2 * f1 - exp_tt1 * (T / tau)
        f3 = 3 * f2 - exp_tt1 * (T / tau) ** 2
        f4 = 4 * f3 - exp_tt1 * (T / tau) ** 3
        f5 = 5 * f4 - exp_tt1 * (T / tau) ** 4
        y_hat = L0 + S1 * f1 + C1 * f2 + C2 * f3 + C3 * f4 + C4 * f5
        errs = Y - y_hat
        return errs
    return func
# starting point and boundaries
initial_params = np.array([ 10, 0, 0, 0, 0, 100, 0])
bounds_lower = np.array([ 3, -500, -1e3, -100, -15, 0, -1e3])
bounds_upper = np.array([ 15, 500, 1e3, 100, 15, 1e5, 1e3])
# inputs
Y = np.array([5, 6, 7])
T = np.array([1, 1.5, 1.6])
# Get min function
min_fn = dummy_func(Y, T)
# Fit
res = least_squares(min_fn,
                    initial_params,
                    bounds=(bounds_lower, bounds_upper),
                    ftol=1e-3,
                    loss="linear",
                    verbose=0)
# swap items 1 and 2
T_swap = T.copy()
Y_swap = Y.copy()
ix1 = 1
ix2 = 2
T_swap[[ix1, ix2]] = T[[ix2, ix1]]
Y_swap[[ix1, ix2]] = Y[[ix2, ix1]]
# Get min function
min_fn_swap = dummy_func(Y_swap, T_swap)
# Fit
res_swap = least_squares(min_fn_swap,
                         initial_params,
                         bounds=(bounds_lower, bounds_upper),
                         ftol=1e-3,
                         loss="linear",
                         verbose=0)
print(f"same results? {all(res.x == res_swap.x)} \n res.x: {res.x} \nres_swap.x: {res_swap.x}")
This produces the following output:
same results? False
res.x: [ 3.00001277 310.88942015 300.69037515 -40.34027084 -9.85586819
43.96214131 -290.54867755]
res_swap.x: [ 10.28332721 86.26342455 -67.14921168 8.76243152 0.51657765
93.50701816 -133.10992102]
This problem has many minima. Because of that, the order of the data alone — purely through the internal workings of the minimization algorithm — makes the parameters X converge to different points. Since the number of parameters in X (7) is larger than the number of data points (3), many minima are to be expected. I suggest you tighten the tolerance back to its default (1e-8) and print the full result object for more insight:
print(res)
message: `xtol` termination condition is satisfied.
success: True
status: 3
fun: [-1.364e-12 -1.478e-12 3.297e-12]
x: [ 3.011e+00 3.578e+02 1.724e+02 -2.350e+01 -6.728e+00
1.273e+02 -4.763e+02]
cost: 7.45755021004839e-24
...
print(res_swap)
message: `xtol` termination condition is satisfied.
success: True
status: 3
fun: [ 6.935e-03 3.849e-02 -4.544e-02]
x: [ 3.000e+00 2.502e+02 1.367e+02 -7.926e+00 -5.857e+00
2.139e+02 -6.154e+02]
cost: 0.001797285633783842
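As a sanity check on these numbers: `least_squares` reports `cost = 0.5 * sum(residuals**2)`, so the cost of the swapped fit can be reproduced directly from the residuals shown in `fun` above.

```python
import numpy as np

# least_squares defines cost = 0.5 * sum(residuals**2); reproducing the
# swapped fit's cost from the residuals printed in res_swap.fun above:
fun = np.array([6.935e-03, 3.849e-02, -4.544e-02])
cost = 0.5 * np.sum(fun**2)
print(cost)  # ≈ 0.0017972, matching the reported cost
```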
The cost for the original data is close to zero, up to numerical rounding. Since zero is the lowest possible value of the cost, that fit reached a global minimum. The cost for the swapped data, however, is larger than for the original data: in the second case the minimization converged to a local minimum.
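Why there are so many exact (zero-cost) minima can be seen directly: for any fixed tau, y_hat is linear in the remaining six coefficients (L0, S1, C1..C4), so fitting 3 observations leaves a linear system of 3 equations in 6 unknowns, which has infinitely many exact solutions. A minimal sketch, with an arbitrarily chosen tau inside the bounds:

```python
import numpy as np

# For any fixed tau the model is linear in (L0, S1, C1..C4): y_hat = A @ coeffs.
# With 3 observations and 6 coefficients, A @ c = Y is underdetermined, so an
# exact (zero-residual) solution exists for many values of tau -- hence many minima.
tau = 5.0  # arbitrary fixed value within the bounds from the question
T = np.array([1, 1.5, 1.6])
Y = np.array([5, 6, 7], dtype=float)

tt = T / tau
exp_tt1 = np.exp(-tt)
f1 = (1 - exp_tt1) / tt
f2 = 2 * f1 - exp_tt1 * tt
f3 = 3 * f2 - exp_tt1 * tt**2
f4 = 4 * f3 - exp_tt1 * tt**3
f5 = 5 * f4 - exp_tt1 * tt**4

A = np.column_stack([np.ones_like(tt), f1, f2, f3, f4, f5])  # 3 x 6 design matrix
coeffs, *_ = np.linalg.lstsq(A, Y, rcond=None)  # minimum-norm exact solution
print(np.abs(A @ coeffs - Y).max())  # residual is ~0: the data is fit exactly
```

Repeating this for a different tau gives a different coefficient vector that also fits the three points exactly, which is precisely the multiplicity of minima the optimizer wanders between.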
Try different initial parameters and you will reach yet other minima. For example, I tried:
initial_params = np.array([ 10, 0, 0, 0, 0, 1, 0])
and obtained:
print(res)
message: `xtol` termination condition is satisfied.
success: True
status: 3
fun: [ 1.846e-03 -1.206e-02 1.021e-02]
x: [ 3.000e+00 2.808e+02 2.788e+02 -2.415e+01 -1.153e+01
5.636e+01 -3.003e+02]
cost: 0.00012653248779190626
So, by changing a few conditions, the minimization converges to a very different X. To obtain a unique minimum, you need more data points than parameters.
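To illustrate that last point, here is a hypothetical well-posed counterexample (model, data, and bounds are my own, not from the question): a 2-parameter model fitted to 10 data points. With more data than parameters and a single minimum, the result no longer depends on the order of the inputs:

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical well-posed problem: 2 parameters, 10 data points.
rng = np.random.default_rng(0)
t = np.linspace(0.5, 5, 10)
y = 3.0 * np.exp(-t / 2.0) + 0.01 * rng.standard_normal(10)  # a=3, tau=2 + noise

def make_residuals(y, t):
    def residuals(x):
        a, tau = x
        return y - a * np.exp(-t / tau)
    return residuals

res = least_squares(make_residuals(y, t), x0=[1.0, 1.0],
                    bounds=([0.1, 0.1], [10.0, 10.0]))

perm = rng.permutation(10)  # shuffle the data order
res_perm = least_squares(make_residuals(y[perm], t[perm]), x0=[1.0, 1.0],
                         bounds=([0.1, 0.1], [10.0, 10.0]))

print(res.x, res_perm.x)
print(np.allclose(res.x, res_perm.x))
```

Both runs converge to the same (unique) minimum to well within the solver's tolerance, regardless of the row order of the data.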