Python - Solving Ax = b with gradient descent

Question (votes: 0, answers: 1)

I want to solve Ax = b for x, given an n x n matrix A and an n x 1 vector b, where A is pentadiagonal, and I want to try different values of n. (The original post linked two images showing the setup: A has 6 on the main diagonal, -4 on the first sub- and super-diagonals, and 1 on the second sub- and super-diagonals; b is 3 in its first and last entries, -1 in its second and second-to-last entries, and 0 elsewhere.)

I want to use gradient descent to solve this linear system. As I understand it, solving Ax = b this way amounts to minimizing the quadratic function

f(x) = 0.5*x^T*A*x - b^T*x
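
For intuition, here is a minimal sketch (my own illustration, not from the original post, with a made-up 2x2 example) checking that the gradient of this f is Ax - b when A is symmetric, so the minimizer of f is exactly the solution of Ax = b:

import numpy as np

# small symmetric positive definite example (illustrative values only)
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def f(x):
    return 0.5 * x @ A @ x - b @ x   # the quadratic form above

def grad_f(x):
    return A @ x - b                 # its gradient for symmetric A

x_star = np.linalg.solve(A, b)       # exact solution of Ax = b
print(grad_f(x_star))                # ~[0, 0]: the gradient vanishes at the solution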

I saw a Wikipedia example that minimizes

f(x) = x^4 - 3*x^3 + 2

and looks like this:

next_x = 6  # We start the search at x=6
gamma = 0.01  # Step size multiplier
precision = 0.00001  # Desired precision of result
max_iters = 10000  # Maximum number of iterations

# Derivative function
def df(x):
    return 4 * x ** 3 - 9 * x ** 2


for _i in range(max_iters):
    current_x = next_x
    next_x = current_x - gamma * df(current_x)

    step = next_x - current_x
    if abs(step) <= precision:
        break

print("Minimum at ", next_x)

# The output for the above will be something like
# "Minimum at 2.2499646074278457"

So I thought I could replace the return with 0.5*(A + A.T.conj())*x - b, which is the derivative of f(x) = 0.5*x^T*A*x - b^T*x (the function whose minimizer solves Ax = b). I tried this, but I do not get the correct result for x. My full code is here:

import time
import numpy as np
from math import sqrt
from scipy.linalg import solve_triangular

import math

start_time = time.time()




n=100

################## AAAAA matrix #############################################
A = np.zeros([n, n], dtype=float)  # initialize A to zeros

# ------------------first row
A[0][0] = 6
A[0][1] = -4
A[0][2] = 1
# ------------------second row
A[1][0] = -4
A[1][1] = 6
A[1][2] = -4
A[1][3] = 1
# --------------two last rows-----
# n-2 row
A[- 2][- 1] = -4
A[- 2][- 2] = 6
A[- 2][- 3] = -4
A[- 2][- 4] = 1
# n-1 row
A[- 1][- 1] = 6
A[- 1][- 2] = -4
A[- 1][- 3] = 1

# --------------------------- from second to n-2 row --------------------------#
j = 0
for i in range(2, n - 2):
    if j == (n - 4):
        break
    A[i][j] = 1
    j = j + 1

j = 1
for i in range(2, n - 2):
    if j == (n - 3):
        break
    A[i][j] = -4
    j = j + 1

j = 2
for i in range(2, n - 2):
    if j == (n - 2):
        break
    A[i][j] = 6
    j = j + 1

j = 3
for i in range(2, n - 2):
    if j == (n - 1):
        break
    A[i][j] = -4
    j = j + 1

j = 4
for i in range(2, n - 2):
    if j == (n):
        break
    A[i][j] = 1
    j = j + 1
# -----------------------------end coding of 2nd to n-2 r-------------#
print("\nMatrix A is : \n", A)

####### b matrix ######################################
b = np.zeros(n,float).reshape((n,1))
b[0] = 3
b[1] = -1
#b[len(b) - 1] = 3
#b[len(b) - 2] = -1
b[[0,-1]]=3; b[[1,-2]]=-1

############ init x #####################
x = np.zeros(n,float).reshape((n,1))
#x = [0] * n
#x = np.zeros([n, 1], dtype=float)
print("\n x is ",x)

print("\nMatrix b is \n", b)
#####################################



# Derivative function
def df(x):
    a = 0.5 * (A + np.transpose(A))
    res = np.dot(a, x) - b
    return res

def steep(A,b,x):
    next_x = 6  # We start the search at x=6
    gamma = 0.01  # Step size multiplier
    precision = 0.00001  # Desired precision of result
    max_iters = 10000  # Maximum number of iterations


    for _i in range(max_iters):
        current_x = next_x
        next_x = current_x - gamma * df(current_x)

        step = next_x - current_x
        ass=abs(step)
        if ass.any() <= precision:
            break

    print("Minimum at ", next_x)
    return next_x

myx=steep(A,b,x)

print("\n myx is ",myx)




print("--- %s seconds ---" % (time.time() - start_time))
python numpy linear-regression linear-algebra gradient-descent
1 Answer (1 vote)

First of all, one way to simplify the construction of the coefficient matrix is to use the np.diag function, specifying each diagonal with the offset argument k, instead of several for loops. For example, with n = 10:

import numpy as np

n = 10
A = np.diag(np.ones(n-2), k=-2) + np.diag(-4*np.ones(n-1), k=-1) + \
    np.diag(6*np.ones(n), k=0) + \
    np.diag(-4*np.ones(n-1), k=1) + np.diag(np.ones(n-2), k=2)

b = np.zeros(n)
b[0] = 3 ; b[1] = -1 ; b[n-1] = 3 ; b[n-2] = -1
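
For larger n, an alternative the original answer does not cover would be to store only the five bands with scipy.sparse.diags; a minimal sketch, assuming the dense A and n from the block above:

from scipy.sparse import diags

# same pentadiagonal structure, stored as a sparse banded matrix
A_sparse = diags(
    [np.ones(n-2), -4*np.ones(n-1), 6*np.ones(n), -4*np.ones(n-1), np.ones(n-2)],
    offsets=[-2, -1, 0, 1, 2],
    format="csr",
)
print(np.allclose(A_sparse.toarray(), A))   # True: matches the dense construction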

Now, gradient descent says that the vector x_sol solving the system Ax = b, with A symmetric positive definite, is the minimizer of the quadratic form

def f(A,b,x):
    return 0.5*np.dot(np.dot(x,A),x) - np.dot(x,b)

so make sure you perform proper matrix products (using np.dot or @), not Python element-wise multiplication (using *), when evaluating the gradient df of this function f. Also, note that since A is symmetric, df simplifies and returns just np.dot(A,x) - b instead of 0.5*np.dot(A.T,x) + 0.5*np.dot(A,x) - b. You also need to specify a proper stopping criterion, for instance by testing the Euclidean norm of the step against your tolerance tol.

Let's run this:

# parameters
gamma = 0.01          # step size multiplier
tol = 1e-9            # stopping criterion
max_iters = 1e6       # maximum number of iterations

# your initial vector x guess
next_x = 6*np.ones(n)  # here, you chose 6 

# derivative function
def df(A,b,x):
    return np.dot(A,x)-b

i=1
cvg = False           # convergence flag
while i <= max_iters:
    curr_x = next_x
    next_x = curr_x - gamma * df(A,b,curr_x)

    step = next_x - curr_x
    if np.linalg.norm(step,2) <= tol:
        cvg = True
        break
    i += 1

if cvg :
    print('Minimum reached in ' + str(i) + ' iterations.')
    print('x=',next_x)
else :
    print('No convergence for max_iters.')
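
One caveat worth adding (not part of the original answer): gradient descent on this quadratic only converges if the step size is small enough, roughly gamma < 2/lambda_max, where lambda_max is the largest eigenvalue of A. A quick sketch of that check, assuming the A defined above:

lam_max = np.linalg.eigvalsh(A).max()      # largest eigenvalue of the symmetric A
print(lam_max, 2.0/lam_max)                # gamma = 0.01 has to stay below 2/lam_max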

Running the loop above gives

>>> Minimum reached in 61931 iterations.
>>> x= [[1.0000003 ]
 [1.00000076]
 [1.00000125]
 [1.00000165]
 [1.00000187]
 [1.00000187]
 [1.00000165]
 [1.00000125]
 [1.00000076]
 [1.0000003 ]]

Let's check it against np.linalg.solve:

np.allclose(np.linalg.solve(A,b), next_x)
>>> True

Also, if you don't have to use gradient descent, just use np.linalg.solve directly. Hope this helps.
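
Since A here is symmetric positive definite, another option beyond the original answer is the conjugate gradient method, which usually needs far fewer iterations than plain gradient descent; a minimal sketch with scipy:

from scipy.sparse.linalg import cg

# conjugate gradient for the symmetric positive definite system Ax = b
x_cg, info = cg(A, b)                       # info == 0 means it converged
print(info, np.allclose(A @ x_cg, b))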
