I am practicing simple linear regression with this dataset, and these are my parameters:
sat = np.array(data['SAT'])
gpa = np.array(data['GPA'])
theta_0 = 0.01
theta_1 = 0.01
alpha = 0.003
cost = 0
m = len(gpa)
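I tried to optimize the cost function computation by turning it into matrices and performing element-wise operations. This is the resulting formula I came up with:
Cost function optimization:
Cost function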
def calculateCost(matrix_x, matrix_y, m):
    global theta_0, theta_1
    cost = (1 / (2 * m)) * ((theta_0 + (theta_1 * matrix_x) - matrix_y) ** 2).sum()
    return cost
I also tried to do the same for gradient descent.
Gradient descent
def gradDescent(alpha, matrix_x, matrix_y):
    global theta_0, theta_1, m, cost
    cost = calculateCost(sat, gpa, m)
    while cost > 1:
        temp_0 = theta_0 - alpha * (1 / m) * (theta_0 + theta_1 * matrix_x - matrix_y).sum()
        temp_1 = theta_1 - alpha * (1 / m) * (matrix_x.transpose() * (theta_0 + theta_1 * matrix_x - matrix_y)).sum()
        theta_0 = temp_0
        theta_1 = temp_1
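I am not sure whether the two implementations are correct. The implementation returns a cost of 114.89379821428574, and this is roughly what the "descent" looks like when I plot the cost: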
[Image: gradient descent plot]
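Please let me know whether I have implemented the cost function and gradient descent correctly, and correct me if not. An explanation would be appreciated, as I am still a beginner in multivariable calculus. Thank you.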
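There are several problems with this code.
First, the two main issues behind the error:
1) The line
temp_1 = theta_1 - alpha * (1 / m) * (matrix_x.transpose() * (theta_0 + theta_1 * matrix_x - matrix_y)).sum()
and in particular the multiplication matrix_x.transpose() * (theta_0 + ...). The * operator multiplies element-wise (with broadcasting), so for 20x1 column vectors the result has size 20x20, whereas the gradient you expect has size 1x1 (you are updating a single real variable, theta_1).
2) The while cost>1: condition in the gradient computation. You never update cost inside the loop, so the condition never changes.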
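The shape issue in point 1 can be checked directly. The following is a minimal sketch, assuming 20 samples stored as column vectors; the names x_col and residual_col are hypothetical stand-ins for matrix_x and the residual theta_0 + theta_1 * matrix_x - matrix_y. It also shows that with 1-D arrays transpose() has no effect, so the shapes you see depend on how the data was loaded.

```python
import numpy as np

m = 20  # assumed number of samples, matching the 20x20 shape mentioned above
rng = np.random.default_rng(0)

# Hypothetical column vectors standing in for matrix_x and the residual
x_col = rng.random((m, 1))
residual_col = rng.random((m, 1))

# Element-wise product of a (1, m) row and an (m, 1) column broadcasts to (m, m)
elementwise = x_col.transpose() * residual_col
# Matrix product of the same operands gives a single (1, 1) entry
matrix_product = x_col.transpose().dot(residual_col)

# With 1-D arrays, transpose() is a no-op and * stays element-wise with shape (m,)
x_flat = x_col.ravel()
residual_flat = residual_col.ravel()
flat_product = x_flat.transpose() * residual_flat

print(elementwise.shape)     # (20, 20)
print(matrix_product.shape)  # (1, 1)
print(flat_product.shape)    # (20,)
```

Summing the broadcast 20x20 product gives sum(x) * sum(residual) rather than the sum of x_i * residual_i that the gradient needs, which is why the dot product (or a 1-D element-wise product) is the safer choice.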
Here is a working version of your code:
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data: gpa is an exact linear function of sat
sat = np.random.rand(40, 1)
rand_a = np.random.randint(500)
rand_b = np.random.randint(400)
gpa = rand_a * sat + rand_b

theta_0 = 0.01
theta_1 = 0.01
alpha = 0.1
cost = 0
m = len(gpa)

def calculateCost(matrix_x, matrix_y, m):
    global theta_0, theta_1
    # Note the parentheses: 1 / (2 * m), not (1 / 2) * m
    cost = (1 / (2 * m)) * ((theta_0 + (theta_1 * matrix_x) - matrix_y) ** 2).sum()
    return cost

def gradDescent(alpha, matrix_x, matrix_y, num_iter=10000, eps=0.5):
    global theta_0, theta_1, m, cost
    cost = calculateCost(matrix_x, matrix_y, m)
    cost_hist = [cost]
    for i in range(num_iter):
        # Compute the residual once so both parameters are updated simultaneously
        error = theta_0 + theta_1 * matrix_x - matrix_y
        theta_0 -= alpha * (1 / m) * error.sum()
        theta_1 -= alpha * (1 / m) * matrix_x.transpose().dot(error).sum()
        # Recompute the cost every iteration so the stopping test sees the new value
        cost = calculateCost(matrix_x, matrix_y, m)
        cost_hist.append(cost)
        if cost < eps:
            break
    # Return the history even if the tolerance was not reached
    return cost_hist

if __name__ == "__main__":
    print("init_cost==", calculateCost(sat, gpa, m))
    cost_hist = gradDescent(alpha, sat, gpa)
    print("final_cost,num_iters", cost, len(cost_hist))
    print(rand_b, theta_0, rand_a, theta_1)
    plt.plot(cost_hist, linewidth=5, color="r")
    plt.show()
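One way to sanity-check a gradient descent implementation is to compare its result with a closed-form least-squares fit. The sketch below is not part of the code above: fit_gd is a hypothetical helper, the data (y = 3 * x + 2, no noise) and the iteration count are assumptions chosen for illustration, and np.polyfit is used as the reference fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed values): y = 3 * x + 2, no noise
x = rng.random(40)
y = 3.0 * x + 2.0

def fit_gd(x, y, alpha=0.1, num_iter=20000):
    """Batch gradient descent for y ≈ theta_0 + theta_1 * x (hypothetical helper)."""
    m = len(x)
    theta_0, theta_1 = 0.01, 0.01
    for _ in range(num_iter):
        residual = theta_0 + theta_1 * x - y
        # Both gradients use the same residual, so the update is simultaneous
        grad_0 = residual.sum() / m
        grad_1 = (x * residual).sum() / m
        theta_0 -= alpha * grad_0
        theta_1 -= alpha * grad_1
    return theta_0, theta_1

theta_0, theta_1 = fit_gd(x, y)
# np.polyfit returns coefficients from the highest degree down: slope, then intercept
slope, intercept = np.polyfit(x, y, 1)

print(theta_0, intercept)
print(theta_1, slope)
```

If the two pairs of numbers agree closely, the update rule is likely implemented correctly; a large gap usually points to a shape or scaling mistake in the gradient.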