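I am trying to write logistic regression from scratch and am getting the error below. After cleaning and tokenizing the data, I used sklearn's TfidfVectorizer to build a sparse TF-IDF matrix from the tweet tokens. Can someone help?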
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-36-98e5051d04b6> in <module>()
      3                          fprime=gradient,args=(x, y.values.flatten()))
      4     return opt_weights[0]
----> 5 parameters = fit(X, y, theta)

3 frames
/usr/local/lib/python3.6/dist-packages/scipy/optimize/tnc.py in func_and_grad(x)
    369     else:
    370         def func_and_grad(x):
--> 371             f = fun(x, *args)
    372             g = jac(x, *args)
    373             return f, g

TypeError: cost_function() missing 1 required positional argument: 'y'
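The frame from `tnc.py` shows the objective being called as `fun(x, *args)`: the current parameter vector comes first, followed by the entries of `args`. The sketch below mimics that calling convention with a hypothetical stand-in, `call_like_optimizer` (not part of scipy), and two dummy objectives. It suggests that an extra leading `self` parameter on a plain function is enough to produce this message, because three values arrive for four declared parameters.

```python
def cost_with_self(self, theta, x, y):
    # Declared like a method, but used as a plain function.
    return 0.0


def cost_plain(theta, x, y):
    # Same objective without the extra leading parameter.
    return 0.0


def call_like_optimizer(fun, x0, args):
    # Hypothetical stand-in mirroring `f = fun(x, *args)` from the traceback.
    return fun(x0, *args)


args = ("features", "labels")  # placeholders for (x, y)

try:
    call_like_optimizer(cost_with_self, "theta0", args)
    error_message = None
except TypeError as exc:
    # Three values were passed but four parameters are declared,
    # so the last one, 'y', is reported as missing.
    error_message = str(exc)

plain_result = call_like_optimizer(cost_plain, "theta0", args)
print(error_message)
```

With `cost_with_self`, the value meant for `theta` lands in `self`, `x` lands in `theta`, `y` lands in `x`, and nothing is left for `y`.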
Code:
import numpy as np
from scipy.optimize import fmin_tnc

X = tfidf_train
y = train['Sentiment']
theta = np.zeros((X.shape[1], 1))

def sigmoid(x):
    # Activation function used to map any real value between 0 and 1
    return 1 / (1 + np.exp(-x))

def net_input(theta, x):
    # Computes the weighted sum of inputs
    return np.dot(x, theta)

def probability(theta, x):
    # Returns the probability after passing through sigmoid
    return sigmoid(net_input(theta, x))

def cost_function(self, theta, x, y):
    # Computes the cost function for all the training samples
    m = x.shape[0]
    total_cost = -(1 / m) * np.sum(
        y * np.log(probability(theta, x)) + (1 - y) * np.log(
            1 - probability(theta, x)))
    return total_cost

def gradient(self, theta, x, y):
    # Computes the gradient of the cost function at the point theta
    m = x.shape[0]
    return (1 / m) * np.dot(x.T, sigmoid(net_input(theta, x)) - y)

def fit(x, y, theta):
    opt_weights = fmin_tnc(func=cost_function, x0=theta,
                           fprime=gradient, args=(x, y.values.flatten()))
    return opt_weights[0]

parameters = fit(X, y, theta)
tfidf_train.get_shape
X is <bound method spmatrix.get_shape of <89988x49526 sparse matrix of type '<class 'numpy.float64'>' with 987177 stored elements in Compressed Sparse Row format>>
y has shape (89988,)
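One possible direction, sketched under several assumptions rather than offered as a verified fix for the full dataset: drop the `self` parameter from the module-level functions, keep `theta` as a 1-D vector, and use the `@` operator so the products work with a scipy CSR matrix as well as a dense array. The toy data, the clipping constant `1e-12`, and `disp=0` are illustrative choices, not taken from the original code.

```python
import numpy as np
from scipy.optimize import fmin_tnc
from scipy.sparse import csr_matrix


def sigmoid(z):
    # Maps any real value into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))


def probability(theta, x):
    # `x @ theta` returns a 1-D array for dense and sparse `x` alike.
    return sigmoid(x @ theta)


def cost_function(theta, x, y):
    # Mean negative log-likelihood; probabilities are clipped
    # (an added safeguard) to avoid log(0).
    m = x.shape[0]
    p = np.clip(probability(theta, x), 1e-12, 1 - 1e-12)
    return -(1.0 / m) * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))


def gradient(theta, x, y):
    # Gradient of the cost with respect to theta, as a 1-D array.
    m = x.shape[0]
    return (1.0 / m) * (x.T @ (probability(theta, x) - y))


def fit(x, y, theta):
    # fmin_tnc calls func(theta, *args) and fprime(theta, *args).
    result = fmin_tnc(func=cost_function, x0=theta,
                      fprime=gradient, args=(x, y), disp=0)
    return result[0]


# Small synthetic stand-in for the TF-IDF matrix and the labels.
rng = np.random.default_rng(0)
X_dense = rng.random((200, 5))
X_dense[X_dense < 0.5] = 0.0            # make it mostly zeros
true_theta = np.array([2.0, -1.0, 0.5, 0.0, -2.0])
y_toy = (rng.random(200) < sigmoid(X_dense @ true_theta)).astype(float)

X_toy = csr_matrix(X_dense)
theta0 = np.zeros(X_toy.shape[1])

parameters = fit(X_toy, y_toy, theta0)
print(parameters.shape)
print(cost_function(theta0, X_toy, y_toy), cost_function(parameters, X_toy, y_toy))
```

With the real data, the labels would presumably be passed as `train['Sentiment'].values` so that `y` is a plain 1-D NumPy array matching the 89988 rows of `X`.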