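In a previous question, I asked about optimizing the decision threshold of a prediction model. The solution led me to the z3py library.
I am now trying a setup similar to the earlier one, but this time I want to optimize the decision threshold of a binary prediction model to maximize accuracy.
However, I found that the optimized threshold performs worse than the default threshold (which the optimizer could also have chosen).
My MWE is below (it uses random targets and probabilities with a fixed seed to reproduce my findings):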
import numpy as np
from z3 import z3


def compute_eval_metrics(ground_truth, predictions):
    from sklearn.metrics import accuracy_score, f1_score

    accuracy = accuracy_score(ground_truth, predictions)
    macro_f1 = f1_score(ground_truth, predictions, average="macro")
    return accuracy, macro_f1


def optimization_acc_target(
    predictions: np.array,
    ground_truth: np.array,
    default_threshold=0.5,
):
    tp = np.sum((predictions > default_threshold) & (ground_truth == 1))
    tn = np.sum((predictions <= default_threshold) & (ground_truth == 0))
    initial_accuracy = (tp + tn) / len(ground_truth)
    print(f"Accuracy: {initial_accuracy:.3f}")
    _, initial_macro_f1_score = compute_eval_metrics(
        ground_truth, np.where(predictions > default_threshold, 1, 0)
    )

    n = len(ground_truth)
    iRange = range(n)
    threshold = z3.Real("threshold")
    opt = z3.Optimize()
    predictions = predictions.tolist()
    ground_truth = ground_truth.tolist()

    true_positives = z3.Sum(
        [
            z3.If(predictions[i] > threshold, 1, 0)
            for i in iRange
            if ground_truth[i] == 1
        ]
    )
    true_negatives = z3.Sum(
        [
            z3.If(predictions[i] <= threshold, 1, 0)
            for i in iRange
            if ground_truth[i] == 0
        ]
    )
    acc = z3.Sum(true_positives, true_negatives) / n

    # Add constraints
    opt.add(threshold >= 0.0)
    opt.add(threshold <= 1.0)

    # Maximize accuracy
    opt.maximize(acc)

    if opt.check() == z3.sat:
        m = opt.model()
        t = m[threshold].as_decimal(10)
        if type(t) == str:
            if len(t) > 1:
                t = t[:-1]
        t = float(t)
        print(f"Optimal threshold: {t}")
        optimized_accuracy, optimized_macro_f1_score = compute_eval_metrics(
            ground_truth, np.where(np.array(predictions) > t, 1, 0)
        )
        print(f"Accuracy: {optimized_accuracy:.3f} (was: {initial_accuracy:.3f})")
        print(
            f"Macro F1 Score: {optimized_macro_f1_score:.3f} (was: {initial_macro_f1_score:.3f})"
        )
        print()
    else:
        print("Failed to optimize")


np.random.seed(42)
ground_truth = np.random.randint(0, 2, size=50)
predictions = np.random.rand(50)
optimization_acc_target(
    predictions=predictions,
    ground_truth=ground_truth,
)
In my code, I use the true positives and true negatives to compute the accuracy.
The output is:
Accuracy: 0.600
Optimal threshold: 0.9868869366
Accuracy: 0.480 (was: 0.600)
Macro F1 Score: 0.355 (was: 0.599)
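The optimizer always returns a solution that is worse than the default threshold of 0.5. I am puzzled as to why. Shouldn't it perform at least as well as the default? To troubleshoot, I tried changing the constructs in the z3.If parts of the z3py code, thinking they might be producing wrong results. That made no difference, though (which is good, because it is in line with official example 2). I also found this question, but it seems to concern the case of non-linear constraints (which I don't use).
I am now wondering: what causes the optimized threshold to give worse results than the default threshold? I would appreciate pointers to further resources and background information.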
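I found the solution, and I'm afraid it is quite simple:
In the posted question, I used integer division. The z3.If(..., 1, 0) terms are integers, so the z3.Sum over them is an integer expression, and dividing that by n is an integer division in Z3.
After some more checking, I found another question, which led me to the troublesome line given above.
The line in question is: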
acc = z3.Sum(true_positives, true_negatives) / n
For the record and for future searchers, the following variants work:
# yes
# acc = z3.ToReal(true_positives + true_negatives) / n
# alternatively, only maximize the TP and TN count (gives the same results):
acc = true_positives + true_negatives
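As a sanity check that does not depend on z3, the optimizer's result can be compared against a brute-force search. Accuracy is a step function of the threshold and can only change at the prediction values themselves, so it is enough to try those values plus the bounds and the default. This is a rough sketch using only the standard library. The helper names `accuracy` and `best_threshold` are made up for illustration, and the random data differs from the NumPy data in the question, so the numbers will not match the output above.

```python
import random


def accuracy(scores, labels, threshold):
    """Fraction of samples where (score > threshold) matches the label."""
    hits = sum((s > threshold) == bool(y) for s, y in zip(scores, labels))
    return hits / len(labels)


def best_threshold(scores, labels, default=0.5):
    """Try every threshold at which the accuracy can change."""
    candidates = sorted(set(scores) | {0.0, default, 1.0})
    return max(candidates, key=lambda t: accuracy(scores, labels, t))


random.seed(42)
labels = [random.randint(0, 1) for _ in range(50)]
scores = [random.random() for _ in range(50)]

t_best = best_threshold(scores, labels)
print(f"Brute-force threshold: {t_best:.4f}")
print(f"Accuracy: {accuracy(scores, labels, t_best):.3f} "
      f"(default: {accuracy(scores, labels, 0.5):.3f})")
```

Because the default threshold is among the candidates, the brute-force accuracy can never fall below the default accuracy. A correctly formulated z3 objective should reach the same accuracy on the same data.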