z3py threshold optimization results in worse performance than the unoptimized solution

Question

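In a previous question, I asked about optimizing the decision thresholds of a prediction model. The solution led me to the z3py library. I am now trying a setup similar to the earlier one, but this time I want to optimize the decision threshold of a binary prediction model to maximize accuracy. However, I found that optimizing the threshold leads to worse performance than the default threshold (which the optimizer is also free to choose). My MWE is below (it uses fixed-seed random targets and probabilities to reproduce my findings):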

import numpy as np
from z3 import z3


def compute_eval_metrics(ground_truth, predictions):
    from sklearn.metrics import accuracy_score, f1_score

    accuracy = accuracy_score(ground_truth, predictions)
    macro_f1 = f1_score(ground_truth, predictions, average="macro")
    return accuracy, macro_f1


def optimization_acc_target(
    predictions: np.ndarray,
    ground_truth: np.ndarray,
    default_threshold=0.5,
):
    tp = np.sum((predictions > default_threshold) & (ground_truth == 1))
    tn = np.sum((predictions <= default_threshold) & (ground_truth == 0))

    initial_accuracy = (tp + tn) / len(ground_truth)
    print(f"Accuracy: {initial_accuracy:.3f}")

    _, initial_macro_f1_score = compute_eval_metrics(
        ground_truth, np.where(predictions > default_threshold, 1, 0)
    )

    n = len(ground_truth)
    iRange = range(n)

    threshold = z3.Real("threshold")

    opt = z3.Optimize()
    predictions = predictions.tolist()
    ground_truth = ground_truth.tolist()

    true_positives = z3.Sum(
        [
            z3.If(predictions[i] > threshold, 1, 0)
            for i in iRange
            if ground_truth[i] == 1
        ]
    )
    true_negatives = z3.Sum(
        [
            z3.If(predictions[i] <= threshold, 1, 0)
            for i in iRange
            if ground_truth[i] == 0
        ]
    )
    acc = z3.Sum(true_positives, true_negatives) / n

    # Add constraints
    opt.add(threshold >= 0.0)
    opt.add(threshold <= 1.0)

    # Maximize accuracy
    opt.maximize(acc)

    if opt.check() == z3.sat:
        m = opt.model()

        # as_decimal() appends "?" only when the value was truncated;
        # strip that marker (and nothing else) before parsing.
        t = float(m[threshold].as_decimal(10).rstrip("?"))
        print(f"Optimal threshold: {t}")

        optimized_accuracy, optimized_macro_f1_score = compute_eval_metrics(
            ground_truth, np.where(np.array(predictions) > t, 1, 0)
        )

        print(f"Accuracy: {optimized_accuracy:.3f} (was: {initial_accuracy:.3f})")
        print(
            f"Macro F1 Score: {optimized_macro_f1_score:.3f} (was: {initial_macro_f1_score:.3f})"
        )
        print()

    else:
        print("Failed to optimize")


np.random.seed(42)
ground_truth = np.random.randint(0, 2, size=50)
predictions = np.random.rand(50)

optimization_acc_target(
    predictions=predictions,
    ground_truth=ground_truth,
)

In my code, I use the true positives and true negatives to compute the accuracy.

The output is:

Accuracy: 0.600
Optimal threshold: 0.9868869366
Accuracy: 0.480 (was: 0.600)
Macro F1 Score: 0.355 (was: 0.599)

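This consistently returns a solution that is worse than the default threshold of 0.5. I am confused as to why this happens. Shouldn't the result perform at least as well as the default solution?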
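The expectation is sound: since 0.5 lies inside the feasible range, a correct maximizer can never do worse than the default. A minimal brute-force sketch, independent of z3, can serve as a reference for what the optimizer should return. The toy data and the helper names accuracy_at and best_threshold_brute_force are hypothetical and not part of the code above; the sketch assumes all scores lie in [0, 1].

```python
def accuracy_at(threshold, scores, labels):
    """Fraction of samples classified correctly when predicting 1 for score > threshold."""
    correct = sum((s > threshold) == (y == 1) for s, y in zip(scores, labels))
    return correct / len(labels)


def best_threshold_brute_force(scores, labels, default=0.5):
    """Return the candidate threshold in [0, 1] with the highest accuracy."""
    # Accuracy only changes when the threshold crosses a score, so testing
    # 0.0, the default, and every score covers all distinct outcomes.
    candidates = sorted({0.0, default, *scores})
    return max(candidates, key=lambda t: accuracy_at(t, scores, labels))


# Hypothetical toy data (not the random data from the question).
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.7]
labels = [0, 0, 1, 1, 1, 0]

best = best_threshold_brute_force(scores, labels)
print("default accuracy:", accuracy_at(0.5, scores, labels))
print("best threshold:", best, "accuracy:", accuracy_at(best, scores, labels))
```

Because the default threshold is always among the candidates, the brute-force result is at least as accurate as the default by construction. If a solver-based result falls below it, the objective handed to the solver is probably not the accuracy one intended.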

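To debug this, I experimented with z3py's constructs (for example, in the z3.If part), suspecting that they might be causing the wrong results. It turned out that this made no difference (which is reassuring, since it is consistent with official example 2). I also found this question, but it seems to concern the case of non-linear constraints (which I do not use). I am now wondering: what causes the optimized threshold to give worse results than the default threshold? I would appreciate pointers to further resources and background information.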

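Answer

I found the solution, and it is quite simple, I'm afraid:

In the posted question, I was using integer division. Both arguments of z3.Sum are integer-sorted, so dividing the sum by the Python int n is integer division in Z3. The objective is therefore 0 for every threshold that does not classify all samples correctly, and the optimizer has no reason to prefer one threshold over another. After some more checking I found another question, which brought me to the troublesome line given above.

What finally worked is:

# convert the integer count to a real before dividing
# acc = z3.ToReal(true_positives + true_negatives) / n
# alternatively, only maximize the TP and TN count (gives the same results):
acc = true_positives + true_negatives

For the record and for future searchers, the following does not work:

acc = z3.Sum(true_positives, true_negatives) / n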
