在 matplotlib 中对齐不同曲线进行比较

Question

我正在开展一个项目，以将不同算法的执行时间与其在 Python 中的理论时间复杂度进行比较。具体来说，我正在测试素数检查函数的三个版本，每个版本都有不同的理论复杂性：

v1：O(n)
v2：O(n/2)
v3：O(sqrt(n))

对于每个算法，我在同一张图上绘制了实际执行时间和理论时间复杂度。我的目标是在视觉上将理论曲线的形状和比例与经验执行时间相匹配。然而，我面临着正确对齐这些曲线的问题，特别是单独缩放每条理论复杂度曲线。

# Sample data
test_list = np.array([...])  # Input sizes
times_v1 = np.array([...])  # Execution times for v1
times_v2 = np.array([...])  # Execution times for v2
times_v3 = np.array([...])  # Execution times for v3


# For v1: O(n)
C_v1 = times_v1[0] / test_list[0]
theoretical_v1 = [(x)/C_v1 for x in test_list]

# For v2: O(n/2)
C_v2 = times_v2[0] / (test_list[0] / 2)
theoretical_v2 = [(x/2)/C_v1 for x in test_list]

# For v3: O(√n)
C_v3 = times_v3[0] / test_list[0]
# This should correctly represent O(√n)
theoretical_v3 = [np.sqrt(x)/C_v1 for x in test_list]
# Plotting
plt.figure(figsize=(14, 8))

# Actual execution times
plt.plot(test_list, times_v1, 'o-', label="v1: O(n)", color="blue")

# Theoretical complexities
plt.plot(test_list, theoretical_v1, '--',
         label="Theoretical O(n)", color="blue")


plt.xlabel("Input Size (n)")
plt.ylabel("Time (ms)")
plt.title("Prime Number Check: Execution Time vs Theoretical Complexity")
plt.legend()
plt.show()

plt.plot(test_list, times_v2, 'o-', label="v2: O(n/2)", color="green")
plt.plot(test_list, theoretical_v2, '--',
         label="Theoretical O(n/2)", color="green")
plt.xlabel("Input Size (n)")
plt.ylabel("Time (ms)")
plt.title("Prime Number Check: Execution Time vs Theoretical Complexity")
plt.legend()
plt.show()

plt.plot(test_list, times_v3, 'o-', label="v3: O(√n)", color="red")
plt.plot(test_list, theoretical_v3, '--',
         label="Theoretical O(√n)", color="red")
plt.xlabel("Input Size (n)")
plt.ylabel("Time (ms)")
plt.title("Prime Number Check: Execution Time vs Theoretical Complexity")
plt.legend()
plt.show()

当我运行上面的代码时，理论曲线和经验曲线之间的对齐并不像我使用固定常数（如 9999、7532 和 1000）来缩放理论值时那样一致。从初始数据点计算出的动态缩放因子似乎并不能完全捕捉总体趋势，尤其是随着输入大小的增长。

有没有一种方法可以在整个输入范围内动态地一致地缩放理论曲线，而无需求助于硬编码常数？或者是否有处理此类扩展问题的标准做法？

Answer 1

没有“标准”方法，但您可以对曲线执行最小绝对误差拟合以获得，

theoretical_scaled = a * theoretical

，使其最接近

actual

...

import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt


def objective_function(params, actual, theoretical):
    a = params
    return np.sum(np.abs(actual - (a * theoretical)))


x = np.array([1, 2, 3, 4, 5])
actual = np.array([1, 4, 7, 8, 10])
theoretical = np.array([1, 2, 3, 4, 5])

initial_guess = np.array([1])
result = minimize(objective_function, initial_guess, args=(actual, theoretical))
a_opt = result.x

theoretical_scaled = a_opt * theoretical

plt.plot(x, actual, label="actual")
plt.plot(x, theoretical_scaled, label="theoretical")
plt.legend()
plt.show()

有一些关于不同标准化方法和规范的书籍可以用来使其更具“吸引力”，但这看起来不错，我更喜欢这里的 L1 规范，因为它允许某些值不正确，而 L2 规范很容易受到异常值的影响。

您还可以使用不同的拟合函数，例如您可以将其设为

theoretical_scaled = a * theoretical + b

，其中

是偏移量，等等..

在 matplotlib 中对齐不同曲线进行比较

问题描述投票：0回答：1

1个回答

最新问题

在 matplotlib 中对齐不同曲线进行比较

问题描述 投票：0回答：1

1个回答

最新问题

问题描述投票：0回答：1