如何使用 Pyomo 对函数的输出添加约束

问题描述 投票:0回答:1

我目前正在开发一个专门的求解器,旨在调整 SARIMAX(具有 eXogenous 回归量的季节性自回归综合移动平均线)模型的超参数。该项目的主要目标是确保模型的准确性保持在预定阈值以上。为了实现这一目标,我利用 Pyomo(一种强大的优化建模工具)作为我的求解器来帮助确定满足此精度标准所需的超参数的最佳值。

我的工作重点是生成两个关键性能指标的输出:train_r2 和 test_r2。通过确保这两个指标都超过特定范围,我可以自信地确定所选超参数是否足以有效地构建稳健且可靠的预测模型。 R² 分数表示可从自变量预测的因变量方差的比例,对于评估模型的性能至关重要。

虽然 SARIMAX 模型已成功创建,但我仍面临着满足所设置的约束的问题。我尝试了各种求解器,包括 glpk、ipopt 和 cbc,以找到合适的解决方案。尽管输出表明已找到成功的解决方案,但似乎仍然违反了约束

我目前正在调查此问题的潜在原因,其中可能包括重新评估超参数的界限、检查目标函数的制定或调整约束以确保它们正确实施。最终目标是完善超参数调整过程,以便我能够实现 SARIMAX 模型,该模型不仅在准确性方面表现良好,而且满足训练和测试数据集的预定义阈值。

import numpy as np
import pandas as pd
from pyomo.environ import *
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.metrics import r2_score

# Step 1: Create a synthetic time series dataset
np.random.seed(42)
n = 120  # Length of time series data

# Generate a seasonal pattern with trend and noise
time = np.arange(n)
seasonal_pattern = 10 * np.sin(2 * np.pi * time / 12)
trend = 0.1 * time
noise = np.random.normal(0, 1, n)
data = 20 + trend + seasonal_pattern + noise

# Create a DataFrame
df = pd.DataFrame(data, columns=['value'])
df['date'] = pd.date_range(start='2010-01-01', periods=n, freq='D')
df.set_index('date', inplace=True)

# Step 2: Split into train and test sets
train_size = int(len(df) * 0.8)
train, test = df.iloc[:train_size], df.iloc[train_size:]

# Step 3: Define the Pyomo model
model = ConcreteModel()

# Define integer variables for SARIMA parameters
model.p = Var(within=NonNegativeIntegers, bounds=(0, 10), initialize=1)
model.d = Var(within=NonNegativeIntegers, bounds=(0, 10), initialize=1)
model.q = Var(within=NonNegativeIntegers, bounds=(0, 10), initialize=1)
model.P = Var(within=NonNegativeIntegers, bounds=(0, 10), initialize=1)
model.D = Var(within=NonNegativeIntegers, bounds=(0, 10), initialize=1)
model.Q = Var(within=NonNegativeIntegers, bounds=(0, 10), initialize=1)
model.seasonality = Var(within=NonNegativeIntegers, bounds=(2, 12), initialize=2)
print("Displaying P value ",model.p.display())

# Define parameters to store R² scores
model.train_r2 = Var(within=NonNegativeReals)
model.test_r2 = Var(within=NonNegativeReals)
model.test_train_sum = Var(within=NonNegativeReals)


# Step 4: Define a function to fit SARIMAX and calculate R² values
def sarima_r2_model(p, d, q, P, D, Q, seasonality):
    try:
        # Fit SARIMAX model with the given parameters
        model = SARIMAX(
            train['value'],
            order=(p, d, q),
            seasonal_order=(P, D, Q, seasonality),
            enforce_stationarity=False,
            enforce_invertibility=False
        )
        results = model.fit(disp=False)

        # Calculate R² on both train and test sets
        train_pred = results.predict(start=train.index[0], end=train.index[-1])
        test_pred = results.predict(start=test.index[0], end=test.index[-1])
        train_r2_value = r2_score(train['value'], train_pred)
        test_r2_value = r2_score(test['value'], test_pred)
        return train_r2_value, test_r2_value
    except:
        print("========== If model fitting fails, return low R² values")
        return -10, -10

# Define objective to maximize sum of R² scores
def objective_rule(m):
    p = int(m.p.value)
    d = int(m.d.value)
    q = int(m.q.value)
    P = int(m.P.value)
    D = int(m.D.value)
    Q = int(m.Q.value)
    seasonality = int(m.seasonality.value)
    m.train_r2.value, m.test_r2.value = sarima_r2_model(p, d, q, P, D, Q, seasonality)
    
    # Store R² values in model variables for constraints

    print("train_r2_value ==========",m.train_r2.value)
    print("test_r2_value ==========",m.test_r2.value)
    model.test_train_sum = m.train_r2.value + m.test_r2.value 
    return model.test_train_sum


# Set objective function
model.objective = Objective(rule=objective_rule, sense=maximize)



# Step 5: Add constraints for minimum R² threshold
r2_threshold = 0.8
model.train_r2_constraint = Constraint(expr=model.train_r2 >= r2_threshold)
model.test_r2_constraint = Constraint(expr=model.test_r2>=r2_threshold)


# Step 6: Solve the optimization problem with a suitable solver
solver = SolverFactory('cbc',executable="/usr/bin/cbc")  # Choose solver like 'cbc' or 'glpk' if available


results = solver.solve(model, tee=True)
print(results.write())
print(results.solver.status)
print("optimal",results.solver.termination_condition)

print("# will show you the evaluation of the OBJ function")
model.objective.display()  
print(model.objective.expr()) 

# Step 7: Retrieve the best parameters
best_params = {
    'p': int(model.p.value),
    'd': int(model.d.value),
    'q': int(model.q.value),
    'P': int(model.P.value),
    'D': int(model.D.value),
    'Q': int(model.Q.value),
    'seasonality': int(model.seasonality.value)
}
print("Optimized parameters:", best_params)

# Step 8: Train the final model with optimized parameters
best_model = SARIMAX(
    train['value'],
    order=(best_params['p'], best_params['d'], best_params['q']),
    seasonal_order=(best_params['P'], best_params['D'], best_params['Q'], best_params['seasonality']),
    enforce_stationarity=False,
    enforce_invertibility=False
)
best_results = best_model.fit(disp=False)

# Step 9: Final evaluation on train and test sets
train_pred = best_results.predict(start=train.index[0], end=train.index[-1])
test_pred = best_results.predict(start=test.index[0], end=test.index[-1])

print("Final Train R²:", r2_score(train['value'], train_pred))
print("Final Test R²:", r2_score(test['value'], test_pred))
optimization time-series solver pyomo sarimax
1个回答
0
投票

这是第一篇文章,有几件事......

这个问题(以及未来的问题)需要更多的关注才能在这个网站上获得好的答案。 “我构建了这个大东西,但有些东西不起作用”这样的大问题(比如这个)往往不会像关注更具体问题(使用可重现的代码)的问题那样受到关注。 你有代码(很好),但没有太多关注。

这里的问题是你没有正确使用 pyomo。 Pyomo 是一个构建数学模型的框架,然后将其单独移交给求解器(您在问题中声明 pyomo 是一个求解器 - 这是不正确的,可能是滥用的一部分。)

因此,pyomo 需要根据未知的变量形成一组方程并将其发送给求解器进行分析。 您错误地传递了 SARIMAX 模型(问题 #1),并且还尝试确定目标中的变量值(问题 #2)。

也许一个可以帮助指导您的概念是寻找“黑盒求解器”框架,您可以在其中传递任何任意函数并让它评估并尝试更好的结果 - 这是与 pyomo 不同的框架。 我相信 scikit 中有一个。 YMMV.

© www.soinside.com 2019 - 2024. All rights reserved.