I trained a GLM as follows:
fitGlm = smf.glm(listOfInModelFeatures, family=sm.families.Binomial(),
                 data=train, freq_weights=train['sampleWeight']).fit()
The results look fine:
print(fitGlm.summary())
                 Generalized Linear Model Regression Results
==============================================================================
Dep. Variable:                  Target   No. Observations:             1065046
Model:                             GLM   Df Residuals:              4361437.81
Model Family:                 Binomial   Df Model:                           7
Link Function:                   Logit   Scale:                         1.0000
Method:                           IRLS   Log-Likelihood:           -6.0368e+05
Date:                 Sun, 25 Aug 2024   Deviance:                  1.2074e+06
Time:                         09:03:54   Pearson chi2:                4.12e+06
No. Iterations:                      8   Pseudo R-squ. (CS):            0.1716
Covariance Type:             nonrobust
===========================================================================================
                      coef      std err            z        P>|z|       [0.025       0.975]
-------------------------------------------------------------------------------------------
Intercept           3.2530        0.003     1074.036        0.000        3.247        3.259
feat1               0.6477        0.004      176.500        0.000        0.641        0.655
feat2               0.3939        0.006       71.224        0.000        0.383        0.405
feat3               0.1990        0.007       28.294        0.000        0.185        0.213
feat4               0.4932        0.009       54.614        0.000        0.476        0.511
feat5               0.4477        0.005       90.323        0.000        0.438        0.457
feat6               0.3031        0.005       57.572        0.000        0.293        0.313
feat7               0.3711        0.004       87.419        0.000        0.363        0.379
===========================================================================================
Then I tried to export the summary() to a .png, as shown in:
Python: How to save statsmodels results as image file?
So I wrote this code:
fig, ax = plt.subplots(figsize=(16, 8))
summary = []
fitGlm.summary(print_fn=lambda x: summary.append(x))
summary = '\n'.join(summary)
ax.text(0.01, 0.05, summary, fontfamily='monospace', fontsize=12)
ax.axis('off')
plt.tight_layout()
plt.savefig('output.png', dpi=300, bbox_inches='tight')
But I get this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[57], line 57
55 fig, ax = plt.subplots(figsize=(16, 8))
56 summary = []
---> 57 fitGlm.summary(print_fn=lambda x: summary.append(x))
58 summary = '\n'.join(summary)
59 ax.text(0.01, 0.05, summary, fontfamily='monospace', fontsize=12)
TypeError: GLMResults.summary() got an unexpected keyword argument 'print_fn'
It looks like print_fn is not recognized by statsmodels?
Can anyone help me?
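I set up a test to see where print_fn could be used. I also checked the solution posted in the linked question, but I could not find print_fn anywhere in the statsmodels documentation.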
I tried converting the summary to a table so that it can be saved as a png:
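from io import StringIO

import matplotlib.pyplot as plt
import pandas as pd

# `model` is the fitted results object (fitGlm in the question).
# Convert one summary table to a pandas DataFrame;
# change tables[0] to tables[1] to get the coefficient table.
# StringIO avoids the deprecation warning newer pandas versions raise for literal HTML strings.
# header=None and no index_col keep the first row and the label column in the output.
html = model.summary().tables[0].as_html()
summary_df = pd.read_html(StringIO(html), header=None)[0]
# Convert the DataFrame to a list of lists, blanking out empty cells
summary_list = summary_df.fillna('').values.tolist()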
# Create a new figure
fig, ax = plt.subplots()
# Remove the axes
ax.axis('off')
# Add a table to the figure
table = plt.table(cellText=summary_list, loc='center')
# Auto scale the table
table.auto_set_font_size(False)
table.set_fontsize(10)
table.scale(1, 1.5)
# Save the figure as a PNG file
plt.savefig('summary2.png', dpi=300, bbox_inches='tight')
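In my opinion, saving data as a png is a very unusual use case, because it makes the information hard to share and reuse. There are options to export the summary to csv and LaTeX. If you are doing this by hand, I would suggest exporting to csv and copy-pasting it as an image, or saving it as txt, or even just taking a screenshot.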
For reference:
model.summary().as_csv()
# save as csv
with open('summary.csv', 'w') as file:
    file.write(model.summary().as_csv())
or
text = model.summary().as_text()
# save to txt
with open('summary.txt', 'w') as file:
    file.write(text)