使用数据库中的图表创建 pdf 报告的过程是什么？

Question

我有一个评估大学教授的调查生成的数据库。我想要的是一个 python 脚本，它从该数据库中获取信息，为每个用户生成一个图表，为每个用户创建图表，然后将其呈现在模板中以将其导出为 pdf。

数据库是什么样子的？

User    Professor_evaluated  Category       Question    Answer
_________________________________________________________________
Mike    Professor Criss       respect           1         3
Mike    Professor Criss       respect           2         4
Mike    Professor Criss       wisdom            3         5
Mike    Professor Criss       wisdom            4         3
Charles Professor Criss       respect           1         3
Charles Professor Criss       respect           2         4
Charles Professor Criss       wisdom            3         5
Charles Professor Criss       wisdom            4         3

每位老师都有几个要评估的类别（尊重、智慧等），而每个类别都有相关的问题。换句话说，一个类别有几个问题。数据库的每一行都是学生评价老师的问题的答案

我需要什么？

我需要创建一个自动生成 pdf 报告的脚本，通过图表总结这些信息，例如一张图表是每个老师的总分，另一个图表是每个老师按类别的分数，另一个图表是每个学生的平均分等等。。最后，每个老师都会有一份报告。我想要一份这样的报告

我的问题是什么？

我的问题是我需要哪些 python 包和模块来完成这项任务。这样做的一般过程是什么。我不需要代码，因为我知道答案很笼统，但我知道怎么做。

例如：您首先需要使用 pandas 处理信息，创建一个表格来汇总您想要绘制的信息，然后绘制它，然后使用 XYZ 模块创建报告模板，然后将其导出为 pdf XYZ 模块。

Answer 1

在 python 中创建 pdf 有很多选项。其中一些选项是 ReportLab、pydf2、pdfdocument 和 FPDF。

FPDF 库使用起来相当简单，这就是我在本例中使用的库。 FPDF 文档可以在 here 找到。

考虑一下您可能想要使用哪些 python 模块来创建图形和表格可能也是一件好事。在我的示例中，我使用 matplotlib（文档链接），我还使用 Pandas 使用

pandas.dataframe()

创建数据框。

我在下面发布了一个相当冗长但完全可重现的示例，使用了 pandas、matplotlib 和 fpdf。数据是 OP 在问题中提供的数据的子集。我在我的示例中循环遍历数据框来创建表，但还有其他可能更有效的方法来执行此操作。

import pandas as pd
import matplotlib
from pylab import title, figure, xlabel, ylabel, xticks, bar, legend, axis, savefig
from fpdf import FPDF


df = pd.DataFrame()
df['Question'] = ["Q1", "Q2", "Q3", "Q4"]
df['Charles'] = [3, 4, 5, 3]
df['Mike'] = [3, 3, 4, 4]

title("Professor Criss's Ratings by Users")
xlabel('Question Number')
ylabel('Score')

c = [2.0, 4.0, 6.0, 8.0]
m = [x - 0.5 for x in c]

xticks(c, df['Question'])

bar(m, df['Mike'], width=0.5, color="#91eb87", label="Mike")
bar(c, df['Charles'], width=0.5, color="#eb879c", label="Charles")

legend()
axis([0, 10, 0, 8])
savefig('barchart.png')

pdf = FPDF()
pdf.add_page()
pdf.set_xy(0, 0)
pdf.set_font('arial', 'B', 12)
pdf.cell(60)
pdf.cell(75, 10, "A Tabular and Graphical Report of Professor Criss's Ratings by Users Charles and Mike", 0, 2, 'C')
pdf.cell(90, 10, " ", 0, 2, 'C')
pdf.cell(-40)
pdf.cell(50, 10, 'Question', 1, 0, 'C')
pdf.cell(40, 10, 'Charles', 1, 0, 'C')
pdf.cell(40, 10, 'Mike', 1, 2, 'C')
pdf.cell(-90)
pdf.set_font('arial', '', 12)
for i in range(0, len(df)):
    pdf.cell(50, 10, '%s' % (df['Question'].iloc[i]), 1, 0, 'C')
    pdf.cell(40, 10, '%s' % (str(df.Mike.iloc[i])), 1, 0, 'C')
    pdf.cell(40, 10, '%s' % (str(df.Charles.iloc[i])), 1, 2, 'C')
    pdf.cell(-90)
pdf.cell(90, 10, " ", 0, 2, 'C')
pdf.cell(-30)
pdf.image('barchart.png', x = None, y = None, w = 0, h = 0, type = '', link = '')
pdf.output('test.pdf', 'F')

预期测试.pdf:

更新（2020 年 4 月）： 我在 2020 年 4 月对原始答案进行了编辑以替换使用

pandas.DataFrame.ix()

因为这是 deprecated。在我的示例中，我能够将它的使用替换为

pandas.DataFrame.iloc

，并且输出与以前相同。

Answer 2

有点异端的答案：RMarkdown（在 RStudio 中），带有 Python 代码块，通过

reticulate

（现在的默认方式）为您提供一个长期存在的 Python“会话”，就像在 Jypiter 笔记本中一样。然后可以将 RMarkdown 文档“编织”成 PDF、html、Word、html 幻灯片，甚至 PowerPoint。

说真的，R 世界在这方面遥遥领先。

Answer 3

我同意@drz 关于 RMarkdown 创建此类报告的看法。学术著作应该清楚地使用它。反正还有stitch，用起来真的很简单，在很多情况下可能就够用了。 fpf 的许多优点：

分页管理
标记语法可用
matplotlib和pandas图直接输出
可以生成html或pdf

这是针脚中的@patrickjlong1 示例：

# Stich is simple and great

## Usefull markup language

You can use markdown syntax, such as **bold**, _italic_, ~~Strikethrough~~

## display dataframes

Direct output from python will be nicelly output.

```{python, echo=False}

import pandas as pd

df = pd.DataFrame()
df['Question'] = ["Q1", "Q2", "Q3", "Q4"]
df['Charles'] = [3, 4, 5, 3]
df['Mike'] = [3, 3, 4, 4]
df = df.set_index('Question')
df.style
df

```

## display graphics

Direct matplotlib output, without rendering to file.

```{python, echo=False}
#%matplotlib inline
df.plot.bar(title="Professor Criss's Ratings by Users")
None
```

## Symbolic expressions

You may also want to work with sympy :

```{python, echo=False}

import sympy
sympy.init_printing()
x=sympy.symbol.Symbol('x')
sympy.integrate(sympy.sqrt(1/sympy.sin(x**2)))

```

安装后，PDF 将创建为：

stitch test2.stich -o output.pdf

输出将如下所示：

Answer 4

就我而言：

连接到 Oracle 数据库并使用 cx_Oracle 库提取数据
使用 Pandas Dataframes 进行数据操作
使用 Matplotlib 生成图形
使用 ExcelWriter 和 ReportLab 输出 Excel 或 PDF 格式

希望这有帮助。

使用数据库中的图表创建 pdf 报告的过程是什么？

问题描述投票：0回答：4

4个回答

最新问题

使用数据库中的图表创建 pdf 报告的过程是什么？

问题描述 投票：0回答：4

4个回答

最新问题

问题描述投票：0回答：4