Relative data visualization in pandas

Question (votes: 0, answers: 1)

I have some data like the following:

+---------+-------+---------+----------------+
| Machine | Event | Outcome | Duration Total |
+---------+-------+---------+----------------+
| a       |     1 | FAIL    |           1127 |
| a       |     2 | FAIL    |          56099 |
| a       |     2 | PASS    |          15213 |
| a       |     3 | FAIL    |          13891 |
| a       |     3 | PASS    |          13934 |
| a       |     4 | FAIL    |           6844 |
| a       |     5 | FAIL    |           6449 |
| b       |     1 | FAIL    |          21331 |
| b       |     2 | FAIL    |          30362 |
| b       |     3 | FAIL    |          12194 |
| b       |     3 | PASS    |           7390 |
| b       |     4 | FAIL    |          35472 |
| b       |     4 | PASS    |           7731 |
| b       |     5 | FAIL    |           7654 |
| c       |     1 | FAIL    |          16833 |
| c       |     1 | PASS    |          21337 |
| c       |     2 | FAIL    |            440 |
| c       |     2 | PASS    |          14320 |
| c       |     3 | FAIL    |           5281 |
+---------+-------+---------+----------------+

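I am trying to make a categorical scatter plot of the total duration for each event and each machine, or any other visualization that lets me compare them relative to each other.

What would be a good choice, and how do I go about it?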

python pandas data-visualization categorical-data
1 Answer
1 vote
import matplotlib.pyplot as plt
import seaborn as sns

# One panel per Outcome; the duration column is named 'Duration Total' in the question's data
sns.catplot(x='Event', y='Duration Total', hue='Machine', col='Outcome', data=df)

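Try this to get two scatter plots. The x-axis is the event, the y-axis is the total duration, and the point color is based on the machine. There are two panels side by side, one for FAIL and one for PASS. "df" is your DataFrame. You can remove col='Outcome' to show both FAIL and PASS on the same chart.

Edit: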
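The snippets in this answer assume a DataFrame named df already exists. As a minimal sketch, here is one way it could be built from the table in the question; the RAW string is just the question's rows retyped as CSV, and reading it through io.StringIO is my assumption about how to load it, not something the question specifies.

```python
import io

import pandas as pd

# Rows retyped from the table in the question (header names kept as-is).
RAW = """Machine,Event,Outcome,Duration Total
a,1,FAIL,1127
a,2,FAIL,56099
a,2,PASS,15213
a,3,FAIL,13891
a,3,PASS,13934
a,4,FAIL,6844
a,5,FAIL,6449
b,1,FAIL,21331
b,2,FAIL,30362
b,3,FAIL,12194
b,3,PASS,7390
b,4,FAIL,35472
b,4,PASS,7731
b,5,FAIL,7654
c,1,FAIL,16833
c,1,PASS,21337
c,2,FAIL,440
c,2,PASS,14320
c,3,FAIL,5281
"""

# Parse the CSV text; Event and Duration Total are read as integers.
df = pd.read_csv(io.StringIO(RAW))

# Quick sanity check of column names and types before plotting.
print(df.dtypes)
```

Note that the duration column is called 'Duration Total' (with a space), so that exact string is what has to be passed as y when plotting.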

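fig, ax = plt.subplots(figsize=(10, 10))
# PASS rows: default markers, colored by machine
sns.scatterplot(x='Event', y='Duration Total', hue='Machine',
                data=df[df['Outcome'] == 'PASS'], ax=ax)
# FAIL rows: same axes, drawn with 'x' markers so they stand apart from PASS
sns.scatterplot(x='Event', y='Duration Total', hue='Machine',
                data=df[df['Outcome'] == 'FAIL'], ax=ax,
                style='Machine', markers=['x', 'x', 'x'])

# Relabel the legend so the PASS and FAIL groups can be told apart.
# The 8 labels assume the legend holds a title entry plus three machines per group;
# this may differ between seaborn versions, so check len(handles) first.
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, ['Machine - Pass', 'a', 'b', 'c', 'Machine - Fail', 'a', 'b', 'c'])

plt.show()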
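Since the question also asks for any other way to look at the data relatively, one option is to put FAIL and PASS side by side per machine and event and compute the FAIL share of the total duration. This is only a sketch: the names rows, wide and fail_share are mine, and treating a missing PASS or FAIL row as a duration of 0 is an assumption about what a missing row means.

```python
import pandas as pd

# Rows retyped from the table in the question.
rows = [
    ("a", 1, "FAIL", 1127), ("a", 2, "FAIL", 56099), ("a", 2, "PASS", 15213),
    ("a", 3, "FAIL", 13891), ("a", 3, "PASS", 13934), ("a", 4, "FAIL", 6844),
    ("a", 5, "FAIL", 6449), ("b", 1, "FAIL", 21331), ("b", 2, "FAIL", 30362),
    ("b", 3, "FAIL", 12194), ("b", 3, "PASS", 7390), ("b", 4, "FAIL", 35472),
    ("b", 4, "PASS", 7731), ("b", 5, "FAIL", 7654), ("c", 1, "FAIL", 16833),
    ("c", 1, "PASS", 21337), ("c", 2, "FAIL", 440), ("c", 2, "PASS", 14320),
    ("c", 3, "FAIL", 5281),
]
df = pd.DataFrame(rows, columns=["Machine", "Event", "Outcome", "Duration Total"])

# One row per (Machine, Event), one column per Outcome.
# fill_value=0 treats a missing outcome as zero duration (an assumption).
wide = df.pivot_table(
    index=["Machine", "Event"],
    columns="Outcome",
    values="Duration Total",
    aggfunc="sum",
    fill_value=0,
)

# Fraction of each machine/event's total duration that was spent in FAIL.
wide["fail_share"] = wide["FAIL"] / (wide["FAIL"] + wide["PASS"])

print(wide)
```

If you want a chart from this, wide["fail_share"].unstack("Machine").plot.bar() should give grouped bars per event, with gaps where a machine has no data for an event.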