I have some data that looks like this:
+---------+-------+---------+----------------+
| Machine | Event | Outcome | Duration Total |
+---------+-------+---------+----------------+
| a | 1 | FAIL | 1127 |
| a | 2 | FAIL | 56099 |
| a | 2 | PASS | 15213 |
| a | 3 | FAIL | 13891 |
| a | 3 | PASS | 13934 |
| a | 4 | FAIL | 6844 |
| a | 5 | FAIL | 6449 |
| b | 1 | FAIL | 21331 |
| b | 2 | FAIL | 30362 |
| b | 3 | FAIL | 12194 |
| b | 3 | PASS | 7390 |
| b | 4 | FAIL | 35472 |
| b | 4 | PASS | 7731 |
| b | 5 | FAIL | 7654 |
| c | 1 | FAIL | 16833 |
| c | 1 | PASS | 21337 |
| c | 2 | FAIL | 440 |
| c | 2 | PASS | 14320 |
| c | 3 | FAIL | 5281 |
+---------+-------+---------+----------------+
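I am trying to make a categorical scatter plot of the total duration for each event and each machine, or any other visualization that lets me compare them against one another.
What would be a good choice, and how do I go about it?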
import matplotlib.pyplot as plt
import seaborn as sns
# The column in the data is named 'Duration Total', not 'Duration'
sns.catplot(x = 'Event', y = 'Duration Total', hue = 'Machine', col = 'Outcome', data = df)
plt.show()
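Try this; it gives you two categorical scatter plots. The x-axis is Event, the y-axis is Duration Total, the point colour is based on Machine, and there are two panels side by side, one for FAIL and one for PASS. "df" is your DataFrame. You can remove col = 'Outcome' to put both FAIL and PASS on the same chart.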
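The snippet above assumes "df" already exists. A minimal sketch of building it from the table in the question, assuming the column names are exactly as shown in the table header, might look like this. The pivot at the end is an optional extra (not part of the original answer) that puts FAIL and PASS durations side by side for each machine and event; the name "wide" is just illustrative.

```python
import pandas as pd

# Rows copied from the table in the question
rows = [
    ("a", 1, "FAIL", 1127), ("a", 2, "FAIL", 56099), ("a", 2, "PASS", 15213),
    ("a", 3, "FAIL", 13891), ("a", 3, "PASS", 13934), ("a", 4, "FAIL", 6844),
    ("a", 5, "FAIL", 6449),
    ("b", 1, "FAIL", 21331), ("b", 2, "FAIL", 30362), ("b", 3, "FAIL", 12194),
    ("b", 3, "PASS", 7390), ("b", 4, "FAIL", 35472), ("b", 4, "PASS", 7731),
    ("b", 5, "FAIL", 7654),
    ("c", 1, "FAIL", 16833), ("c", 1, "PASS", 21337), ("c", 2, "FAIL", 440),
    ("c", 2, "PASS", 14320), ("c", 3, "FAIL", 5281),
]
df = pd.DataFrame(rows, columns=["Machine", "Event", "Outcome", "Duration Total"])

# One row per (Machine, Event), one column per Outcome.
# Combinations that do not occur in the data show up as NaN.
wide = df.pivot_table(
    index=["Machine", "Event"],
    columns="Outcome",
    values="Duration Total",
    aggfunc="sum",
)
print(wide)
```

The wide table is also a convenient starting point if you would rather draw a grouped bar chart than a scatter plot.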
Edit:
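fig, ax = plt.subplots(figsize = (10,10))
# PASS rows use the default round markers
g = sns.scatterplot(x = 'Event', y = 'Duration Total', hue = 'Machine', data = df[df['Outcome'] == 'PASS'], ax = ax)
# FAIL rows are drawn on the same axes with 'x' markers
g = sns.scatterplot(x = 'Event', y = 'Duration Total', hue = 'Machine', data = df[df['Outcome'] == 'FAIL'], ax = ax,
                    style = 'Machine', markers = ['x', 'x', 'x'])
# The relabelling below assumes 8 legend entries (a 'Machine' title entry plus a, b, c for each call);
# the count can differ between seaborn versions, so check len(handles) first
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, ['Machine - Pass', 'a' ,'b', 'c', 'Machine - Fail', 'a','b','c'])
plt.show()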
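As a possible alternative to relabelling the legend by hand, here is a sketch that draws everything in one call and lets seaborn map Outcome to the marker shape through the "style" parameter. The function name plot_duration_by_event is hypothetical, and it assumes "df" has the four columns from the question.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def plot_duration_by_event(df):
    """Scatter 'Duration Total' against 'Event': colour = Machine, marker = Outcome."""
    fig, ax = plt.subplots(figsize=(10, 6))
    sns.scatterplot(
        x="Event",
        y="Duration Total",
        hue="Machine",      # one colour per machine
        style="Outcome",    # one marker shape per outcome
        markers={"PASS": "o", "FAIL": "X"},
        s=80,               # marker size, passed through to matplotlib
        data=df,
        ax=ax,
    )
    # Events are integer codes, so show one tick per event
    ax.set_xticks(sorted(df["Event"].unique()))
    return ax
```

Calling plot_duration_by_event(df) followed by plt.show() should give a single chart whose legend lists machines and outcomes separately, so no manual label list is needed.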