Creating a bar plot using a pandas crosstab

Question

I am trying to create a stacked bar plot in seaborn from my dataframe.

I first generated a crosstab in pandas, like so:

pd.crosstab(df['Period'], df['Mark'])

which returns:

Mark      False  True
Period
BASELINE    583   132
WEEK 12     721     0
WEEK 24     589   132
WEEK 4      721     0

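I would like to use seaborn to create the stacked bar plot for consistency, since that is what I have used for the rest of my charts. However, I have struggled to do this because I cannot index the crosstab.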

I have been able to make the plot I want in pandas using

.plot.barh(stacked=True)

but have had no luck with seaborn. Any ideas how I can do this?

Answer 1

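As you said, you can use pandas to create the stacked bar plot. The argument that you want a "seaborn plot" is irrelevant, because every seaborn plot and every pandas plot is ultimately just a matplotlib object: the plotting tools of both libraries are matplotlib wrappers.
Here is a complete solution (created using the data from @andrew_reece's answer).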

Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2

import numpy as np 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

n = 500
np.random.seed(365)
mark = np.random.choice([True, False], n)
periods = np.random.choice(['BASELINE', 'WEEK 12', 'WEEK 24', 'WEEK 4'], n)

df = pd.DataFrame({'mark': mark, 'period': periods})
ct = pd.crosstab(df.period, df.mark)
    
ax = ct.plot(kind='bar', stacked=True, rot=0)
ax.legend(title='mark', bbox_to_anchor=(1, 1.02), loc='upper left')

# add annotations if desired
for c in ax.containers:
    
    # set the bar label
    ax.bar_label(c, label_type='center')
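
To illustrate the point that both libraries hand back matplotlib objects, here is a minimal sketch that applies a seaborn theme and then draws the stacked chart with pandas. The `whitegrid` style, the `Agg` backend, and the side-by-side `countplot` are assumptions chosen for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.axes import Axes

# seaborn styling is applied through matplotlib, so it also affects pandas plots
sns.set_theme(style="whitegrid")

np.random.seed(365)
n = 500
df = pd.DataFrame({
    "mark": np.random.choice([True, False], n),
    "period": np.random.choice(["BASELINE", "WEEK 12", "WEEK 24", "WEEK 4"], n),
})
ct = pd.crosstab(df["period"], df["mark"])

# pandas draws the horizontal stacked bars and returns a matplotlib Axes
ax = ct.plot.barh(stacked=True)
ax.legend(title="mark", bbox_to_anchor=(1, 1.02), loc="upper left")

# a seaborn plot on its own figure returns the same kind of object
fig2, ax2 = plt.subplots()
sns_ax = sns.countplot(data=df, x="period", ax=ax2)

print(type(ax).__name__, type(sns_ax).__name__)
```

Because both results are plain matplotlib `Axes`, any later customisation (titles, labels, legends) works the same way on either chart.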

Answer 2

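The creator of Seaborn is not fond of stacked bar charts (though the linked post has a hack that uses Seaborn + Matplotlib to make them anyway).
If you are willing to accept a grouped bar chart instead of a stacked one, here are two approaches.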

Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2

# first some sample data
import numpy as np 
import pandas as pd
import seaborn as sns

N = 1000
np.random.seed(365)
mark = np.random.choice([True, False], N)
periods = np.random.choice(['BASELINE', 'WEEK 12', 'WEEK 24', 'WEEK 4'], N)

df = pd.DataFrame({'mark':mark,'period':periods})
ct = pd.crosstab(df.period, df.mark)

mark      False  True
period               
BASELINE    124   126
WEEK 12     102   118
WEEK 24     118   133
WEEK 4      140   139

# now stack and reset
stacked = ct.stack().reset_index().rename(columns={0:'value'})

# plot grouped bar chart
p = sns.barplot(x=stacked.period, y=stacked.value, hue=stacked.mark, order=['BASELINE', 'WEEK 4', 'WEEK 12', 'WEEK 24'])
sns.move_legend(p, bbox_to_anchor=(1, 1.02), loc='upper left')

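The point of using `pandas.crosstab` is to get the counts per group; however, that step can be skipped by passing the original dataframe, `df`, to `seaborn.countplot`, which counts the rows itself: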

ax = sns.countplot(data=df, x='period', hue='mark', order=['BASELINE', 'WEEK 4', 'WEEK 12', 'WEEK 24'])
sns.move_legend(ax, bbox_to_anchor=(1, 1.02), loc='upper left')

for c in ax.containers:
    
    # set the bar label
    ax.bar_label(c, label_type='center')
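
For readers who do want stacked bars drawn with seaborn calls, one commonly used workaround is to layer two `sns.barplot` calls: the first draws the row totals, the second overdraws the `False` counts on top of them. The sketch below is an assumption about how such a hack can look; it is not taken from the post linked above. The `layers` dataframe, its column names, and the colour choices are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import numpy as np
import pandas as pd
import seaborn as sns

np.random.seed(365)
N = 1000
df = pd.DataFrame({
    "mark": np.random.choice([True, False], N),
    "period": np.random.choice(["BASELINE", "WEEK 12", "WEEK 24", "WEEK 4"], N),
})

order = ["BASELINE", "WEEK 4", "WEEK 12", "WEEK 24"]
ct = pd.crosstab(df["period"], df["mark"]).reindex(order)
ct.columns = ct.columns.map(str)  # use 'False' / 'True' string labels

# hypothetical helper frame: one row per period with both layer heights
layers = pd.DataFrame({
    "period": order,
    "total": (ct["False"] + ct["True"]).to_numpy(),
    "false_only": ct["False"].to_numpy(),
})

false_color, true_color = sns.color_palette(n_colors=2)

fig, ax = plt.subplots()
# back layer: full bar height (False + True), shown in the colour for True
sns.barplot(data=layers, x="period", y="total",
            color=true_color, saturation=1, ax=ax)
# front layer: False counts only, covering the lower part of each bar
sns.barplot(data=layers, x="period", y="false_only",
            color=false_color, saturation=1, ax=ax)

ax.set_ylabel("count")
ax.legend(handles=[Patch(color=false_color, label="False"),
                   Patch(color=true_color, label="True")],
          title="mark", bbox_to_anchor=(1, 1.02), loc="upper left")
```

The visible part of each back bar corresponds to the `True` count, so the result reads like a stacked chart. The legend is built by hand because the two layers are separate plots.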

Answer 3

I would suggest using seaborn histplot, especially when your categorical data has more than 2 values.

import numpy as np 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt  # needed for plt.figure below

N = 1000
np.random.seed(365)
mark = np.random.choice([True, False], N)
periods = np.random.choice(['BASELINE', 'WEEK 12', 'WEEK 24', 'WEEK 4'], N)


df = pd.DataFrame({'mark':mark,'period':periods})
group_df = df.groupby(['period', 'mark']).size().reset_index(name='count')

plt.figure(figsize=(10,6))
ax = sns.histplot(group_df, y='period', hue='mark', weights='count',
             multiple='stack', palette='colorblind')

Horizontal histogram
