How to fill intervals under a KDE curve with different colors

Question

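I am looking for a way to color the intervals under a curve with different colors: on the interval x < 0 I would like to fill the area under the curve with one color, and on the interval x >= 0 with another color, as shown in the figure below: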

This is the code for the basic KDE plot:

fig, (ax1) = plt.subplots(1, 1, figsize = ((plot_size + 1.5) * 1,(plot_size + 1.5)))
sns.kdeplot(data=pd.DataFrame(w_contrast, columns=['contrast']), x="contrast", ax=ax1);
ax1.set_xlabel(f"Dry Yield Posterior Contrast (kg)");

Is there a way to fill the area under the curve with different colors using seaborn?

Answer 1

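`seaborn` is a high-level API for `matplotlib` and does not hand back the fitted curve, so the curve has to be computed separately. The approach is similar to, but simpler than, a related answer.

Use `scipy.stats.gaussian_kde` to build the KDE model.
Use `matplotlib.pyplot.fill_between` to fill the areas.
Use `scipy.integrate.simpson` to compute the area under each part of the curve, which is then passed to `matplotlib.pyplot.annotate` for the annotation.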

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from scipy.integrate import simpson
from scipy.stats import gaussian_kde

# load sample data
df = sns.load_dataset('planets')

# create the kde model
kde = gaussian_kde(df.mass.dropna())

# plot
fig, ax = plt.subplots(figsize=(9, 6))
g = sns.kdeplot(data=df.mass, ax=ax, color='k')

# remove margins; optional
g.margins(x=0, y=0)

# get the min and max of the x-axis
xmin, xmax = g.get_xlim()

# create points between the min and max
x = np.linspace(xmin, xmax, 1000)

# calculate the y values from the model
kde_y = kde(x)

# select x values below 0
x0 = x[x < 0]

# get the len, which will be used for slicing the other arrays
x0_len = len(x0)

# slice the arrays
y0 = kde_y[:x0_len]
x1 = x[x0_len:]
y1 = kde_y[x0_len:]

# calculate the area under each part of the curve, as a percentage
# (simpson replaces the older simps alias; x is passed by keyword)
area0 = np.round(simpson(y0, x=x0) * 100, 0)
area1 = np.round(simpson(y1, x=x1) * 100, 0)

# fill the areas
g.fill_between(x=x0, y1=y0, color='r', alpha=.5)
g.fill_between(x=x1, y1=y1, color='b', alpha=.5)

# annotate
g.annotate(f'{area0:.0f}%', xy=(-1, 0.075), xytext=(10, 0.150), arrowprops=dict(arrowstyle="->", color='r', alpha=.5))
g.annotate(f'{area1:.0f}%', xy=(1, 0.05), xytext=(10, 0.125), arrowprops=dict(arrowstyle="->", color='b', alpha=.5))
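
The steps above can also be wrapped in a small reusable function. The following is only a sketch under a few assumptions: `fill_kde_split` and its parameters are hypothetical names that do not come from the answer, the evaluation grid is padded by three kernel bandwidths rather than taken from the seaborn axis limits, and the areas are obtained from `gaussian_kde.integrate_box_1d` instead of Simpson's rule, so the two percentages add up to 100% without depending on the grid.

```python
import matplotlib
matplotlib.use("Agg")  # optional: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde


def fill_kde_split(ax, values, threshold=0.0, colors=("r", "b"), n_points=1000):
    """Hypothetical helper: draw a KDE and shade each side of `threshold`."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]  # gaussian_kde does not accept NaN
    kde = gaussian_kde(values)

    # assumption: pad the grid by three kernel bandwidths on each side
    pad = 3 * np.sqrt(kde.covariance[0, 0])
    x = np.linspace(values.min() - pad, values.max() + pad, n_points)
    # add the threshold itself so both shaded regions share the boundary point
    x = np.sort(np.append(x, threshold))
    y = kde(x)

    ax.plot(x, y, color="k")
    ax.fill_between(x, y, where=x <= threshold, color=colors[0], alpha=0.5)
    ax.fill_between(x, y, where=x >= threshold, color=colors[1], alpha=0.5)

    # probability mass on each side of the threshold
    below = kde.integrate_box_1d(-np.inf, threshold)
    above = kde.integrate_box_1d(threshold, np.inf)
    return below, above


# usage with synthetic data (not the planets dataset)
rng = np.random.default_rng(0)
fig, ax = plt.subplots(figsize=(9, 6))
below, above = fill_kde_split(ax, rng.normal(size=500))
ax.set_title(f"{below:.0%} below 0, {above:.0%} at or above 0")
```

Passing `where=` to `fill_between` avoids slicing the arrays by hand, and appending the threshold to the grid keeps a thin unshaded sliver from appearing between the two colors.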

Answer 2

Most answers suggest using seaborn, but this can be done easily with only pandas and matplotlib.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Create a dataframe with random data
df = pd.DataFrame({
    'x': np.random.randn(1000),
})

# Plot the KDE outline of the data
df['x'].plot.kde()

# Retrieve the data plotted in the last line
kde_data = plt.gca().get_lines()[-1].get_data()  

# Fill the area below the KDE curve using fill_between
plt.fill_between(kde_data[0], 0, kde_data[1], color='red', alpha=0.1)

For your use case, you can use two separate fill_between calls, each on a subset of the curve data.

# Create a dataframe with random data
df = pd.DataFrame({
    'x': np.random.randn(1000),
})

# Plot the KDE outline of the data
df['x'].plot.kde()

# Retrieve the data plotted in the last line
kde_data = plt.gca().get_lines()[-1].get_data()  

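# Fill the area below the KDE curve on each side of x = 0.
# A boolean mask is used instead of a fixed index, because the
# midpoint of the plotted grid is not guaranteed to fall at x = 0.
x, y = kde_data
plt.fill_between(x[x < 0], 0, y[x < 0], color='red', alpha=0.1)
plt.fill_between(x[x >= 0], 0, y[x >= 0], color='blue', alpha=0.1)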

This produces the following plot.
