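I'm looking for a way to color the intervals under a curve with different colors: on the interval x < 0 I would like to fill the area under the curve with one color, and on the interval x >= 0 with another, as shown in the following image:
Here is the code for the basic kde plot: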
fig, (ax1) = plt.subplots(1, 1, figsize = ((plot_size + 1.5) * 1,(plot_size + 1.5)))
sns.kdeplot(data=pd.DataFrame(w_contrast, columns=['contrast']), x="contrast", ax=ax1);
ax1.set_xlabel(f"Dry Yield Posterior Contrast (kg)");
Is there a way to fill the area under the curve with different colors using seaborn?
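seaborn is a high-level API for matplotlib, so the curve has to be computed separately; the approach is similar to, but simpler than, the one in another answer.
Use scipy.stats.gaussian_kde to model the curve, matplotlib.pyplot.fill_between to fill the regions, and scipy.integrate.simpson to calculate the area under the curve, which is then passed to matplotlib.pyplot.annotate as the annotation text.
import seaborn as sns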
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
from scipy.integrate import simpson  # simps is the old alias and was removed in SciPy 1.14
import numpy as np
# load sample data
df = sns.load_dataset('planets')
# create the kde model
kde = gaussian_kde(df.mass.dropna())
# plot
fig, ax = plt.subplots(figsize=(9, 6))
g = sns.kdeplot(data=df.mass, ax=ax, color='k')
# remove margins; optional
g.margins(x=0, y=0)
# get the min and max of the x-axis
xmin, xmax = g.get_xlim()
# create points between the min and max
x = np.linspace(xmin, xmax, 1000)
# calculate the y values from the model
kde_y = kde(x)
# select x values below 0
x0 = x[x < 0]
# get the len, which will be used for slicing the other arrays
x0_len = len(x0)
# slice the arrays
y0 = kde_y[:x0_len]
x1 = x[x0_len:]
y1 = kde_y[x0_len:]
# calculate the area under each part of the curve, as a percentage
area0 = np.round(simpson(y0, x=x0) * 100, 0)
area1 = np.round(simpson(y1, x=x1) * 100, 0)
# fill the areas
g.fill_between(x=x0, y1=y0, color='r', alpha=.5)
g.fill_between(x=x1, y1=y1, color='b', alpha=.5)
# annotate
g.annotate(f'{area0:.0f}%', xy=(-1, 0.075), xytext=(10, 0.150), arrowprops=dict(arrowstyle="->", color='r', alpha=.5))
g.annotate(f'{area1:.0f}%', xy=(1, 0.05), xytext=(10, 0.125), arrowprops=dict(arrowstyle="->", color='b', alpha=.5))
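As a cross-check on the numerically integrated percentages, the fitted model can also report the probability mass on each side of the split directly. The following is only a minimal sketch: the helper name `kde_mass_split` and the random sample data are assumptions made for illustration, while `integrate_box_1d` is the `gaussian_kde` method that integrates the estimated density between two bounds.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_mass_split(samples, threshold=0.0):
    """Return the KDE probability mass below and at-or-above `threshold`.

    Hypothetical helper, not part of seaborn or SciPy.
    """
    kde = gaussian_kde(np.asarray(samples, dtype=float))
    # integrate the estimated density over each side of the threshold
    below = kde.integrate_box_1d(-np.inf, threshold)
    above = kde.integrate_box_1d(threshold, np.inf)
    return below, above


# placeholder data standing in for the real sample
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.5, scale=1.0, size=500)

below, above = kde_mass_split(samples)
print(f"x < 0: {below:.1%}, x >= 0: {above:.1%}")
```

Because this integrates over the whole real line rather than the plotted x-range, the two values sum to 1, whereas the grid-based percentages can fall slightly short if the axis limits clip the tails.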
Most answers suggest using seaborn, but this can easily be done with just pandas and matplotlib.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Create a dataframe with random data
df = pd.DataFrame({
'x': np.random.randn(1000),
})
# Plot the KDE outline of the data
df['x'].plot.kde()
# Retrieve the data plotted in the last line
kde_data = plt.gca().get_lines()[-1].get_data()
# Fill the area below the KDE curve using fill_between
plt.fill_between(kde_data[0], 0, kde_data[1], color='red', alpha=0.1)
For your use case, you can use two separate fill_between calls, each on a subset of the data.
# Create a dataframe with random data
df = pd.DataFrame({
'x': np.random.randn(1000),
})
# Plot the KDE outline of the data
df['x'].plot.kde()
# Retrieve the data plotted in the last line
kde_data = plt.gca().get_lines()[-1].get_data()
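# Fill the area below the KDE curve, split at x = 0 with boolean masks
# (slicing at index 500 only splits the plotted range in half, which need not be x = 0)
kde_x, kde_y = kde_data
plt.fill_between(kde_x[kde_x < 0], 0, kde_y[kde_x < 0], color='red', alpha=0.1)
plt.fill_between(kde_x[kde_x >= 0], 0, kde_y[kde_x >= 0], color='blue', alpha=0.1)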
This gives the following plot.
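A variation on the same idea is to keep a single x/y grid and let fill_between select each region through its `where` parameter, instead of slicing the arrays by hand. This is a rough sketch under assumptions: the data is random placeholder data, and the curve is evaluated with scipy.stats.gaussian_kde rather than read back from the plotted line.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# placeholder data standing in for the real sample
rng = np.random.default_rng(1)
samples = rng.normal(size=1000)

# evaluate the KDE on a fine grid that extends a little past the data
kde = gaussian_kde(samples)
x = np.linspace(samples.min() - 1, samples.max() + 1, 1001)
y = kde(x)

fig, ax = plt.subplots()
ax.plot(x, y, color="k")
# one call per region; `where` takes a boolean mask over the shared grid
ax.fill_between(x, y, where=(x < 0), color="red", alpha=0.3)
ax.fill_between(x, y, where=(x >= 0), color="blue", alpha=0.3)
# call plt.show() to display the figure
```

The two filled regions do not share a grid point, so a gap of one grid step is left at x = 0; with about a thousand points it is too narrow to notice.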