My onclick (pick) handler works fine with a Seaborn scatterplot, but when I use stripplot and click on any point, it returns the wrong data. Does stripplot support onclick? Here is my code snippet:
def relation_graph(x_series, y_series, hue_series, sheetname):
    df = open_file_read_to_df(filename, sheetname)

    def onpick(event):
        ind = event.ind
        print('7-digit DN:', df.iloc[ind].index.values)

    plots = sns.scatterplot(y=y_series, x=x_series, hue=hue_series, data=df, picker=4)
Simplified version:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd


def relation_graph(df, x_series, y_series, hue_series):
    def onpick(event):
        ind = event.ind
        print('Found:', df.iloc[ind].index.values)

    plots = sns.stripplot(y=y_series, x=x_series, hue=hue_series, data=df, picker=4)
    plots.figure.canvas.mpl_connect("pick_event", onpick)
    plt.show()


data = [["test1", 1, 2, "red"], ["test2", 1, 2, "blue"], ["test3", 2, 5, "red"], ["test4", 3, 2, "blue"]]
df = pd.DataFrame(data, columns=['name', 'X', 'Y', "Color"])
df.set_index("name", inplace=True)
relation_graph(df, df['X'], df['Y'], df['Color'])
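As JohanC commented, stripplot reorders the data and splits it into PathCollection objects. A seaborn strip plot (and similarly a swarm plot) is actually built from several PathCollection objects, one per split of the data (that is, one for each category on the x-axis combined with its hue). In addition, event.artist arrives in onpick as a PathCollection, and event.ind only identifies points within that particular PathCollection; this is the main reason you get the wrong part of the dataframe. So instead of using event.ind directly, do this: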
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

data = [
    ["test1", 1.0, "Cat", "male"],
    ["test2", 1.5, "Cat", "male"],
    ["test3", 1.2, "Cat", "female"],
    ["test4", 1.1, "Cat", "female"],
    ["test5", 3.5, "Dog", "female"],
    ["test6", 2.2, "Dog", "male"],
    ["test7", 2.1, "Dog", "male"],
    ["test8", 2.3, "Dog", "female"],
    ["test9", 2.0, "Dog", "male"],
    ["test10", 2.15, "Dog", "female"],
    ["test11", 0.5, "Duck", "female"],
    ["test12", 0.4, "Duck", "male"],
    ["test13", 0.6, "Duck", "female"],
]
df = pd.DataFrame(data, columns=["name", "Size", "Animal", "Label"])
df.set_index("name", inplace=True)


def onpick(event):
    # event.ind is relative to the picked PathCollection, so first narrow df
    # to the rows that collection represents, then index into that subset.
    # A single combined mask avoids pandas' reindexing warning for chained
    # boolean indexing.
    mask = (df["Label"] == event.artist.label) & (df["Animal"] == event.artist.animal)
    name = df[mask].iloc[event.ind]
    print("Found:", name)


axes = sns.stripplot(x="Animal", y="Size", hue="Label", data=df, picker=4, dodge=True)
groups = df["Label"].unique()
splits = df["Animal"].unique()
print(axes.collections)

# Tag each PathCollection with the (animal, label) pair it was drawn for.
# The collections are ordered by x category first, then by hue.
group_len = len(groups)
for idx, artist in enumerate(axes.collections):
    artist.animal = splits[idx // group_len]
    artist.label = groups[idx % group_len]
    print(artist.animal, artist.label)

axes.figure.canvas.mpl_connect("pick_event", onpick)
plt.show()
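In this example there are 6 PathCollection objects: Cat-male, Cat-female, Dog-male, Dog-female, Duck-male, and Duck-female. We can then attach two attributes to each PathCollection, one for the animal and one for the label, so that onpick can filter the data and use event.ind appropriately.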
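The lookup itself can be checked without opening a plot window. The following is only a minimal sketch: `rows_for_pick` is a hypothetical helper name (not part of seaborn or matplotlib), and it assumes, as the answer above does, that the points inside each PathCollection keep the row order of the dataframe. It contrasts indexing the whole dataframe with indexing the filtered subset for a pick of point 1 in the Dog-male collection.

```python
import pandas as pd

data = [
    ["test1", 1.0, "Cat", "male"],
    ["test2", 1.5, "Cat", "male"],
    ["test3", 1.2, "Cat", "female"],
    ["test4", 1.1, "Cat", "female"],
    ["test5", 3.5, "Dog", "female"],
    ["test6", 2.2, "Dog", "male"],
    ["test7", 2.1, "Dog", "male"],
    ["test8", 2.3, "Dog", "female"],
    ["test9", 2.0, "Dog", "male"],
    ["test10", 2.15, "Dog", "female"],
    ["test11", 0.5, "Duck", "female"],
    ["test12", 0.4, "Duck", "male"],
    ["test13", 0.6, "Duck", "female"],
]
df = pd.DataFrame(data, columns=["name", "Size", "Animal", "Label"]).set_index("name")


def rows_for_pick(frame, animal, label, ind):
    """Hypothetical helper: map collection-local indices back to rows of `frame`.

    Assumes the points of one collection appear in dataframe row order.
    """
    mask = (frame["Animal"] == animal) & (frame["Label"] == label)
    return frame[mask].iloc[list(ind)]


# Simulate event.ind == [1] for a click inside the Dog-male collection.
ind = [1]
wrong = list(df.iloc[ind].index)                           # indexes the whole frame
right = list(rows_for_pick(df, "Dog", "male", ind).index)  # indexes the subset
print("direct iloc:", wrong)
print("filtered   :", right)
```

Here the direct lookup lands on a Cat row, while the filtered lookup returns the second Dog-male row, which is the mismatch described in the question.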