使用matplotlib绘制Char数据的Pandas DataFrame

Question

数据集中的数据纯粹由字符组成。例如：

p,x,s,n,t,p,f,c,n,k,e,e,s,s,w,w,p,w,o,p,k,s,u
e,x,s,y,t,a,f,c,b,k,e,c,s,s,w,w,p,w,o,p,n,n,g
e,b,s,w,t,l,f,c,b,n,e,c,s,s,w,w,p,w,o,p,n,n,m
p,x,y,w,t,p,f,c,n,n,e,e,s,s,w,w,p,w,o,p,k,s,u
e,x,s,g,f,n,f,w,b,k,t,e,s,s,w,w,p,w,o,e,n,a,g

可以在agaricus-lepiota.data in the uci machine learning datasets mushroom dataset中找到完整的数据副本

是否有通过matplotlib使用char数据（而不是必须将数据集转换为数字）的可视化方法？

只是为了任何形式的可视化，即：

filename = 'mushrooms.csv'
df_mushrooms = pd.read_csv(filename, names = ["Classes", "Cap-Shape", "Cap-Surface", "Cap-Colour", "Bruises", "Odor", "Gill-Attachment", "Gill-Spacing", "Gill-Size", "Gill-Colour", "Stalk-Shape", "Stalk-Root", "Stalk-Surface-Above-Ring", "Stalk-Surface-Below-Ring", "Stalk-Colour-Above-Ring", "Stalk-Colour-Below-Ring", "Veil-Type", "Veil-Colour", "Ring-Number", "Ring-Type", "Spore-Print-Colour", "Population", "Habitat"])


#If there are any entires (rows) with any missing values/NaN's drop the row.
df_mushrooms.dropna(axis = 0, how = 'any', inplace = True)

df_mushrooms.plot.scatter(x = 'Classes', y = 'Cap-Shape')

Answer 1

可以这样做，但是从图形的角度来看，这种方法并没有任何意义。如果你按照你的要求去做它会是这样的：

我知道我不应该告诉某人如何展示他们的图表，但这并没有向我传达任何信息。问题是使用Classes和Cap-Shape字段为你的x和y索引将始终在同一个地方放置相同的字母。没有变化。也许还有一些其他字段可以用作索引，然后使用Cap-Shape作为标记，但因为它不会添加任何值。这对我个人而言。

要使用字符串作为标记，您可以使用matplotlib.markers中描述的“$ ... $”标记，但我必须再次提供警告，这样的图形比传统方法慢得多，因为您必须遍历您的行数据帧。

fig, ax = plt.subplots()
# Classes only has 'p' and 'e' as unique values so we will map them as 1 and 2 on the index
df['Class_Id'] = df.Classes.map(lambda x: 1 if x == 'p' else 2)
df['Cap_Val'] = df['Cap-Shape'].map(lambda x: ord(x) - 96)
for idx, row in df.iterrows():
    ax.scatter(x=row.Class_Id, y=row.Cap_Val, marker=r"$ {} $".format(row['Cap-Shape']), c=plt.cm.nipy_spectral(row.Cap_Val / 26))
ax.set_xticks([0,1,2,3])
ax.set_xticklabels(['', 'p', 'e', ''])
ax.set_yticklabels(['', 'e', 'j', 'o', 't', 'y'])
fig.show()

使用matplotlib绘制Char数据的Pandas DataFrame

问题描述投票：1回答：1

1个回答

最新问题

使用matplotlib绘制Char数据的Pandas DataFrame

问题描述 投票：1回答：1

1个回答

最新问题

问题描述投票：1回答：1