I have the following DataFrame:
            Date rank_. category genre breakdown  count
0     2022-01-01          top 50    dutch hiphop      7
1     2022-01-01          top 50       dutch pop     12
2     2022-01-01          top 50           other     31
3     2022-01-01          top 50     other dutch      0
4     2022-01-01         top 100    dutch hiphop      2
...          ...             ...             ...    ...
5515  2023-04-05         top 100     other dutch      0
5516  2023-04-05         top 200    dutch hiphop     11
5517  2023-04-05         top 200       dutch pop     16
5518  2023-04-05         top 200           other     74
5519  2023-04-05         top 200     other dutch      1
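If I create a stacked bar chart as shown below, it works fine. But if I change the DataFrame so that the dates use a week format (like 37-2022) and try to plot that on the x-axis, I get an empty plot.
I think it has something to do with date format versus string, but I'm stuck.
Any help would be greatly appreciated!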
counts = df.groupby(['Date', 'genre breakdown'])['track_isrc'].nunique()
# create a new dataframe with the counts
counts_df = counts.reset_index(name='count')
# Calculate percentage of total views for each genre breakdown within each date
counts_df['percent'] = counts_df.groupby('Date')['count'].apply(lambda x: x / x.sum() * 100)
# Define the category order for the 'genre breakdown' variable
category_order = ['Dutch Pop', 'Dutch HipHop']
# Create a stacked bar graph with percentages
fig = px.bar(counts_df, x='Date', y='percent', color='genre breakdown', barmode='stack',
             category_orders={"genre breakdown": category_order},
             color_discrete_sequence=px.colors.qualitative.Pastel)
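Because a date format like {week number}-{year} is not standard, plotly may not interpret it correctly if you convert the Date column of your DataFrame to that format.
Instead, keep Date as a datetime column and specify tickformat directly on the figure, using %W (Monday-based week of the year) and %Y (four-digit year):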
fig.update_xaxes(rangeslider_visible=True, tickformat='%W-%Y')
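If the Date column has already been turned into week strings, one option is to parse them back into real datetimes before plotting. The snippet below is only a minimal sketch: the `weeks` series is made-up data, and it assumes the labels follow the same Monday-based %W convention, so appending "-1" (Monday) pins each label to one concrete day.

```python
import pandas as pd

# Hypothetical week labels in the "{week number}-{year}" form from the question
weeks = pd.Series(["37-2022", "38-2022", "01-2023"])

# A week number alone does not identify a day, so append "-1" (Monday).
# %W = week of year with Monday as first day, %w = weekday (1 is Monday)
dates = pd.to_datetime(weeks + "-1", format="%W-%Y-%w")

# Formatting the parsed dates with the same directives gives the labels back
labels = dates.dt.strftime("%W-%Y")

print(dates.tolist())
print(labels.tolist())
```

With `dates` as the x column, plotly sees a proper date axis again, and the tickformat above only controls how the ticks are displayed.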
Here is an example using a counts_df similar to your DataFrame:
import numpy as np
import pandas as pd
import plotly.express as px
## create sample counts df similar to yours
np.random.seed(42)
n_days = len(pd.date_range('2022-01-01','2023-04-05'))
counts_df = pd.DataFrame({
    'Date': list(pd.date_range('2022-01-01','2023-04-05'))*4,
    'genre breakdown': np.repeat(['dutch hiphop','dutch pop','other','other dutch'], n_days),
    'count': np.concatenate([np.random.uniform(10,20,n_days), np.random.uniform(10,20,n_days),
                             np.random.uniform(40,50,n_days), np.random.uniform(1,5,n_days)])
})
# Calculate percentage of total views for each genre breakdown within each date
counts_df['percent'] = counts_df.groupby('Date', group_keys=False)['count'].apply(lambda x: x / x.sum() * 100)
# Define the category order for the 'genre breakdown' variable
# (the labels must match the values in the column exactly, including case)
category_order = ['dutch pop', 'dutch hiphop']
# Create a stacked bar graph with percentages
fig = px.bar(counts_df, x='Date', y='percent', color='genre breakdown', barmode='stack',
             category_orders={"genre breakdown": category_order},
             color_discrete_sequence=px.colors.qualitative.Pastel)
fig.update_xaxes(rangeslider_visible=True, tickformat='%W-%Y')
fig.update_yaxes(range=[0,100])
fig.show()
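Note that the example above still draws one bar per day and only changes the tick labels. If you would rather have one bar per week, a possible approach is to aggregate to weekly totals while keeping a datetime column for the x-axis. This is a rough sketch under my own assumptions: the `daily` frame is made-up data, `week_start` is a column name I introduced, and weeks are taken to run Monday to Sunday (the pandas default for the "W" period).

```python
import numpy as np
import pandas as pd

# Hypothetical daily data shaped like counts_df (4 full Monday-to-Sunday weeks)
rng = np.random.default_rng(0)
days = pd.date_range("2022-01-03", periods=28, freq="D")
daily = pd.DataFrame({
    "Date": list(days) * 2,
    "genre breakdown": np.repeat(["dutch pop", "other"], len(days)),
    "count": rng.integers(1, 20, size=2 * len(days)),
})

# Map every date to the Monday that starts its week (still a datetime)
daily["week_start"] = daily["Date"].dt.to_period("W").dt.start_time

# Sum the counts per week and genre
weekly = daily.groupby(["week_start", "genre breakdown"], as_index=False)["count"].sum()

# Percentage of each genre within its week
week_totals = weekly.groupby("week_start")["count"].transform("sum")
weekly["percent"] = weekly["count"] / week_totals * 100

print(weekly)
```

You can then pass `weekly` to px.bar with x='week_start' and keep tickformat='%W-%Y' on the x-axis, so each weekly bar is labelled with its week number.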