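I'm trying to plot the sales trend over the past few weeks, but the ticks on the x-axis are not in the correct order, which makes my chart look strange.
As you can see, the circled week should be at the start of the axis. Why does this happen even after sorting the dates in the dataframe?
Pandas code: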
basic_df = filterDataFrameByDate(df,start_date,end_date)
df = basic_df.groupby(['S2PName',basic_df['S2BillDate'].dt.to_period(flag)], sort=False)['S2PGTotal'].agg([('totSale','sum'),('count','size')]).reset_index()
df.sort_values('S2BillDate',inplace=True)
df['S2BillDate'] = df['S2BillDate'].astype('str')
Another thing I noticed is that when I deselect "Samosa" in the legend, the ticks are ordered correctly.
Can anyone help me with this?
Pandas code and data:
Code:
print(df['S2BillDate'].unique())
Output:
<PeriodArray>
['2020-02-03/2020-02-09', '2020-02-10/2020-02-16', '2020-02-17/2020-02-23',
'2020-02-24/2020-03-01']
Length: 4, dtype: period[W-SUN]
Code:
df = basic_df.groupby(['S2PName',basic_df['S2BillDate'].dt.to_period(flag)], sort=False)['S2PGTotal'].agg([('totSale','sum'),('count','size')]).reset_index()
Output:
[537 rows x 4 columns]
S2PName S2BillDate totSale count
0 SAMOSA 2020-02-10/2020-02-16 4057.89 228
1 COFFEE 2020-02-10/2020-02-16 10567.21 582
2 TEA 2020-02-10/2020-02-16 6808.92 445
3 POORI 2020-02-10/2020-02-16 7556.77 179
4 PONGAL 2020-02-10/2020-02-16 4758.97 122
.. ... ... ... ...
411 PEPPER CHICKEN 2020-02-24/2020-03-01 90.00 1
412 SEZWAN CHICKEN FRIED NOODLES 2020-02-24/2020-03-01 199.50 2
413 SEZWAN VEG FRIED RICE 2020-02-24/2020-03-01 69.83 1
414 SEZWAN EGG FRIED RICE 2020-02-24/2020-03-01 89.78 1
415 EGG MASALA 2020-02-24/2020-03-01 50.04 1
I can reproduce your problem. I'm using `plotly.express`, but it is the same with `plotly.graph_objs`.
import pandas as pd
import plotly.express as px
df = pd.DataFrame({"SPName":["SAMOSA"]*3+ ["COFFEE"]*4,
"S2BillDate":["2020-02-10/2020-02-16",
"2020-02-17/2020-02-23",
"2020-02-24/2020-03-01",
"2020-02-24/2020-03-01",
"2020-02-17/2020-02-23",
"2020-02-10/2020-02-16",
"2020-02-03/2020-02-09"],
"totSale":[4000, 4500, 5000, 10_000, 12_000, 10_000, 2000]})
This produces
fig = px.line(df, x="S2BillDate", y="totSale", color="SPName")
fig.update_traces(mode='markers+lines')
fig.show()
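The problem here is the order in which the dates appear in the data. Notice that the first point of `COFFEE` is `2020-02-24/2020-03-01`, the second is `2020-02-17/2020-02-23`, and so on.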
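
To make the ordering effect concrete, here is a minimal sketch in plain Python. It assumes, as I understand Plotly's default `categoryorder="trace"` behaviour, that a string (category) axis lists its values in the order they are first encountered while going through the traces one after another. The helper `first_seen_order` is a hypothetical name used only for this illustration, not part of Plotly.

```python
# Rows in the same order as the example frame: (product, week, sales).
rows = [
    ("SAMOSA", "2020-02-10/2020-02-16", 4000),
    ("SAMOSA", "2020-02-17/2020-02-23", 4500),
    ("SAMOSA", "2020-02-24/2020-03-01", 5000),
    ("COFFEE", "2020-02-24/2020-03-01", 10_000),
    ("COFFEE", "2020-02-17/2020-02-23", 12_000),
    ("COFFEE", "2020-02-10/2020-02-16", 10_000),
    ("COFFEE", "2020-02-03/2020-02-09", 2000),
]


def first_seen_order(values):
    """Return the distinct values in the order they are first encountered."""
    return list(dict.fromkeys(values))


# One trace per product, in order of first appearance (hypothetical model
# of how the category axis is filled, not Plotly's actual code).
products = first_seen_order(product for product, _, _ in rows)
weeks_by_trace = [
    week
    for product in products
    for row_product, week, _ in rows
    if row_product == product
]

axis_order = first_seen_order(weeks_by_trace)
print(axis_order)
```

Under that assumption the earliest week, `2020-02-03/2020-02-09`, only shows up in the second trace, so it ends up last on the axis instead of first.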
A quick fix would be
df1 = df.sort_values("S2BillDate").reset_index(drop=True)
fig = px.line(df1, x="S2BillDate", y="totSale", color="SPName")
fig.update_traces(mode='markers+lines')
Personally, I prefer to use dates rather than strings on the x-axis:
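```python
# Use the end date of each weekly period as a real datetime.
# pd.to_datetime is used because a unit-less astype("M8") is rejected by recent pandas versions.
df["Date"] = pd.to_datetime(df["S2BillDate"].str.split("/").str[1])
fig = px.line(df, x="Date", y="totSale", color="SPName")
fig.update_traces(mode='markers+lines')
```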
In this case, however, showing the tick text in the format you asked for still requires sorting `df`, plus a little more code:
```python
df = df.sort_values(["Date"]).reset_index(drop=True)
fig = px.line(df, x="Date", y="totSale", color="SPName")
fig.update_traces(mode='markers+lines')
# Treat the dates as categories and label each one with its week string
fig.update_layout(
    xaxis=dict(
        type="category",
        tickmode='array',
        tickvals=df["Date"].tolist(),
        ticktext=df["S2BillDate"].tolist()
    )
)
fig.show()
```