This is the dataset I am working with:
import pandas as pd

data = [['2608 W SYLVESTER ST', 'PASCO', 'WA', 4304],
        ['61 W MESQUITE BLVD', 'MESQUITE', 'NV', 115000],
        ['287 NW 3RD AVE', 'ESTACADA', 'OR', 1000],
        ['287 NW 3RD AVE', 'ESTACADA', 'OR', 2000],
        ['287 NW 3RD AVE', 'ESTACADA', 'OR', 7000]]
df = pd.DataFrame(data, columns=['site_address', 'site_city', 'site_state', 'price'])
The DataFrame looks like this:
          site_address  site_city  site_state   price
0  2608 W SYLVESTER ST      PASCO          WA    4304
1   61 W MESQUITE BLVD   MESQUITE          NV  115000
2       287 NW 3RD AVE   ESTACADA          OR    1000
3       287 NW 3RD AVE   ESTACADA          OR    2000
4       287 NW 3RD AVE   ESTACADA          OR    7000
I need to produce the following JSON structure:
{
    "sites": [
        {
            "location": "2608 W SYLVESTER ST, PASCO WA",
            "value": 4304
        },
        {
            "location": "61 W MESQUITE BLVD, MESQUITE NV",
            "value": 115000
        },
        {
            "location": "287 NW 3RD AVE, ESTACADA OR",
            "value": 10000
        }
    ]
}
I tried using pandas with groupby and agg:
df_grp = df.groupby('site_address', as_index=False).agg(**{
'location': ** NEED HELP HERE **,
'value': ('price', 'sum')
}).get(['location', 'value']).reset_index(drop=True)
result = json.loads(df_grp.to_json(orient='records'))
print(result)
Try:
df = df.groupby(["site_address", "site_city", "site_state"]).agg("sum").reset_index()
print(
{
"sites": df.apply(
lambda x: {
"location": f"{x['site_address']} {x['site_city']} {x['site_state']}",
"value": x["price"],
},
axis=1,
).to_list()
}
)
Prints:
{
"sites": [
{"location": "2608 W SYLVESTER ST PASCO WA", "value": 4304},
{"location": "287 NW 3RD AVE ESTACADA OR", "value": 10000},
{"location": "61 W MESQUITE BLVD MESQUITE NV", "value": 115000},
]
}
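Note that the output above differs from the requested structure in two small ways: groupby sorts the group keys, so the rows come out in a different order, and the location string has no comma after the street address. If the exact format from the question matters, one possible variant is sketched below. It assumes the comma and the first-appearance order are actually wanted; it builds the location column before grouping, passes sort=False to groupby, and uses named aggregation for the sum. The round trip through to_json is just one way to end up with plain Python ints.

```python
import json

import pandas as pd

data = [['2608 W SYLVESTER ST', 'PASCO', 'WA', 4304],
        ['61 W MESQUITE BLVD', 'MESQUITE', 'NV', 115000],
        ['287 NW 3RD AVE', 'ESTACADA', 'OR', 1000],
        ['287 NW 3RD AVE', 'ESTACADA', 'OR', 2000],
        ['287 NW 3RD AVE', 'ESTACADA', 'OR', 7000]]
df = pd.DataFrame(data, columns=['site_address', 'site_city', 'site_state', 'price'])

# Build the label first, e.g. "287 NW 3RD AVE, ESTACADA OR".
# The comma placement is an assumption taken from the question's expected output.
df['location'] = df['site_address'] + ', ' + df['site_city'] + ' ' + df['site_state']

# sort=False keeps the groups in order of first appearance instead of sorting keys.
grouped = (
    df.groupby('location', sort=False, as_index=False)
      .agg(value=('price', 'sum'))
)

# Round-trip through JSON so the values are plain Python types.
result = {'sites': json.loads(grouped.to_json(orient='records'))}
print(json.dumps(result, indent=4))
```

Grouping on the combined location string treats two rows as the same site only when address, city and state all match, which is the same condition as grouping on the three columns separately.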