I have a DataFrame called df:
Category  block  array_size  num_node  num_task     time
DATA          2         100         1         1    0.104
DATA          2         100         1         2    0.348
DATA          2         100         1         1    2.837
DATA          2        1000         1         1   29.188
DATA          2        1000         1         1  284.087
With this DataFrame, I want to find the mean of time for each configuration.
So the variables I want are named df_foo_{block}_{array_size}_{num_task}, for example:
df_foo_2_100_1 = df.loc[
(df["num_task"] == 1) &
(df["block"] == 2) &
(df["array_size"] == 100)]["time"].mean()
df_foo_2_1000_1 = df.loc[
(df["num_task"] == 1) &
(df["block"] == 2) &
(df["array_size"] == 1000)]["time"].mean()
How can I create these variables automatically with a loop?
Thanks!
You can use groupby:
df.loc[(df["num_task"] == 1) & (df["block"] == 2)].groupby('array_size').time.mean()
Out[206]:
array_size
100       1.4705
1000    156.6375
Name: time, dtype: float64
It looks like what you actually need is to group by all three columns:
df.groupby(['num_task','block','array_size']).time.mean()
Out[208]:
num_task  block  array_size
1         2      100             1.4705
                 1000          156.6375
2         2      100             0.3480
Name: time, dtype: float64
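
If you still want to look results up by a name such as df_foo_2_100_1, one option is to keep them in a dictionary rather than creating separate variables in a loop. The following is only a sketch under some assumptions: the DataFrame is rebuilt by hand from the sample rows in the question, and the names means and df_foo are made up for illustration. The key format follows the df_foo_{block}_{array_size}_{num_task} pattern from the question.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame(
    {
        "Category": ["DATA"] * 5,
        "block": [2, 2, 2, 2, 2],
        "array_size": [100, 100, 100, 1000, 1000],
        "num_node": [1, 1, 1, 1, 1],
        "num_task": [1, 2, 1, 1, 1],
        "time": [0.104, 0.348, 2.837, 29.188, 284.087],
    }
)

# One mean per (block, array_size, num_task) configuration.
means = df.groupby(["block", "array_size", "num_task"])["time"].mean()

# Hypothetical container: maps each configuration name to its mean time.
df_foo = {
    f"df_foo_{block}_{array_size}_{num_task}": value
    for (block, array_size, num_task), value in means.items()
}

# Look up one configuration by name.
print(df_foo["df_foo_2_100_1"])
```

A dictionary keeps all configurations in one place and works for any number of groups. You can also skip the names entirely and index the grouped result directly, for example means.loc[(2, 100, 1)].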