Is there a preferred way to loop over Polars columns and apply a function to each?
Here is a pandas example of what I'm trying to do:
import numpy as np
import polars as pl

df1 = pl.DataFrame(
    {
        "A": np.random.rand(10),
        "B": np.random.rand(10),
        "C": np.random.rand(10),
    }
)
df2 = pl.DataFrame(
    {
        "X1": np.random.rand(10),
        "X2": np.random.rand(10),
        "X3": np.random.rand(10),
    }
)
# pandas code
# this is just a weighted sum of df2, where the weights are from df1
df1.to_pandas().apply(
    lambda weights: df2.to_pandas().mul(weights, axis=0).sum() / weights.sum(),
    axis=0,
    result_type='expand'
)
           A         B         C
X1  0.647355  0.705358  0.692214
X2  0.500439  0.416325  0.384294
X3  0.601890  0.606301  0.577076
It looks like you are doing the equivalent of:
# df1 and df2 here are the pandas versions, e.g. df1.to_pandas()
pd.concat(
    (df1.mul(df2[col], axis=0).sum() / df1.sum() for col in df2.columns),
    axis=1
)
In Polars it can be written in almost the same way:
pl.concat((df1 * col).sum() / df1.sum() for col in df2)
shape: (3, 3)
┌──────────┬──────────┬──────────┐
│ A        ┆ B        ┆ C        │
│ ---      ┆ ---      ┆ ---      │
│ f64      ┆ f64      ┆ f64      │
╞══════════╪══════════╪══════════╡
│ 0.363931 ┆ 0.44298  ┆ 0.431432 │
│ 0.54025  ┆ 0.42028  ┆ 0.418826 │
│ 0.506882 ┆ 0.576332 ┆ 0.61857  │
└──────────┴──────────┴──────────┘
└──────────┴──────────┴──────────┘