What is the best way to filter groups by conditionally checking the values of only the first row of each group?


This is my DataFrame:

import pandas as pd
df = pd.DataFrame(
    {
        'group': list('xxxxyyy'),
        'open': [100, 150, 200, 160, 300, 150, 170],
        'close': [105, 150, 200, 160, 350, 150, 170],
        'stop': [104, 104, 104, 104, 400, 400, 400]
    }
)

The expected output returns the qualifying groups, based on the group column:

  group  open  close  stop
0     x   100    105   104
1     x   150    150   104
2     x   200    200   104
3     x   160    160   104

Logic:

For each group, I want to check whether df.stop.iloc[0] lies between df.open.iloc[0] and df.close.iloc[0]. If it does, I want to return the entire group.

This is my attempt. It works, but I think there is a better way to do it. Note that both conditions in the if clause need to be checked.

def func(df):
    s = df.stop.iloc[0]
    o = df.open.iloc[0]
    c = df.close.iloc[0]

    if (o <= s <= c) or (c <= s <= o):
        return df

out = df.groupby('group').apply(func).reset_index(drop=True)
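One possible tidier variant of the attempt above is groupby(...).filter, which keeps or drops whole groups based on a boolean returned per group. This is only a sketch under the assumptions of the question's sample data; the helper name stop_within_first_bar is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'group': list('xxxxyyy'),
        'open': [100, 150, 200, 160, 300, 150, 170],
        'close': [105, 150, 200, 160, 350, 150, 170],
        'stop': [104, 104, 104, 104, 400, 400, 400],
    }
)

def stop_within_first_bar(g):
    # Hypothetical helper: inspect only the first row of the group.
    first = g.iloc[0]
    lo = min(first['open'], first['close'])
    hi = max(first['open'], first['close'])
    # Covers both o <= s <= c and c <= s <= o in one comparison.
    return bool(lo <= first['stop'] <= hi)

# filter() keeps every row of each group for which the function returns True.
out = df.groupby('group').filter(stop_within_first_bar).reset_index(drop=True)
print(out)
```

Taking the min and max of open and close avoids writing the two orderings separately, and filter avoids returning None for the groups that should be dropped.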

python pandas dataframe group-by

1 Answer


A simple approach is to use groupby and build an iterator:

next(iter(df.groupby('group')))[1]

Another approach, using a mask:

df[df['group'].eq(df['group'].iloc[0]).cummin()]

Output:

  group  open  close  stop
0     x   100    105   104
1     x   150    150   104
2     x   200    200   104
3     x   160    160   104
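Both snippets above pick out the first group, which happens to match the expected output for this data; neither tests the stop/open/close condition from the question. A mask that does apply that condition to each group's first row might look like the sketch below. It assumes there are no missing values, because transform('first') takes the first non-null value per column rather than strictly the first row, and the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'group': list('xxxxyyy'),
        'open': [100, 150, 200, 160, 300, 150, 170],
        'close': [105, 150, 200, 160, 350, 150, 170],
        'stop': [104, 104, 104, 104, 400, 400, 400],
    }
)

# Broadcast each group's first open/close/stop values to all of its rows.
# Assumption: no NaN in these columns (see the note above).
first = df.groupby('group')[['open', 'close', 'stop']].transform('first')

# Lower and upper bounds of the first row, whichever of open/close is larger.
lo = first[['open', 'close']].min(axis=1)
hi = first[['open', 'close']].max(axis=1)

# True for every row whose group's first stop lies within [lo, hi].
mask = first['stop'].between(lo, hi)

out = df[mask]
print(out)
```

Because the mask is computed without calling a Python function per group, this form may scale better when there are many groups.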
