Conditional ffill() in a pandas DataFrame

Question (Votes: 1, Answers: 1)

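I asked a similar question earlier and got some help, but after more testing the solution turned out not to work. I don't know how to reopen the existing question, so I'm asking again. I have a DataFrame and I'm trying to fill a column based on a condition, but I can't get it to work correctly in every case.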

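If the IN column is 1 and the 200D column is 1, I want a 1 in the TEST column, and that 1 should carry down for as long as 200D stays 1. But once 200D drops to 0, TEST should not go back to 1 until there is a 1 in the IN column again. Here is a sample DataFrame and the code I have so far: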

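# Keep 1 where IN == 1; every other row becomes NaN
df['TEST'] = df.loc[df['IN'] == 1, 'IN']
# Forward-fill only across the rows where 200D == 1
df['TEST'] = df.loc[df['200D'] == 1, 'TEST'].ffill()
df['TEST'] = df['TEST'].fillna(0)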

        Date       IN   200D    TEST    
        12/6/2013   0     1      0  
        12/9/2013   0     1      0  
        12/10/2013  1     1      1  IN and 200D 1 >> TEST =1
        12/11/2013  0     1      1  1 to Carry Down as long as 200D =1
        12/16/2013  0     1      1  Carry Down as long as 200D =1
        12/17/2013  0     1      1  
        12/18/2013  0     0      0  TEST = 0 bc 200D =0
        12/19/2013  0     0      0  
        12/20/2013  0     0      0  
        12/23/2013  0     1      1  WRONG > TEST SHOULD BE 0 bc IN not 1
        12/24/2013  0     1      1  WRONG > TEST SHOULD BE 0 bc IN not 1
        12/25/2013  0     1      1  WRONG > TEST SHOULD BE 0 bc IN not 1
        12/26/2013  1     0      0  
        12/27/2013  1     0      0  
        12/28/2013  0     1      1  
        12/29/2013  1     1      1  IN and 200D 1 >> TEST =1
        12/30/2013  0     1      1  
        12/31/2013  0     1      1  
        1/1/2014    0     0      0  
        1/2/2014    1     0      0  TEST=1 but 200D =0 >> TEST =0
        1/3/2014    0     0      0  
        1/6/2014    0     0      0  
        1/7/2014    1     1      1  IN and 200D 1 >> TEST =1
        1/8/2014    0     1      1  
python pandas
1 Answer
Votes: 0

I approached your question with numpy; take a look:

import pandas as pd
import numpy as np

df = pd.DataFrame({'Date': ['12-06-2013', '12-09-2013', '12-10-2013', '12-11-2013', '12/16/2013',
                            '12/17/2013', '12/18/2013', '12/19/2013', '12/20/2013', '12/23/2013',
                            '12/24/2013', '12/25/2013', '12/26/2013', '12/27/2013', '12/28/2013',
                            '12/29/2013', '12/30/2013', '12/31/2013', '01-01-2014', '01-02-2014',
                            '01-03-2014', '01-06-2014', '01-07-2014', '01-08-2014'],
                   'IN': [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
                   '200D': [1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1]})

df = df[['Date', 'IN', '200D']]

df['TEST'] = np.where((df['IN'] == 1) | (df['200D'] == 1), 1, 0)

print(df)

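This is the output. Note that the `|` condition sets TEST to 1 whenever either column is 1, so it does not reproduce the carry-down behavior described in the question (for example, rows 9 to 11 come out as 1 where the question expects 0):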

          Date  IN  200D  TEST
0   12-06-2013   0     1     1
1   12-09-2013   0     1     1
2   12-10-2013   1     1     1
3   12-11-2013   0     1     1
4   12/16/2013   0     1     1
5   12/17/2013   0     1     1
6   12/18/2013   0     0     0
7   12/19/2013   0     0     0
8   12/20/2013   0     0     0
9   12/23/2013   0     1     1
10  12/24/2013   0     1     1
11  12/25/2013   0     1     1
12  12/26/2013   1     0     1
13  12/27/2013   1     0     1
14  12/28/2013   0     1     1
15  12/29/2013   1     1     1
16  12/30/2013   0     1     1
17  12/31/2013   0     1     1
18  01-01-2014   0     0     0
19  01-02-2014   1     0     1
20  01-03-2014   0     0     0
21  01-06-2014   0     0     0
22  01-07-2014   1     1     1
23  01-08-2014   0     1     1
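
For comparison, the rule stated in the question can be sketched as a forward fill over an explicit state column. This is only one possible reading of the requirement, not the original poster's or the answerer's method. It assumes that TEST switches on only when IN and 200D are both 1, is forced to 0 whenever 200D is 0, and otherwise keeps its previous value. The names `trigger`, `reset` and `state` are made up for illustration. Under this reading 12/28/2013 (row 14) comes out as 0, because no row with IN == 1 and 200D == 1 has occurred since 200D last dropped to 0.

```python
import numpy as np
import pandas as pd

# Same IN / 200D values as in the answer above (dates omitted for brevity)
df = pd.DataFrame({
    'IN':   [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
    '200D': [1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1],
})

# Assumed rule: switch on only when both columns are 1 ...
trigger = (df['IN'] == 1) & (df['200D'] == 1)
# ... and force off whenever 200D is 0
reset = df['200D'] == 0

# Start with "unknown" everywhere, then pin the rows where the state is certain
state = pd.Series(np.nan, index=df.index)
state[trigger] = 1.0
state[reset] = 0.0

# Carry the last known state down; rows before the first known state become 0
df['TEST'] = state.ffill().fillna(0).astype(int)

print(df)
```

Writing the 0s into `state` before the forward fill is what stops a 1 from being carried across a stretch where 200D was 0, which is the case the filtered `ffill()` in the question misses.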