Using vectorized functions instead of loops

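My function looks like this: within each slice of the dataframe it takes a minimum over a range whose length keeps growing, and a maximum over a range whose length shrinks on every iteration.

The dataframe this calculation runs on is itself a subset of another, larger dataframe, which leads to nested loops and significantly increases the time complexity.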

import numpy as np
import pandas as pd


def drawdown(result_df, dict_dfs):
    last_year_df = pd.DataFrame(data=np.nan, index=result_df.index, columns=result_df.columns)

    for idx in range(len(result_df)):
        for stock in result_df.columns:
            # Window: up to 250 rows back from the current row
            past_date_idx = max(0, idx - 250)
            past_date = result_df.index[past_date_idx]
            current_date = result_df.index[idx]

            last_year = dict_dfs['close'].loc[past_date:current_date, stock]

            drawdowns = []
            for i in range(len(last_year)):
                # Minimum over the growing prefix, maximum over the shrinking suffix
                rolling_min = last_year.iloc[:i + 1].min()
                rolling_max = last_year.iloc[i:].max()
                if rolling_min != 0:
                    drawdown = (rolling_max - rolling_min) / rolling_min
                    drawdowns.append(drawdown)

            # Write one cell with .loc; chained indexing (.iloc[idx][stock]) may assign to a copy
            last_year_df.loc[current_date, stock] = np.median(drawdowns)

    return last_year_df

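Given this code, is there any function that could help me speed it up? If so, what changes should I make so that the logic stays the same but uses vectorized functions instead of loops?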

@Micheal

After the comment, I changed the function to:
def drawdown(result_df, dict_dfs):
    last_year_df = pd.DataFrame(data=np.nan, index=result_df.index, columns=result_df.columns)

    for idx in range(len(result_df)):
        for stock in result_df.columns:
            past_date_idx = max(0, idx - 250)
            past_date = result_df.index[past_date_idx]
            current_date = result_df.index[idx]

            last_year = dict_dfs['close'].loc[past_date:current_date, stock]
            max_dataframe = last_year.iloc[::-1]
            min_dataframe = last_year

            rolling_max = max_dataframe.cummax()
            rolling_min = min_dataframe.cummin()

            drawdown = (rolling_max - rolling_min) / rolling_min

            last_year_df.loc[current_date] = drawdown.iloc[-1]

    return last_year_df

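At first glance the logic looks correct, but the results say otherwise: the two versions give different output. Can anyone suggest what is going wrong here?

Here is a sample of result_df: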
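For comparison, here is a minimal sketch of what the first version computes for a single window, written with cumulative operations. Two differences from the second attempt seem likely to explain the mismatch: the loop version stores the median of all drawdowns in the window, while the attempt keeps only `drawdown.iloc[-1]`; and the loop version fills one cell, while `last_year_df.loc[current_date] = ...` overwrites the whole row. The helper name `window_median_drawdown` is hypothetical, and NumPy's `fmin`/`fmax` are used here as an assumption to mimic how `Series.min()` and `Series.max()` skip NaN.

```python
import numpy as np
import pandas as pd


def window_median_drawdown(last_year: pd.Series) -> float:
    """Median drawdown of one window, mirroring the inner loop over i."""
    values = last_year.to_numpy(dtype=float)
    if values.size == 0:
        return np.nan

    # Minimum of last_year.iloc[:i + 1] for every i (NaN is skipped)
    rolling_min = np.fmin.accumulate(values)
    # Maximum of last_year.iloc[i:] for every i: accumulate from the end, then flip back
    rolling_max = np.fmax.accumulate(values[::-1])[::-1]

    # Same filter as the loop: drop positions where the running minimum is zero
    keep = rolling_min != 0
    if not keep.any():
        return np.nan
    drawdowns = (rolling_max[keep] - rolling_min[keep]) / rolling_min[keep]
    return float(np.median(drawdowns))


prices = pd.Series([2.0, 1.0, 4.0, 3.0])
print(window_median_drawdown(prices))  # 2.5
```

Inside the two outer loops, the cell could then be filled with `last_year_df.loc[current_date, stock] = window_median_drawdown(last_year)`.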

    ABB IN Equity   ABNL IN Equity
01-02-2005  FALSE   FALSE
02-02-2005  FALSE   FALSE
03-02-2005  FALSE   FALSE
04-02-2005  FALSE   FALSE
07-02-2005  FALSE   FALSE
08-02-2005  FALSE   FALSE
09-02-2005  FALSE   FALSE
10-02-2005  FALSE   FALSE
11-02-2005  FALSE   FALSE
14-02-2005  FALSE   FALSE
15-02-2005  FALSE   FALSE
16-02-2005  FALSE   FALSE
17-02-2005  FALSE   FALSE
18-02-2005  FALSE   FALSE
21-02-2005  FALSE   FALSE
22-02-2005  FALSE   FALSE
23-02-2005  FALSE   FALSE
24-02-2005  FALSE   FALSE
25-02-2005  FALSE   FALSE
28-02-2005  FALSE   FALSE
01-03-2005  FALSE   FALSE
02-03-2005  FALSE   FALSE
03-03-2005  FALSE   FALSE
04-03-2005  FALSE   FALSE
07-03-2005  FALSE   FALSE
08-03-2005  FALSE   FALSE
09-03-2005  FALSE   FALSE
10-03-2005  FALSE   FALSE
11-03-2005  FALSE   FALSE
14-03-2005  FALSE   FALSE
15-03-2005  FALSE   FALSE
16-03-2005  FALSE   FALSE
17-03-2005  FALSE   FALSE
18-03-2005  FALSE   FALSE
21-03-2005  FALSE   FALSE
22-03-2005  FALSE   FALSE
23-03-2005  FALSE   FALSE
24-03-2005  FALSE   FALSE
28-03-2005  FALSE   FALSE
29-03-2005  FALSE   FALSE
30-03-2005  FALSE   FALSE
31-03-2005  FALSE   FALSE
01-04-2005  FALSE   FALSE
04-04-2005  FALSE   FALSE
05-04-2005  FALSE   FALSE
06-04-2005  FALSE   FALSE
07-04-2005  FALSE   FALSE
08-04-2005  FALSE   FALSE
11-04-2005  FALSE   FALSE
12-04-2005  FALSE   FALSE
13-04-2005  FALSE   FALSE
15-04-2005  FALSE   FALSE
18-04-2005  FALSE   FALSE
19-04-2005  FALSE   FALSE
20-04-2005  FALSE   FALSE
21-04-2005  FALSE   FALSE
22-04-2005  FALSE   FALSE
25-04-2005  FALSE   FALSE
26-04-2005  FALSE   FALSE
27-04-2005  FALSE   FALSE
28-04-2005  FALSE   FALSE
29-04-2005  FALSE   FALSE
02-05-2005  FALSE   FALSE
03-05-2005  FALSE   FALSE
04-05-2005  FALSE   FALSE
05-05-2005  FALSE   FALSE
06-05-2005  FALSE   FALSE
09-05-2005  FALSE   FALSE
10-05-2005  FALSE   FALSE
11-05-2005  FALSE   FALSE
12-05-2005  FALSE   FALSE
13-05-2005  FALSE   FALSE
16-05-2005  FALSE   FALSE
17-05-2005  FALSE   FALSE
18-05-2005  FALSE   FALSE
19-05-2005  FALSE   FALSE
20-05-2005  FALSE   FALSE
23-05-2005  FALSE   FALSE
24-05-2005  FALSE   FALSE
25-05-2005  FALSE   FALSE
26-05-2005  FALSE   FALSE
27-05-2005  FALSE   FALSE
30-05-2005  FALSE   FALSE
31-05-2005  FALSE   FALSE
01-06-2005  FALSE   FALSE
02-06-2005  FALSE   FALSE
03-06-2005  FALSE   FALSE
04-06-2005  FALSE   FALSE
06-06-2005  FALSE   FALSE
07-06-2005  FALSE   FALSE
08-06-2005  FALSE   FALSE
09-06-2005  FALSE   FALSE
10-06-2005  FALSE   FALSE
13-06-2005  FALSE   FALSE
14-06-2005  FALSE   FALSE
15-06-2005  FALSE   FALSE
16-06-2005  FALSE   FALSE
17-06-2005  FALSE   FALSE
20-06-2005  FALSE   FALSE
21-06-2005  FALSE   FALSE
22-06-2005  FALSE   FALSE
23-06-2005  FALSE   FALSE
24-06-2005  FALSE   FALSE
27-06-2005  FALSE   FALSE
28-06-2005  FALSE   FALSE
29-06-2005  FALSE   FALSE
30-06-2005  FALSE   FALSE
01-07-2005  FALSE   FALSE
04-07-2005  FALSE   FALSE
05-07-2005  FALSE   FALSE
06-07-2005  FALSE   FALSE
07-07-2005  FALSE   FALSE
08-07-2005  FALSE   FALSE
11-07-2005  TRUE    FALSE
12-07-2005  FALSE   FALSE
13-07-2005  FALSE   FALSE
14-07-2005  FALSE   FALSE
15-07-2005  FALSE   FALSE
18-07-2005  FALSE   FALSE
19-07-2005  FALSE   FALSE
20-07-2005  FALSE   FALSE
21-07-2005  FALSE   FALSE
22-07-2005  FALSE   FALSE
25-07-2005  FALSE   FALSE
26-07-2005  FALSE   FALSE
27-07-2005  FALSE   FALSE
29-07-2005  FALSE   FALSE
01-08-2005  FALSE   FALSE
02-08-2005  FALSE   FALSE
03-08-2005  FALSE   FALSE
04-08-2005  FALSE   FALSE
05-08-2005  FALSE   FALSE
08-08-2005  FALSE   FALSE
09-08-2005  FALSE   FALSE
10-08-2005  FALSE   FALSE
11-08-2005  FALSE   FALSE
12-08-2005  FALSE   FALSE
16-08-2005  FALSE   FALSE
17-08-2005  FALSE   FALSE
18-08-2005  FALSE   FALSE
19-08-2005  FALSE   FALSE
22-08-2005  FALSE   FALSE
23-08-2005  FALSE   FALSE
24-08-2005  FALSE   FALSE
25-08-2005  FALSE   FALSE
26-08-2005  FALSE   FALSE
29-08-2005  FALSE   FALSE
30-08-2005  FALSE   FALSE
31-08-2005  FALSE   FALSE
01-09-2005  FALSE   FALSE
02-09-2005  FALSE   FALSE
05-09-2005  FALSE   FALSE
06-09-2005  FALSE   FALSE
08-09-2005  FALSE   FALSE
09-09-2005  FALSE   FALSE
12-09-2005  FALSE   FALSE
13-09-2005  FALSE   FALSE
14-09-2005  FALSE   FALSE
15-09-2005  FALSE   FALSE
16-09-2005  FALSE   FALSE
19-09-2005  FALSE   FALSE
20-09-2005  FALSE   FALSE
21-09-2005  FALSE   FALSE
22-09-2005  FALSE   FALSE
23-09-2005  FALSE   FALSE
26-09-2005  FALSE   FALSE
27-09-2005  FALSE   FALSE
28-09-2005  FALSE   FALSE
29-09-2005  FALSE   FALSE
30-09-2005  FALSE   FALSE
03-10-2005  FALSE   FALSE
04-10-2005  FALSE   FALSE
05-10-2005  FALSE   FALSE
06-10-2005  FALSE   FALSE
07-10-2005  FALSE   FALSE
10-10-2005  FALSE   FALSE
11-10-2005  FALSE   FALSE
13-10-2005  FALSE   FALSE
14-10-2005  FALSE   FALSE
17-10-2005  FALSE   FALSE
18-10-2005  FALSE   FALSE
19-10-2005  FALSE   FALSE
20-10-2005  FALSE   FALSE
21-10-2005  FALSE   FALSE
24-10-2005  FALSE   FALSE
25-10-2005  FALSE   FALSE
26-10-2005  FALSE   FALSE
27-10-2005  FALSE   FALSE
28-10-2005  FALSE   FALSE
31-10-2005  FALSE   FALSE
01-11-2005  FALSE   FALSE
02-11-2005  FALSE   FALSE
07-11-2005  FALSE   FALSE
08-11-2005  FALSE   FALSE
09-11-2005  FALSE   FALSE
10-11-2005  FALSE   FALSE
11-11-2005  FALSE   FALSE
14-11-2005  FALSE   FALSE
16-11-2005  FALSE   FALSE
17-11-2005  FALSE   FALSE
18-11-2005  FALSE   FALSE
21-11-2005  FALSE   FALSE
22-11-2005  FALSE   FALSE
23-11-2005  FALSE   FALSE
24-11-2005  FALSE   FALSE
25-11-2005  FALSE   FALSE
26-11-2005  FALSE   FALSE
28-11-2005  FALSE   FALSE
29-11-2005  FALSE   FALSE
30-11-2005  FALSE   FALSE
01-12-2005  FALSE   FALSE
02-12-2005  FALSE   FALSE
05-12-2005  FALSE   FALSE
06-12-2005  FALSE   FALSE
07-12-2005  FALSE   FALSE
08-12-2005  FALSE   FALSE
09-12-2005  FALSE   FALSE
12-12-2005  FALSE   FALSE
13-12-2005  FALSE   FALSE
14-12-2005  FALSE   FALSE
15-12-2005  FALSE   FALSE
16-12-2005  FALSE   FALSE
19-12-2005  FALSE   FALSE
20-12-2005  FALSE   FALSE
21-12-2005  FALSE   FALSE
22-12-2005  FALSE   FALSE
23-12-2005  FALSE   FALSE
26-12-2005  FALSE   FALSE
27-12-2005  FALSE   FALSE
28-12-2005  FALSE   FALSE
29-12-2005  FALSE   FALSE
30-12-2005  FALSE   FALSE
02-01-2006  FALSE   FALSE
03-01-2006  FALSE   FALSE
04-01-2006  FALSE   FALSE
05-01-2006  FALSE   FALSE
06-01-2006  FALSE   FALSE
09-01-2006  FALSE   FALSE
10-01-2006  FALSE   FALSE
12-01-2006  FALSE   FALSE
13-01-2006  FALSE   FALSE
16-01-2006  FALSE   FALSE
17-01-2006  FALSE   FALSE
18-01-2006  FALSE   FALSE
19-01-2006  FALSE   FALSE
20-01-2006  TRUE    FALSE
23-01-2006  FALSE   FALSE
24-01-2006  FALSE   FALSE
25-01-2006  TRUE    FALSE
27-01-2006  TRUE    FALSE
30-01-2006  FALSE   FALSE
31-01-2006  FALSE   FALSE
01-02-2006  FALSE   FALSE
02-02-2006  FALSE   FALSE
03-02-2006  FALSE   FALSE
06-02-2006  FALSE   FALSE
07-02-2006  FALSE   FALSE
08-02-2006  FALSE   FALSE
10-02-2006  FALSE   FALSE
13-02-2006  FALSE   FALSE
14-02-2006  FALSE   FALSE
15-02-2006  FALSE   FALSE
16-02-2006  FALSE   FALSE
17-02-2006  FALSE   FALSE
20-02-2006  FALSE   FALSE
21-02-2006  FALSE   FALSE
22-02-2006  FALSE   FALSE
23-02-2006  FALSE   FALSE
24-02-2006  FALSE   FALSE
27-02-2006  FALSE   FALSE
28-02-2006  FALSE   FALSE
01-03-2006  FALSE   FALSE
02-03-2006  FALSE   FALSE
03-03-2006  FALSE   FALSE

Here is a sample of dict_dfs['close']:

    ABB IN Equity   ABNL IN Equity
01-02-2005  177.02  
02-02-2005  180.08  
03-02-2005  184.78  
04-02-2005  191.27  
07-02-2005  195.48  
08-02-2005  207.81  
09-02-2005  223.48  
10-02-2005  222.94  
11-02-2005  222.65  
14-02-2005  228.7   
15-02-2005  225.64  
16-02-2005  225.1   
17-02-2005  222.4   
18-02-2005  223.48  
21-02-2005  223.66  
22-02-2005  219 
23-02-2005  220.96  
24-02-2005  221.5   
25-02-2005  219.7   
28-02-2005  221.5   
01-03-2005  227.8   
02-03-2005  228.26  
03-03-2005  229.79  
04-03-2005  231.58  
07-03-2005  227.1   
08-03-2005  229.44  
09-03-2005  227.82  
10-03-2005  221.5   
11-03-2005  217.92  
14-03-2005  218.8   
15-03-2005  216.46  
16-03-2005  217.28  
17-03-2005  216.46  
18-03-2005  215.2   
21-03-2005  214.34  
22-03-2005  213.4   
23-03-2005  208.93  
24-03-2005  192.69  
28-03-2005  201.7   
29-03-2005  195.39  
30-03-2005  195.59  
31-03-2005  202.35  
01-04-2005  207.09  
04-04-2005  212.14  
05-04-2005  207.81  
06-04-2005  208.89  
07-04-2005  209.79  
08-04-2005  216.1   
11-04-2005  211.78  
12-04-2005  221.5   
13-04-2005  222.58  
15-04-2005  224.2   
18-04-2005  226.9   
19-04-2005  212.72  
20-04-2005  209.81  
21-04-2005  216.1   
22-04-2005  216.64  
25-04-2005  218.26  
26-04-2005  218.82  
27-04-2005  220.35  
28-04-2005  218.98  
29-04-2005  216.11  
02-05-2005  210.7   
03-05-2005  215.56  
04-05-2005  210.71  
05-05-2005  221.52  
06-05-2005  220.7   
09-05-2005  222.4   
10-05-2005  221.5   
11-05-2005  219.7   
12-05-2005  223.3   
13-05-2005  209.81  
16-05-2005  225.1   
17-05-2005  229.26  
18-05-2005  226.01  
19-05-2005  229.62  
20-05-2005  230.86  
23-05-2005  228.96  
24-05-2005  231.4   
25-05-2005  231.94  
26-05-2005  229.78  
27-05-2005  228.7   
30-05-2005  230.86  
31-05-2005  230.86  
01-06-2005  231.05  
02-06-2005  230.5   
03-06-2005  228.97  
04-06-2005  230.5   
06-06-2005  231.4   
07-06-2005  231.08  
08-06-2005  234.11  
09-06-2005  233.75  
10-06-2005  233.25  
13-06-2005  233.21  
14-06-2005  234.49  
15-06-2005  235.02  
16-06-2005  232.12  
17-06-2005  231.13  
20-06-2005  226.9   
21-06-2005  228.89  
22-06-2005  232.33  
23-06-2005  229.6   
24-06-2005  231.42  
27-06-2005  231.42  
28-06-2005  232.68  
29-06-2005  233.57  
30-06-2005  233.21  
01-07-2005  235.28  
04-07-2005  236.83  
05-07-2005  237.2   
06-07-2005  255.72  
07-07-2005  249.51  
08-07-2005  264.72  
11-07-2005  263.1   
12-07-2005  265.8   
13-07-2005  263.28  
14-07-2005  261.35  
15-07-2005  261.3   
18-07-2005  262.92  
19-07-2005  270.12  
20-07-2005  268.59  
21-07-2005  266.52  
22-07-2005  263.1   
25-07-2005  259.52  
26-07-2005  267.96  
27-07-2005  266.97  
29-07-2005  256.08  
01-08-2005  262.92  
02-08-2005  263.14  
03-08-2005  266.88  
04-08-2005  273.22  
05-08-2005  279.13  
08-08-2005  274.98  
09-08-2005  268.32  
10-08-2005  276.44  
11-08-2005  275.54  
12-08-2005  272.16  
16-08-2005  273.9   
17-08-2005  276.08  
18-08-2005  279.32  
19-08-2005  276.42  
22-08-2005  274.62  
23-08-2005  265.8   
24-08-2005  263.82  
25-08-2005  265.62  
26-08-2005  270.12  
29-08-2005  273.72  
30-08-2005  281.11  
31-08-2005  284.53  
01-09-2005  287.23  
02-09-2005  289.03  
05-09-2005  298.95  
06-09-2005  307.94  
08-09-2005  315.16  
09-09-2005  312.98  
12-09-2005  322.75  
13-09-2005  326.86  
14-09-2005  319.64  
15-09-2005  316.94  
16-09-2005  317.49  
19-09-2005  313.34  
20-09-2005  308.32  
21-09-2005  306.44  
22-09-2005  307.13  
23-09-2005  304.72  
26-09-2005  306.32  
27-09-2005  307.95  
28-09-2005  307.98  
29-09-2005  305.24  
30-09-2005  306.32  
03-10-2005  307.04  
04-10-2005  308.84  
05-10-2005  313.88  
06-10-2005  313.34  
07-10-2005  306.14  
10-10-2005  309.74  
11-10-2005  290.83  
13-10-2005  300.05  
14-10-2005  292.63  
17-10-2005  290.11  
18-10-2005  289.75  
19-10-2005  289.64  
20-10-2005  283.09  
21-10-2005  289.57  
24-10-2005  297.04  
25-10-2005  296.23  
26-10-2005  302.76  
27-10-2005  291.73  
28-10-2005  289.95  
31-10-2005  295.35  
01-11-2005  297.37  
02-11-2005  295.35  
07-11-2005  295.33  
08-11-2005  307.94  
09-11-2005  316.62  
10-11-2005  301.64  
11-11-2005  331.35  
14-11-2005  328.65  
16-11-2005  329.59  
17-11-2005  328.7   
18-11-2005  335.31  
21-11-2005  331.39  
22-11-2005  327.78  
23-11-2005  331.35  
24-11-2005  331.89  
25-11-2005  334.05  
26-11-2005  345.04  
28-11-2005  333.14  
29-11-2005  352.96  
30-11-2005  346.94  
01-12-2005  345.04  
02-12-2005  351.16  
05-12-2005  346.66  
06-12-2005  338.55  
07-12-2005  336.39  
08-12-2005  337.29  
09-12-2005  338.55  
12-12-2005  345.77  
13-12-2005  343.06  
14-12-2005  341.43  
15-12-2005  342.15  
16-12-2005  338.55  
19-12-2005  343.24  
20-12-2005  352.89  
21-12-2005  351.34  
22-12-2005  353.86  
23-12-2005  352.06  
26-12-2005  345.23  
27-12-2005  346.84  
28-12-2005  344.86  
29-12-2005  344.14  
30-12-2005  345.76  
02-01-2006  344.14  
03-01-2006  343.96  
04-01-2006  352.08  
05-01-2006  345.77  
06-01-2006  357.46  
09-01-2006  362.86  
10-01-2006  360.18  
12-01-2006  356.56  
13-01-2006  357.46  
16-01-2006  355.66  
17-01-2006  355.86  
18-01-2006  359.26  
19-01-2006  371.87  
20-01-2006  374.75  
23-01-2006  397.98  
24-01-2006  415.99  
25-01-2006  447.5   
27-01-2006  454 
30-01-2006  441.58  
31-01-2006  447.86  
01-02-2006  438.07  
02-02-2006  439.94  
03-02-2006  432.2   
06-02-2006  427.15  
07-02-2006  426.79  
08-02-2006  436.97  
10-02-2006  430.39  
13-02-2006  437.06  
14-02-2006  430.03  
15-02-2006  443 
16-02-2006  441.42  
17-02-2006  432.41  
20-02-2006  424.09  
21-02-2006  430.39  
22-02-2006  432.02  
23-02-2006  428.95  
24-02-2006  432.2   
27-02-2006  443 
28-02-2006  443.9   
01-03-2006  452.94  
02-03-2006  458.85  
03-03-2006  474.87  

I hope this information is sufficient and helps in getting reproducible results.

python pandas numpy function nested-loops
1 Answer

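Here is a newly optimized function. It uses list comprehensions and vectorizes the first index calculation. I tested it on your 4 rows of data. I hope it handles all of your data well; in any case, it can serve as a starting point for further optimization.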

import numpy as np
import pandas as pd


def drawdown_vec(result_df, dict_dfs):
    close = dict_dfs['close']

    # Positional window bounds: up to 250 rows back from each row
    positions = np.arange(len(result_df))
    past_dates = result_df.index[np.maximum(0, positions - 250)]
    current_dates = result_df.index

    last_years = [
        close.loc[past_date:current_date, result_df.columns]
        for past_date, current_date in zip(past_dates, current_dates)
    ]

    # Running maximum from the end of each window (reverse, cummax, reverse back)
    rolling_max = [
        last_year.iloc[::-1].cummax().iloc[::-1] for last_year in last_years
    ]

    # Running minimum from the start of each window
    rolling_min = [
        last_year.cummin() for last_year in last_years
    ]

    drawdowns = [
        (r_max - r_min) / r_min for r_max, r_min in zip(rolling_max, rolling_min)
    ]

    # One median per stock and window; unlike the loop version, NaN is skipped
    # here and a zero running minimum is not filtered out
    medians = [drawdown.median().to_numpy() for drawdown in drawdowns]
    last_year_df = pd.DataFrame(medians, index=current_dates, columns=result_df.columns)

    return last_year_df

Perhaps you could put the last 4 list comprehensions into a single for loop. You can compare their performance in IPython with:

%timeit drawdown_vec(result_df, dict_dfs)
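If the list comprehensions are still too slow, the loop over windows can also be removed for each stock. The following is only a sketch under stated assumptions, not a drop-in replacement: the name `rolling_median_drawdown` and the `lookback` parameter are hypothetical, the series is padded with NaN at the front so that every row has a full window, and `np.nanmedian` ignores the padded positions. One consequence is that leading NaN values in the real data are ignored as well, whereas the loop version would return NaN for such a window.

```python
import warnings

import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def rolling_median_drawdown(prices: pd.Series, lookback: int = 250) -> pd.Series:
    """Median drawdown over each trailing window of up to lookback + 1 rows."""
    values = prices.to_numpy(dtype=float)

    # Pad the front with NaN so that every row gets a window of lookback + 1 values
    padded = np.concatenate([np.full(lookback, np.nan), values])
    windows = sliding_window_view(padded, lookback + 1)  # shape (n, lookback + 1)

    # Running minimum from the window start, running maximum from the window end
    prefix_min = np.fmin.accumulate(windows, axis=1)
    suffix_max = np.fmax.accumulate(windows[:, ::-1], axis=1)[:, ::-1]

    with np.errstate(divide="ignore", invalid="ignore"):
        drawdowns = (suffix_max - prefix_min) / prefix_min
    # Same filter as the loop version: ignore positions where the minimum is zero
    drawdowns[prefix_min == 0] = np.nan

    with warnings.catch_warnings():
        # Windows without any usable value produce NaN plus an "All-NaN slice" warning
        warnings.simplefilter("ignore", RuntimeWarning)
        medians = np.nanmedian(drawdowns, axis=1)
    return pd.Series(medians, index=prices.index)


prices = pd.Series([2.0, 1.0, 4.0, 3.0])
print(rolling_median_drawdown(prices, lookback=2).tolist())  # [0.0, 0.0, 3.0, 3.0]
```

Assuming result_df and dict_dfs['close'] share the same index and columns, `dict_dfs['close'].apply(rolling_median_drawdown)` would compute every stock column this way. The intermediate arrays hold roughly `len(prices) * (lookback + 1)` values each, so memory use grows with the window length.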