Correlating every stock in the S&P 500 against the rest of the index: how do I avoid recomputing correlations?

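This is part of a larger piece of code; the rest only handles downloading, manipulating, and sorting the data. How can I make it compare each pair of stocks only once, for example only Microsoft with Apple, rather than both Microsoft with Apple and Apple with Microsoft?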

index = 0
ticker_col = np.array([])
compare_col = np.array([])
correlation_col = np.array([])

for ticker in tickers:
    ticker = ticker[:-1]
    for compare in tickers:
        compare = compare[:-1]

        if ticker == compare:
            break
        index += 1

        compare_list = pd.read_csv('C:/stockpredictions/stocks_dfs/' + compare + '.csv')['Adj Close']
        ticker_list = pd.read_csv('C:/stockpredictions/stocks_dfs/' + ticker + '.csv')['Adj Close']

        #pearsons r correlation formula
        correlation = df['Data1'].corr(df['Data2'])

Tags: python, math, correlation, stock

1 answer

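I added `enumerate` to store the current index in the variable `i` (you can use it in place of the `index` variable used in the second loop). The second loop now starts at index `i+1`, so each pair of values is compared only once.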

import numpy as np
import pandas as pd

index = 0
ticker_col = np.array([])
compare_col = np.array([])
correlation_col = np.array([])

for i, ticker in enumerate(tickers):
    ticker = ticker[:-1]  # drop the trailing character (presumably a newline)
    for compare in tickers[i+1:]:  # only the tickers after the current one
        compare = compare[:-1]

        # Skip a duplicate entry, if the list contains one.
        if ticker == compare:
            continue
        index += 1

        compare_list = pd.read_csv('C:/stockpredictions/stocks_dfs/' + compare + '.csv')['Adj Close']
        ticker_list = pd.read_csv('C:/stockpredictions/stocks_dfs/' + ticker + '.csv')['Adj Close']

        # Pearson's r correlation between the two price series
        correlation = ticker_list.corr(compare_list)
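
As an alternative to the nested loops, here is a minimal sketch that assumes all the 'Adj Close' series have already been loaded into a single DataFrame with one column per ticker. The ticker names and the random price data below are hypothetical stand-ins, not the asker's files. It calls pandas' `DataFrame.corr` once and then keeps only the entries above the diagonal, so each pair shows up exactly once in the result. pandas still fills the whole symmetric matrix internally; what this avoids is the Python-level loop and the repeated CSV reads.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the per-ticker 'Adj Close' columns (hypothetical data).
rng = np.random.default_rng(0)
prices = pd.DataFrame(
    rng.normal(size=(250, 4)).cumsum(axis=0) + 100,
    columns=["AAPL", "MSFT", "GOOG", "AMZN"],
)

# Full Pearson correlation matrix; symmetric, with 1.0 on the diagonal.
corr = prices.corr()

# Row/column positions strictly above the diagonal: each pair exactly once.
rows, cols = np.triu_indices(len(corr.columns), k=1)

pairs = pd.DataFrame({
    "ticker": corr.columns[rows],
    "compare": corr.columns[cols],
    "correlation": corr.to_numpy()[rows, cols],
})
print(pairs)
```

With four tickers this gives six rows, one per unordered pair, and no ticker is paired with itself.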

You can check which pairs the nested loops produce with the following code:

tickers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

for i, ticker in enumerate(tickers):
    for compare in tickers[i+1:]:
        print(ticker, compare)
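
The same set of pairs can also be generated with `itertools.combinations` from the standard library. The short sketch below, which uses a made-up list of five placeholder tickers, compares the two approaches and shows that they yield identical pairs in the same order.

```python
from itertools import combinations

tickers = [1, 2, 3, 4, 5]

# Pairs produced by the enumerate/slice loops from the answer.
loop_pairs = []
for i, ticker in enumerate(tickers):
    for compare in tickers[i + 1:]:
        loop_pairs.append((ticker, compare))

# The same unordered pairs, generated by the standard library.
combo_pairs = list(combinations(tickers, 2))

print(loop_pairs == combo_pairs)
print(len(combo_pairs))  # n * (n - 1) / 2 pairs for n tickers
```

Either form can drive the correlation loop; `combinations` simply removes the need to track the index by hand.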
