Plotly Pandas ta.alma behaves very strangely

Question (votes: 0, answers: 1)

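I started with Python a little over a week ago, having never touched it before, and I was amazed at how simply things can be done thanks to the good libraries available.

I am converting a TradingView (TV) script I wrote to Python, and I want to plot my indicators and compare the results.

I plotted the alma indicator on an HA (Heikin Ashi) chart, and the result differs a lot from TV, which was not a pleasant surprise.

I searched the internet, found a few resources, and compared the code I found.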

# Pandas implementation of ta.alma provides VERY different results compared to TV !!
# Tradingview                => https://www.tradingcode.net/tradingview/arnaud-legoux-average/
# Github bug described       => https://github.com/twopirllc/pandas-ta/pull/374
# Github fix aug 15th, 2021  => https://github.com/twopirllc/pandas-ta/pull/374/commits/752b69e86e19db64cdf161981d0ad8c897efefea
# Original implementation    => https://www.sierrachart.com/SupportBoard.php?PostID=231318#P231318
# Prorealcode implementation => https://www.prorealcode.com/prorealtime-indicators/alma-arnaud-legoux-moving-average/
# What is wrong ?

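First I noticed that the Pandas ta.alma calculation runs over all the data, whereas the TV and ProRealCode code works on a subset and a single "close" value.

That raises the question: "If I add just one new bar to the data, do I have to recalculate all the data?"

If so, that seems inefficient and time consuming. How can I avoid it?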

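But more important is the question "what about the results of ta.alma?". What causes such a big difference? Am I using an old ta-lib, given that the bug has been stated (and verified) as fixed on August 15th, 2021? How can I check?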
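
One way to see which version is installed is to ask the package metadata. This is a minimal sketch that assumes the distribution is registered under the name `pandas_ta`; the helper name `installed_version` is made up for illustration. The version string alone does not say whether a particular fix is included, so it still has to be compared against the project's release history.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Assumption: the distribution name is "pandas_ta"
print(installed_version("pandas_ta"))
```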

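In the attached picture you can see how nicely the blue alma line on TV follows the candles.

But with Pandas ta.alma and "offset=0", it looks as if the alma line has been shifted. I tried "offset=-9" and "offset=-10", because I use "length=9", but the results are still far from TV. Only with "offset=-5" are the results almost identical to TV. Why should I have to use this value???

I also tried with and without df.fillna(0), but it does not make much difference (apart from the ugly lines at the start).

Something must be wrong, but what? As it stands, despite all the positive things, I cannot use it. Any help would be greatly appreciated.

My code demonstrates the problem if you change 'barsOffset = -5'.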

from ib_insync import *
from datetime import datetime

import pandas_ta as ta                      # TA-lib        https://youtu.be/lij39o0_L2I and https://youtu.be/W_kKPp9LEFY
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px

pd.set_option('display.max_rows', None)     # Display all rows
pd.set_option('display.max_columns', 25)
pd.set_option('display.width', 1000)        # equivalent with pd.options.display.width = 1000
pd.options.display.width = 0                # Panda will autodetect the size of the terminal window (and display columns accordingly)
pd.options.display.max_colwidth = 50        # Maximum width of a column (rest will be replaced by ...)

ib = IB()
ib.connect('127.0.0.1', 7497, clientId=0)

duration = "1 D"            # How far back in history should data be retrieved
barSize = "1 min"           # Timeframe of the bars

contract = Index(symbol="EOE", exchange="FTA", currency="EUR")
ib.qualifyContracts(contract)
ticker = contract.symbol

bars = ib.reqHistoricalData(contract,        # the contract of interest
                        endDateTime= '',         # retrieve bars up until current time 
                        keepUpToDate=True,       # keep updating the data requires endDateTime= "")
                        formatDate= 1,           # formatDate=2 => for intraday => UTC
                        durationStr= duration,   # time span of all bars ('1 D')
                        barSizeSetting= barSize, # time period of 1 bar ('1 min')
                        whatToShow= 'trades',    # source of the data ('trades' => actual trades)
                        useRTH= True)            # show only data from Regular Trading Hours

df = util.df(bars)                  # Create a Pandas dataframe from the bars

distAlma = .85                      # Default distribution factor
sigmaAlma = 6                       # Default sigma
barsAlma = 9                        # Use 9 bars for calculating alma
barsOffset = -5                     # Not present in TV; it turns out it must be -5 or -6 to give results comparable with TV

dfHA = df.ta.ha()                   # Fastest calculation of Heikin Ashi
df = pd.concat([df, dfHA], axis=1)  # Add columns generated by ta.ha to df
df["alma"] = dfHA.ta.alma(close="HA_close", length=barsAlma, sigma= sigmaAlma, offset_distribution= distAlma, offset=barsOffset)
df.fillna(0, inplace=True)          # Replace all NaN with 0 to avoid problems while comparing values but causes drawing lines for the first bars

strategy = 1                        # Discriminates between the type of candles to show combined with the strategy to use

if strategy == 1:
    openPrices = df.HA_open
    highPrices = df.HA_high
    lowPrices = df.HA_low
    closePrices = df.HA_close
    typeCandles = "HA-candles"
else:
    openPrices = df.open
    highPrices = df.high
    lowPrices = df.low
    closePrices = df.close
    typeCandles = "Price"

datePrices = df["date"]

fig = go.Figure()
fig.add_trace(go.Candlestick(name= typeCandles, x= datePrices,
    open= openPrices, high= highPrices, low= lowPrices, close= closePrices))
fig.add_trace(go.Scatter(x=datePrices, y=df.alma, mode='lines', name='alma', line_color="rgb(0,0,255)", fill=None))

maxRange = df.HA_high.max()                   # Maximum price 
minRange = df.HA_low[df.HA_low>0].min()       # Minimum price skipping NaN and zero values
extraRange = 0.05 * (maxRange - minRange)     # Take 5% extra space for the range above and below

layoutFigures = dict ({
                    'title': ticker + " HA Candles",
                    'xaxis_title': "Date",
                    'yaxis_title': "Price",
                    'template': "plotly_dark",                                      # Dark background
                    'dragmode': "pan",                                              # Start in "pan" mode
                    'yaxis_range': [minRange - extraRange, maxRange + extraRange],  # Set range yaxis from 5% below lowest to 5% above highest price
                    'xaxis_rangeslider_visible': False                              # Hide the zoom slider window
                     })
fig.update_layout(layoutFigures)                # Apply all layout specifications
fig.update_yaxes(fixedrange=False)              # Allow changing the yaxes

configFigures = dict({                          # https://plotly.com/python/configuration-options/#enabling-scroll-zoom
                'scrollZoom': True,             # Use the mousewheel for zooming
                'displayModeBar': True,         # Show the modeBar while hovering over it (False = invisible)
                'displaylogo': False,           # Hide plotly logo from modeBar
                'staticPlot': False,            # Make a dynamic plot (do NOT create a static plot)
                'toImageButtonOptions':
                {   'format': 'jpeg',           # one of png, svg, jpeg, webp  --> jpeg chosen
                    'filename': ticker + "_" + datetime.now().strftime("%Y%m%d-%H%M%S"),   # The name of the ticker will be used as filename
                    'height': 500, 'width': 700, 'scale': 1     # Multiply title/legend/axis/canvas sizes by this factor     
                },
                'modeBarButtonsToRemove': ['lasso2D', 'select2D']
                })
fig.show(config= configFigures)

ib.disconnect()

python pandas plotly
1 answer (votes: 0)

I found several references that code alma, and implemented them so I could compare.

# Pandas implementation of ta.alma provides VERY different results compared to TV !!
# Tradingview                => https://www.tradingcode.net/tradingview/arnaud-legoux-average/
#                               and https://www.tradingview.com/support/solutions/43000594683-arnaud-legoux-moving-average/#:~:text=To%20calculate%20the%20Arnaud%20Legoux,a%20width%20determined%20by%20sigma.
#                               and https://www.tradingview.com/script/rqxaTb6E-ALMA-Function-FN-Arnaud-Legoux-Moving-Average/
# Github bug described       => https://github.com/twopirllc/pandas-ta/pull/374
# Github fix aug 15th, 2021  => https://github.com/twopirllc/pandas-ta/pull/374/commits/752b69e86e19db64cdf161981d0ad8c897efefea
# Original implementation    => https://www.sierrachart.com/SupportBoard.php?PostID=231318#P231318
# Prorealcode implementation => https://www.prorealcode.com/prorealtime-indicators/alma-arnaud-legoux-moving-average/
# What is wrong ?

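I also made my own implementation and optimized the code for speed. It turns out that my implementations of the TV code, the ProRealCode code, and my own (unoptimized and optimized) versions give exactly the same results, to seven digits. The two Pandas codes give different results.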
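
For reference, the calculation those agreeing implementations share can be sketched compactly in plain Python. This is only an illustration under the same Gaussian-weight formula as the ProRealCode port in this answer (m = offset * (length - 1), s = length / sigma); the function names `alma_weights` and `alma_series` are made up here and are not part of any library.

```python
import math

def alma_weights(length=9, sigma=6.0, offset=0.85):
    """Normalized Gaussian weights; index 0 belongs to the oldest bar in the window."""
    m = offset * (length - 1)      # position of the Gaussian peak inside the window
    s = length / sigma             # width of the Gaussian
    raw = [math.exp(-((k - m) ** 2) / (2 * s * s)) for k in range(length)]
    total = sum(raw)
    return [w / total for w in raw]

def alma_series(prices, length=9, sigma=6.0, offset=0.85):
    """ALMA for every bar; None where fewer than `length` bars are available."""
    weights = alma_weights(length, sigma, offset)
    out = [None] * len(prices)
    for pos in range(length - 1, len(prices)):
        window = prices[pos - length + 1 : pos + 1]   # oldest price first
        out[pos] = sum(w * p for w, p in zip(weights, window))
    return out
```

With the defaults the largest weight sits on the second-newest bar of the window, which is why the line hugs recent prices.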

The code I use to call ta.alma is nothing more than

def AlmaTA(self, df, data= "HA_close"): 

    startTime = time.time()
    for test in range(0, self.counter):
        alma = df.ta.alma(close= data, length= 9, sigma= 6, offset_distribution= 0.85)

    dif = time.time() - startTime
    print("AlmaTA   ", dif)
    return alma

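This code gives results that do not match the supplied data points (see the post above). Using the "offset" parameter when calling ta.alma does not solve the problem. In fact, I think it is basically wrong: the indicator itself has no "offset" parameter. As a result, Pandas mixes up the computed results and the input when plotting. In fact, this introduces lag if offset < 0 and means that the last data points have no alma value. What is the use of a zero-lag indicator like alma if you introduce lag through that offset parameter?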
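
The effect on the last data points can be illustrated without any library. The sketch below assumes the "offset" parameter simply shifts the finished result, the way pandas `Series.shift` does; the list-based `shift` helper and the sample values are made up for illustration.

```python
def shift(values, periods):
    """Mimic pandas Series.shift for a list: positive moves values later, negative earlier."""
    n = len(values)
    out = [None] * n
    for i, v in enumerate(values):
        j = i + periods
        if 0 <= j < n:
            out[j] = v
    return out

alma_values = [None, None, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
shifted = shift(alma_values, -5)
print(shifted)   # [13.0, 14.0, 15.0, None, None, None, None, None]
```

With a shift of -5 the five most recent bars end up without a value, which is exactly where a trading signal is needed.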

My code (and my implementations of the TV and ProRealCode code) fits the supplied data very well and looks just like what I see on TV.

My ProRealCode implementation is

def AlmaReal(self, df, data= "HA_close", recalc= False):
# Prorealcode implementation => https://www.prorealcode.com/prorealtime-indicators/alma-arnaud-legoux-moving-average/
    startTime = time.time()
    for test in range(0, self.counter):
        prices = df[data]

        m = self.sensitivity * (self.bars - 1)
        s = self.bars / self.sigma

        size = len(prices)                      # Number of prices
        alma = [None] * size                    # Create the array (actually a list as Python has no arrays)
        for pos in range(self.bars-1, size):    # With 9 bars alma[8] can be calculated well
            WtdSum = 0
            CumWt  = 0
            for k in range(0, self.bars):
                Wtd = math.exp(-((k-m)*(k-m))/(2*s*s))
                WtdSum = WtdSum + Wtd * prices[pos - (self.bars - 1 - k)]
                CumWt = CumWt + Wtd
            alma[pos] = WtdSum / CumWt

    dif = time.time() - startTime
    print("AlmaReal   ", dif)
    return alma

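My implementation of the TV code looks similar. Of course I had to add a loop to calculate all rows of the data set; TV handles that very differently.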

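I have put all implementations in one class and can call them to calculate a single alma value (by passing recalc=False) or the whole data set (by passing recalc=True). I use recalc=True after reading the historical data and recalc=False after retrieving the data of a new bar.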
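
The single-bar update can also be sketched as a small stateful helper that only remembers the last `length` prices, so a new bar costs one weighted sum instead of a pass over the whole data set. This is an illustrative sketch, not the class used in this answer; the name `IncrementalAlma` and its interface are assumptions.

```python
import math
from collections import deque

class IncrementalAlma:
    """Hypothetical helper: keeps the last `length` prices and updates ALMA per bar."""

    def __init__(self, length=9, sigma=6.0, offset=0.85):
        m = offset * (length - 1)
        s = length / sigma
        raw = [math.exp(-((k - m) ** 2) / (2 * s * s)) for k in range(length)]
        total = sum(raw)
        self.weights = [w / total for w in raw]   # computed once, oldest bar first
        self.window = deque(maxlen=length)        # drops the oldest price automatically

    def update(self, price):
        """Add one new bar and return its ALMA, or None while warming up."""
        self.window.append(price)
        if len(self.window) < len(self.weights):
            return None
        return sum(w * p for w, p in zip(self.weights, self.window))
```

Feeding the historical bars through `update` once and then calling it for each new bar gives the same values as a full recalculation, because each ALMA value depends only on its own window.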

My optimized (and fastest) code is

def alma(self, df, data= "HA_close", recalc= False):
#        startTime = time.time()
#        recalc = True                               # For compatibiliy recalc can be set to false
    for test in range(0, self.counter):
        prices = df[data]
        maxIndex = self.bars - 1                # The maximum index of weight
        size = len(prices)                      # Number of prices
        calc = [None] * size                    # Create the array (actually a list as Python has no arrays)

        if size >= self.bars:                   # Normal situation            
            barStart = maxIndex if recalc else (size - 1) # Calculate just one bar or only the maxIndex bar
            for pos in range(barStart, size):   # Loop stops BEFORE size (because of the range-object used)
                sumAll = 0
                x = pos - maxIndex              # Index of the bar the price is taken into account
                for bar in range(0, self.bars): # What remains inside the loop can be easily calculated in parallel
                    sumAll += prices[x] * self.weight[bar]   # Time consuming exp-call removed from the loop
                    x += 1
                calc[pos] = sumAll              # As weights have been normalized at initialization
#                    print("alma[", pos, "]= ", alma[pos])
        # Handle the bars BEFORE self.bars => COULD be skipped if desired for compatibility reasons
        if (size == 1):                         # First bar 
            calc = prices[0]                    # Best possible result for alma[0] is the price itself
        elif recalc or (size < self.bars):      # Use smaller number of bars for alma until self.bars is possible
            calc[0] = prices[0]                 # Best possible result for alma[0] is the price itself
            for pos in range(1, self.bars-1):   # Calculate alma based on the numbers of bars present
                m = self.sensitivity * (pos - 1) 
                s = min(1, self.sigma / pos)    # Avoids s being larger than the number of bars
                x = -m * s
                sumAll = 0
                sumWeight = 0                   # Weight must be recalculated if the number of bars changed
                for bar in range(0, pos+1):
                    weight = math.exp(-0.5 * x * x)
                    sumWeight += weight
                    sumAll += prices[bar] * weight
                    x = x + s
                calc[pos] = sumAll / sumWeight
#                    print("alma[", pos, "]= ", alma[pos])
#                                 
#        dif = time.time() - startTime
#        print("Alma   ", dif)
    return calc

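This uses data that is calculated once, when the instance is initialized.

None of the code I have seen calculates alma values for the first few bars, because not enough bars are available yet. If you look at the result of each alma calculation, you will see that the computed value is very close to the value of the latest bar (of course: the goal of alma is to be a zero-lag indicator). To me it makes sense to calculate a "best possible" value and use it to plot the first points. That is why I added that feature to my code.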

class MyAlma():

# https://capital.com/arnaud-legoux-moving-average#:~:text=ALMA%20is%20a%20technical%20analysis,points%20within%20the%20specified%20period.
def __init__(self, df, bars= 9, sigma= 6, sensitivity= 0.85):                     
    self.bars = bars                        # Number of bars taken into account
    self.sigma = sigma                      # Standard deviation of the Gaussian filter (lower --> more false signals)
    self.sensitivity = sensitivity          # Smoothness of the Gaussian filter (between 0 [noisy] and 1 [lagging])
    self.weight = None                      # How much each bar counts in the calculation

    self.setParameters(bars= self.bars, sigma= self.sigma, sensitivity= self.sensitivity)

    self.counter = 1 #50                      # Just for testing speed different methods
#        df_Alma = self.AlmaTest(df)
#        df_Alma.to_excel("T:/dfAlma.xlsx")

def getParameters(self):
    return [self.bars, self.sigma, self.sensitivity]

def setParameters(self, bars= 9, sigma= 6, sensitivity= 0.85):
    error = (int(bars + 0.00001) < 1) | (sigma < 1) | (sigma > bars) | (sensitivity < 0) | (sensitivity > 1)
    if error == False:
        self.bars = int(bars + 0.00001)         # Floating point protection
        self.sigma = sigma
        self.sensitivity = sensitivity

        # Calculate the weights of each bar
        m = self.sensitivity * (self.bars - 1)
        s = self.sigma / self.bars              # Avoids dividing later
        s = s * math.sqrt(0.5)                  # Math   a * b * b = (sqrt(a) * b) * (sqrt(a) * b)
        x = -m * s                              # Addition instead of multiplication later
        factor = 0
        self.weight = [None] * self.bars        # Idea: calculate this once and store it in the instance
        if self.bars == 1:
            self.weight[0] = 1
        else:
            for bar in range(0, self.bars):     # Calculates 0, 1, 2, 3, 4, 5, 6, 7, 8 if self.bars == 9
                y = math.exp(-x*x)              # Use a local variable instead of referring weight[bar] two times
                factor += y                     # Total sum of all weights
                self.weight[bar] = y            # Weight for this bar
                x = x + s                       # No need for a multiplication here
            factor = 1 / factor                 # Avoid several divides while normalizing
            for bar in range(0, self.bars):
                self.weight[bar] *= factor      # Normalizing weights avoiding multiplication while calculating alma 
    return error