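When I started using Python a little over a week ago, I had never touched it before, and I was amazed at how easily things can be done thanks to the good libraries available.
I am converting TradingView scripts I wrote to Python, and I wanted to plot my indicators and compare the results.
Plotting the alma indicator on a HA (Heikin Ashi) chart gave results that differ considerably from TradingView (TV), which was not a pleasant surprise.
I searched the internet, found some resources and compared the code I found.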
# Pandas implementation of ta.alma provides VERY different results compared to TV !!
# Tradingview => https://www.tradingcode.net/tradingview/arnaud-legoux-average/
# Github bug described => https://github.com/twopirllc/pandas-ta/pull/374
# Github fix aug 15th, 2021 => https://github.com/twopirllc/pandas-ta/pull/374/commits/752b69e86e19db64cdf161981d0ad8c897efefea
# Original implementation => https://www.sierrachart.com/SupportBoard.php?PostID=231318#P231318
# Prorealcode implementation => https://www.prorealcode.com/prorealtime-indicators/alma-arnaud-legoux-moving-average/
# What is wrong ?
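First, I noticed that the pandas_ta ta.alma calculation runs over all the data, whereas the TV and ProRealCode code work on a subset and a single "close" value.
That raises a question: "If I add just one new bar to the data, do I have to recalculate all of it?"
If so, that seems inefficient and time-consuming. How can I avoid it?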
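One way to avoid recomputing everything can be sketched as follows. It assumes that ALMA is a plain weighted average over the last `length` values, as in the TradingView and ProRealCode references, so the weights can be computed once and only the newest window needs to be evaluated when a bar arrives. The names `alma_weights` and `alma_last` are made up for this illustration and are not part of pandas_ta.

```python
import math

def alma_weights(length=9, sigma=6.0, offset=0.85):
    """Normalized Gaussian weights, oldest bar first (hypothetical helper)."""
    m = offset * (length - 1)      # position of the Gaussian peak inside the window
    s = length / sigma             # width of the Gaussian
    raw = [math.exp(-((k - m) ** 2) / (2 * s * s)) for k in range(length)]
    total = sum(raw)
    return [w / total for w in raw]

def alma_last(prices, weights):
    """ALMA of the newest bar only; returns None while there are too few bars."""
    n = len(weights)
    if len(prices) < n:
        return None
    window = prices[-n:]           # only the last `length` prices are needed
    return sum(p * w for p, w in zip(window, weights))

weights = alma_weights()           # computed once, reused for every new bar
closes = [100.0, 101.0, 102.0, 101.5, 103.0, 104.0, 103.5, 105.0, 106.0]
print(alma_last(closes, weights))  # value for the current last bar
closes.append(107.0)               # a new bar arrives
print(alma_last(closes, weights))  # only the newest window is evaluated
```

With this layout the cost per new bar is `length` multiplications, independent of how much history is stored.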
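But more important is: "What about the results of ta.alma?" What causes such a large difference? Am I using an old version of the library, given that the bug is stated (and verified) to have been fixed on August 15, 2021? How can I check?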
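A minimal sketch for checking which version is installed, using only the standard library (Python 3.8+). The distribution names tried below are assumptions about how the package was installed. A release published before the fix date cannot contain the fix, but a version number alone does not prove that the fix is present, for example when the package was installed from a fork.

```python
from importlib import metadata

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

# Both spellings are tried because the installed name may differ (assumption).
for name in ("pandas_ta", "pandas-ta"):
    print(name, "->", installed_version(name))
```

The printed version can then be compared with the release history of the project to see whether it predates the fix.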
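In the attached image you can see how nicely the blue alma line on TV follows the candles.
With Pandas ta.alma and "offset=0", however, the alma line looks shifted. I tried "offset=-9" and "offset=-10" because I use "length=9", but the results were still far from TV. Only with "offset=-5" are the results almost the same as on TV. Why should I have to use this value?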
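The following is a hypothetical check, not a diagnosis of pandas_ta. It assumes that a faulty implementation applies the same Gaussian weights in the opposite order (heaviest weights on the oldest bars), which the post and the linked pull request are not confirmed to describe. Under that assumption, the average age of the prices inside the window can be compared for both orders, using length=9, sigma=6 and a distribution offset of 0.85. `alma_weights` and `average_lag` are illustrative names.

```python
import math

def alma_weights(length=9, sigma=6.0, offset=0.85):
    """Normalized Gaussian weights, oldest bar first (hypothetical helper)."""
    m = offset * (length - 1)
    s = length / sigma
    raw = [math.exp(-((k - m) ** 2) / (2 * s * s)) for k in range(length)]
    total = sum(raw)
    return [w / total for w in raw]

def average_lag(weights_oldest_first):
    """Weighted average age in bars (0 = newest bar) of the prices in the window."""
    n = len(weights_oldest_first)
    return sum(w * (n - 1 - k) for k, w in enumerate(weights_oldest_first))

w = alma_weights()
lag_intended = average_lag(w)         # heaviest weights near the newest bar
lag_reversed = average_lag(w[::-1])   # HYPOTHETICAL: same weights, opposite order
extra_lag = lag_reversed - lag_intended
print(round(lag_intended, 2), round(lag_reversed, 2), round(extra_lag, 2))
```

Under this assumption the reversed order adds roughly five bars of lag, which would be in line with "offset=-5" looking closest to TV. It remains only one possible explanation.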
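I also tried with and without df.fillna(0), but it made little difference (apart from the unwanted lines at the start of the chart).
Something must be wrong, but what exactly? As it stands, despite all the positive things, I cannot use it. Any help would be much appreciated.
My code demonstrates the problem if you change 'barsOffset = -5':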
from ib_insync import *
from datetime import datetime
import pandas_ta as ta # TA-lib https://youtu.be/lij39o0_L2I and https://youtu.be/W_kKPp9LEFY
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
pd.set_option('display.max_rows', None) # Display all rows
pd.set_option('display.max_columns', 25)
pd.set_option('display.width', 1000) # equivalent with pd.options.display.width = 1000
pd.options.display.width = 0 # Panda will autodetect the size of the terminal window (and display columns accordingly)
pd.options.display.max_colwidth = 50 # Maximum width of a column (rest will be replaced by ...)
ib = IB()
ib.connect('127.0.0.1', 7497, clientId=0)
duration = "1 D" # How far back in history should data be retrieved
barSize = "1 min" # Timeframe of the bars
contract = Index(symbol="EOE", exchange="FTA", currency="EUR")
ib.qualifyContracts(contract)
ticker = contract.symbol
bars = ib.reqHistoricalData(contract,            # the contract of interest
                            endDateTime= '',             # retrieve bars up until current time
                            keepUpToDate=True,           # keeping the data up to date requires endDateTime= ''
                            formatDate= 1,               # formatDate=2 => for intraday => UTC
                            durationStr= duration,       # time span of all bars ('1 D')
                            barSizeSetting= barSize,     # time period of 1 bar ('1 min')
                            whatToShow= 'trades',        # source of the data ('trades' => actual trades)
                            useRTH= True)                # show only data from Regular Trading Hours
df = util.df(bars) # Create a Pandas dataframe from the bars
distAlma = .85 # Default distribution factor
sigmaAlma = 6 # Default sigma
barsAlma = 9 # Use 9 bars for calculating alma
barsOffset = -5 # Not present in TV; it turns out it must be -5 or -6 to give results comparable with TV
dfHA = df.ta.ha() # Fastest calculation of Heikin Ashi
df = pd.concat([df, dfHA], axis=1) # Add columns generated by ta.ha to df
df["alma"] = dfHA.ta.alma(close="HA_close", length=barsAlma, sigma= sigmaAlma, offset_distribution= distAlma, offset=barsOffset)
df.fillna(0, inplace=True) # Replace all NaN with 0 to avoid problems while comparing values but causes drawing lines for the first bars
strategy = 1 # Discriminates between the type of candles to show combined with the strategy to use
if strategy == 1:
    openPrices = df.HA_open
    highPrices = df.HA_high
    lowPrices = df.HA_low
    closePrices = df.HA_close
    typeCandles = "HA-candles"
else:
    openPrices = df.open
    highPrices = df.high
    lowPrices = df.low
    closePrices = df.close
    typeCandles = "Price"
datePrices = df["date"]
fig = go.Figure()
fig.add_trace(go.Candlestick(name= typeCandles, x= datePrices,
                             open= openPrices, high= highPrices, low= lowPrices, close= closePrices))
fig.add_trace(go.Scatter(x=datePrices, y=df.alma, mode='lines', name='alma', line_color="rgb(0,0,255)", fill=None))
maxRange = df.HA_high.max() # Maximum price
minRange = df.HA_low[df.HA_low>0].min() # Minimum price skipping NaN and zero values
extraRange = 0.05 * (maxRange - minRange) # Take 5% extra space for the range above and below
layoutFigures = dict({
    'title': ticker + " HA Candles",
    'xaxis_title': "Date",
    'yaxis_title': "Price",
    'template': "plotly_dark",                    # Dark background
    'dragmode': "pan",                            # Start in "pan" mode
    'yaxis_range': [minRange - extraRange, maxRange + extraRange],  # Set y-axis range from 5% below the lowest to 5% above the highest price
    'xaxis_rangeslider_visible': False            # Hide the zoom slider window
})
fig.update_layout(layoutFigures) # Apply all layout specifications
fig.update_yaxes(fixedrange=False) # Allow changing the yaxes
configFigures = dict({                            # https://plotly.com/python/configuration-options/#enabling-scroll-zoom
    'scrollZoom': True,                           # Use the mousewheel for zooming
    'displayModeBar': True,                       # Show the modeBar while hovering over it (False = invisible)
    'displaylogo': False,                         # Hide plotly logo from modeBar
    'staticPlot': False,                          # Make a dynamic plot (do NOT create a static plot)
    'toImageButtonOptions': {
        'format': 'jpeg',                         # one of png, svg, jpeg, webp --> jpeg chosen
        'filename': ticker + "_" + datetime.now().strftime("%Y%m%d-%H%M%S"),  # Ticker name plus timestamp is used as filename
        'height': 500, 'width': 700, 'scale': 1   # scale multiplies title/legend/axis/canvas sizes by this factor
    },
    'modeBarButtonsToRemove': ['lasso2d', 'select2d']  # Plotly's button names are lowercase
})
fig.show(config= configFigures)
ib.disconnect()
I found some references for coding alma and implemented them for comparison.
# Tradingview => https://www.tradingcode.net/tradingview/arnaud-legoux-average/
# and https://www.tradingview.com/support/solutions/43000594683-arnaud-legoux-moving-average/#:~:text=To%20calculate%20the%20Arnaud%20Legoux,a%20width%20determined%20by%20sigma.
# and https://www.tradingview.com/script/rqxaTb6E-ALMA-Function-FN-Arnaud-Legoux-Moving-Average/
# Github bug described => https://github.com/twopirllc/pandas-ta/pull/374
# Github fix aug 15th, 2021 => https://github.com/twopirllc/pandas-ta/pull/374/commits/752b69e86e19db64cdf161981d0ad8c897efefea
# Original implementation => https://www.sierrachart.com/SupportBoard.php?PostID=231318#P231318
# Prorealcode implementation => https://www.prorealcode.com/prorealtime-indicators/alma-arnaud-legoux-moving-average/
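I also wrote my own implementation and optimized the code for speed. It turns out that my implementations of the TV code and the ProRealCode code, and my own (unoptimized and optimized) versions, give exactly the same results, up to seven digits. The two Pandas codes give different results.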
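A comparison of this kind can be sketched as below. The helper `max_abs_diff` and the sample numbers are invented for illustration; the sketch assumes each implementation returns a sequence in which warm-up bars are either None or NaN.

```python
import math

def max_abs_diff(a, b):
    """Largest absolute difference over positions where both values are real numbers."""
    worst = 0.0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue                      # warm-up bars without a value
        if math.isnan(x) or math.isnan(y):
            continue                      # NaN placeholders are skipped as well
        worst = max(worst, abs(x - y))
    return worst

# Made-up sample values, only to show the call
reference = [None, None, 101.2345678, 102.3456789]
candidate = [float("nan"), 100.0, 101.2345679, 102.3456789]
print(max_abs_diff(reference, candidate) < 1e-6)
```

Choosing the tolerance (here 1e-6) decides how many digits count as "the same".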
The code I use to call ta.alma is nothing more than
def AlmaTA(self, df, data= "HA_close"):
    startTime = time.time()
    for test in range(0, self.counter):   # Repeated only to time the call
        alma = df.ta.alma(close= data, length= 9, sigma= 6, offset_distribution= 0.85)
    dif = time.time() - startTime
    print("AlmaTA ", dif)
    return alma
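This code gives results that do not match the supplied data points (see the post above). Using the "offset" parameter when calling ta.alma does not solve the problem. In fact, I think it is fundamentally wrong: the indicator itself has no "offset" parameter, so Pandas is mixing the calculation of the result with how it is lined up against the input for plotting. It also introduces lag if offset < 0 and leaves the last data points without an alma value. What is the use of a zero-lag indicator like alma if you introduce lag through that offset parameter?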
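The effect on the last data points can be illustrated with a small stand-in. It assumes that the `offset` argument simply shifts the finished result along the index in the way `pandas.Series.shift` does, which is an assumption about pandas_ta and is not confirmed in the post. The `shift` function below is a pure-Python imitation written for this example, with None in place of NaN.

```python
def shift(values, periods, fill=None):
    """Shift a list like pandas.Series.shift: positive moves values later, negative earlier."""
    n = len(values)
    if periods == 0:
        return list(values)
    if abs(periods) >= n:
        return [fill] * n                 # everything is shifted out of range
    if periods > 0:
        return [fill] * periods + list(values[:n - periods])
    return list(values[-periods:]) + [fill] * (-periods)

alma = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
print(shift(alma, -5))   # → [15.0, 16.0, 17.0, None, None, None, None, None]
```

With a shift of -5, the five most recent bars end up without a value, which matches the missing last data points described above.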
My code (and my implementations of the TV and ProRealCode code) fit the supplied data very well and look just like what I see on TV.
My ProRealCode implementation is
def AlmaReal(self, df, data= "HA_close", recalc= False):
    # Prorealcode implementation => https://www.prorealcode.com/prorealtime-indicators/alma-arnaud-legoux-moving-average/
    startTime = time.time()
    for test in range(0, self.counter):              # Repeated only to time the calculation
        prices = df[data]
        m = self.sensitivity * (self.bars - 1)       # Position of the Gaussian peak in the window
        s = self.bars / self.sigma                   # Width of the Gaussian
        size = len(prices)                           # Number of prices
        alma = [None] * size                         # Create the array (actually a list as Python has no arrays)
        for pos in range(self.bars-1, size):         # With 9 bars alma[8] can be calculated well
            WtdSum = 0
            CumWt = 0
            for k in range(0, self.bars):            # k = 0 is the oldest bar in the window
                Wtd = math.exp(-((k-m)*(k-m))/(2*s*s))
                WtdSum = WtdSum + Wtd * prices[pos - (self.bars - 1 - k)]
                CumWt = CumWt + Wtd
            alma[pos] = WtdSum / CumWt
    dif = time.time() - startTime
    print("AlmaReal ", dif)
    return alma
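Written as a formula, the loop in AlmaReal computes the following. The symbols are introduced here for illustration: $N$ stands for `self.bars`, $o$ for `self.sensitivity`, $\sigma$ for `self.sigma`, and $p_t$ for the price at bar $t$.

```latex
\[
m = o\,(N-1), \qquad s = \frac{N}{\sigma}, \qquad
w_k = \exp\!\left(-\frac{(k-m)^2}{2s^2}\right), \quad k = 0,\dots,N-1
\]
\[
\mathrm{ALMA}_t = \frac{\sum_{k=0}^{N-1} w_k \, p_{t-(N-1-k)}}{\sum_{k=0}^{N-1} w_k}
\]
```

Here $k = N-1$ refers to the newest bar in the window, so with $o = 0.85$ the largest weights fall on the most recent prices.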
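My implementation of the TV code looks similar. Of course I had to add a loop to calculate all rows of the dataset; TV handles that very differently.
I have put all implementations in one class and can call them to calculate either a single alma value (by passing recalc= False) or the whole dataset (by passing recalc= True). I use recalc=True after reading the historical data and recalc=False after retrieving the data for a new bar.
My optimized (and fastest) code is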
def alma(self, df, data= "HA_close", recalc= False):
    # startTime = time.time()
    # recalc = True                                  # For compatibility recalc can be set to false
    for test in range(0, self.counter):
        prices = df[data]
        maxIndex = self.bars - 1                     # The maximum index of weight
        size = len(prices)                           # Number of prices
        calc = [None] * size                         # Create the array (actually a list as Python has no arrays)
        if size >= self.bars:                        # Normal situation
            barStart = maxIndex if recalc else (size - 1)  # Start at the first complete window (recalc) or at the last bar only
            for pos in range(barStart, size):        # Loop stops BEFORE size (because of the range-object used)
                sumAll = 0
                x = pos - maxIndex                   # Index of the oldest bar in the window
                for bar in range(0, self.bars):      # What remains inside the loop can be easily calculated in parallel
                    sumAll += prices[x] * self.weight[bar]  # Time consuming exp-call removed from the loop
                    x += 1
                calc[pos] = sumAll                   # As weights have been normalized at initialization
                # print("alma[", pos, "]= ", calc[pos])
        # Handle the bars BEFORE self.bars => COULD be skipped if desired for compatibility reasons
        if size == 1:                                # First bar
            calc[0] = prices[0]                      # Best possible result for alma[0] is the price itself (keeps calc a list)
        elif recalc or (size < self.bars):           # Use smaller number of bars for alma until self.bars is possible
            calc[0] = prices[0]                      # Best possible result for alma[0] is the price itself
            for pos in range(1, min(size, self.bars-1)):  # Bounded by size so short datasets do not index past the end
                m = self.sensitivity * (pos - 1)
                s = min(1, self.sigma / pos)         # Avoids s being larger than the number of bars
                x = -m * s
                sumAll = 0
                sumWeight = 0                        # Weight must be recalculated if the number of bars changed
                for bar in range(0, pos+1):
                    weight = math.exp(-0.5 * x * x)
                    sumWeight += weight
                    sumAll += prices[bar] * weight
                    x = x + s
                calc[pos] = sumAll / sumWeight
                # print("alma[", pos, "]= ", calc[pos])
    #
    # dif = time.time() - startTime
    # print("Alma ", dif)
    return calc
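This uses data that is calculated once, when the instance is initialized.
None of the code I have seen calculates alma values for the first few bars, because not enough bars are available yet. If you look at the result of each alma calculation, you will see that the calculated value is very close to the value of the latest bar (of course: alma aims to be a zero-lag indicator). To me it makes sense to calculate a "best possible" value and use it to plot the first points. That is why I added that feature to my code.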
import math
import time   # Used by the timing code in the methods shown above

class MyAlma():
    # https://capital.com/arnaud-legoux-moving-average#:~:text=ALMA%20is%20a%20technical%20analysis,points%20within%20the%20specified%20period.
    def __init__(self, df, bars= 9, sigma= 6, sensitivity= 0.85):
        self.bars = bars                  # Number of bars taken into account
        self.sigma = sigma                # Standard deviation of the Gaussian filter (lower --> more false signals)
        self.sensitivity = sensitivity    # Smoothness of the Gaussian filter (between 0 [noisy] and 1 [lagging])
        self.weight = None                # How much each bar counts in the calculation
        self.setParameters(bars= self.bars, sigma= self.sigma, sensitivity= self.sensitivity)
        self.counter = 1 #50              # Just for testing the speed of the different methods
        # df_Alma = self.AlmaTest(df)
        # df_Alma.to_excel("T:/dfAlma.xlsx")

    def getParameters(self):
        return [self.bars, self.sigma, self.sensitivity]

    def setParameters(self, bars= 9, sigma= 6, sensitivity= 0.85):
        error = (int(bars + 0.00001) < 1) or (sigma < 1) or (sigma > bars) or (sensitivity < 0) or (sensitivity > 1)
        if not error:
            self.bars = int(bars + 0.00001)       # Floating point protection
            self.sigma = sigma
            self.sensitivity = sensitivity
            # Calculate the weights of each bar
            m = self.sensitivity * (self.bars - 1)
            s = self.sigma / self.bars            # Avoids dividing later
            s = s * math.sqrt(0.5)                # Math a * b * b = (sqrt(a) * b) * (sqrt(a) * b)
            x = -m * s                            # Addition instead of multiplication later
            factor = 0
            self.weight = [None] * self.bars      # Idea: calculate this once and store it in the instance
            if self.bars == 1:
                self.weight[0] = 1
            else:
                for bar in range(0, self.bars):   # Calculates 0, 1, 2, 3, 4, 5, 6, 7, 8 if self.bars == 9
                    y = math.exp(-x*x)            # Use a local variable instead of referring weight[bar] two times
                    factor += y                   # Total sum of all weights
                    self.weight[bar] = y          # Weight for this bar
                    x = x + s                     # No need for a multiplication here
                factor = 1 / factor               # Avoid several divides while normalizing
                for bar in range(0, self.bars):
                    self.weight[bar] *= factor    # Normalized weights avoid a division while calculating alma
        return error                              # True if the parameters were rejected