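I'm creating some test data, which requires me to calculate a percentage.
In my predicate I exclude any records that would cause a divide-by-zero error, and when I run the SQL query against that data set, everything works fine.
Total records generated (all combinations): 92,345,408
Total records excluding the divide-by-zero cases: 92,141,104
When I add the "qualify for Use Case 1" condition, the query still executes without error. However, when I also add "Use Case 2" to my predicate, I get a divide-by-zero error. I don't understand how this can happen, since I exclude that condition: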
WHERE CAST(m1.MoneyValue1 AS FLOAT) - CAST(m2.MoneyValue2 AS FLOAT) != 0
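Below is the code where I create 3 different dollar-value columns (DECIMAL(18,2)), one per table variable, and then use CROSS APPLY to get all possible combinations.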
DECLARE @Money1 TABLE
(
ID INT IDENTITY (1,1) NOT NULL,
MoneyValue1 DECIMAL (18,2) NOT NULL
)
DECLARE @Money2 TABLE
(
ID INT IDENTITY (1,1) NOT NULL,
MoneyValue2 DECIMAL (18,2) NOT NULL
)
DECLARE @Money3 TABLE
(
ID INT IDENTITY (1,1) NOT NULL,
MoneyValue3 DECIMAL (18,2) NOT NULL
)
DECLARE @stop DECIMAL(18,2) = 2000.00 -- capping the maximum test value at $2000.00
DECLARE @interval FLOAT = 4.43 -- adding a random dollar amount to create variability and several test values
DECLARE @MoneyValue DECIMAL (18,2) = 0 -- for my test, I don't care about negative dollar amounts
WHILE @MoneyValue < @stop
BEGIN
INSERT INTO @Money1
(
MoneyValue1
)
SELECT CAST(@MoneyValue AS DECIMAL(18,2))
SET @MoneyValue = CAST(@MoneyValue AS FLOAT) + CAST(@interval AS FLOAT)
END
INSERT INTO @Money2 -- use the same values generated by the statement above for my second Money column
(
MoneyValue2
)
SELECT
CAST(MoneyValue1 AS DECIMAL(18,2))
FROM @Money1
INSERT INTO @Money3 -- use the same values generated by the statement above for my third Money column
(
MoneyValue3
)
SELECT
CAST(MoneyValue1 AS DECIMAL(18,2))
FROM @Money1
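As a sanity check on the row counts quoted at the top, the arithmetic can be reproduced without building the tables. This is a minimal sketch, assuming the loop inserts 0, 4.43, 8.86, ... for as long as the value stays below 2000.00, and that a zero divisor occurs exactly when MoneyValue1 equals MoneyValue2. The variable @n is illustrative and not part of the original script.

```sql
-- Assumption: values are k * 4.43 for k = 0, 1, 2, ... while the value is < 2000.00,
-- so the number of rows per table variable is CEILING(2000.00 / 4.43).
DECLARE @n BIGINT = CAST(CEILING(2000.00 / 4.43) AS BIGINT);

SELECT
    @n                     AS RowsPerTable,          -- 452
    @n * @n * @n           AS AllCombinations,       -- 92345408
    @n * @n * @n - @n * @n AS ExcludingZeroDivisor;  -- 92141104
```

The difference between the two totals is 452 * 452 = 204,304 rows: one matching MoneyValue2 for each MoneyValue1, paired with every MoneyValue3.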
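Next, I want to create a random sample of 10 rows; the Calc column is there to show the value for Use Case 1 (see the predicate examples below for the predicate that causes the error).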
SELECT TOP 10
m1.MoneyValue1 AS TotalPmt,
m2.MoneyValue2 AS TotalPmtChange,
m3.MoneyValue3 AS PmtChangeAmount
,CAST(m2.MoneyValue2 AS FLOAT) / (CAST(m1.MoneyValue1 - m2.MoneyValue2 AS FLOAT)) AS Calc
FROM @Money1 AS m1
CROSS APPLY @Money2 AS m2
CROSS APPLY @Money3 AS m3
WHERE CAST(m1.MoneyValue1 AS FLOAT) - CAST(m2.MoneyValue2 AS FLOAT) != 0 -- exclude the possibility of a divide by zero error
ORDER BY NEWID()
If I change the predicate to also require Use Case 1 only, the query again executes without error.
WHERE CAST(m1.MoneyValue1 AS FLOAT) - CAST(m2.MoneyValue2 AS FLOAT) != 0 -- exclude the possibility of a divide by zero error
AND CAST(m2.MoneyValue2 AS FLOAT) / (CAST(m1.MoneyValue1 - m2.MoneyValue2 AS FLOAT)) > .1 -- qualify for Use Case 1
ORDER BY NEWID()
However, if I change the predicate to include both the Use Case 1 and Use Case 2 conditions, I now get a divide-by-zero error!
WHERE CAST(m1.MoneyValue1 AS FLOAT) - CAST(m2.MoneyValue2 AS FLOAT) != 0 -- exclude the possibility of a divide by zero error
AND CAST(m2.MoneyValue2 AS FLOAT) / (CAST(m1.MoneyValue1 - m2.MoneyValue2 AS FLOAT)) > .1 -- qualify for Use Case 1
AND CAST(m3.MoneyValue3 AS FLOAT) / (CAST(m1.MoneyValue1 - m2.MoneyValue2 AS FLOAT)) > .1 -- qualify for Use Case 2
Messages from SSMS:
(452 row(s) affected)
Msg 8134, Level 16, State 1, Line 58
Divide by zero error encountered.
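While I can't necessarily point to the mechanism of the failure, I can say that once I moved the data to physical tables, the divide-by-zero error stopped occurring.
A post that supports table variables being the cause: When should I use a table variable vs temporary table in sql server?
Perhaps the inability to create and maintain statistics on table variables leads the engine to hit the divide-by-zero rows. Another possibility is that SQL Server cannot see the true cardinality of a table variable, i.e. it estimates that the table variable outputs a single row.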
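The move away from table variables could look roughly like the following. This is only a sketch under stated assumptions: the temp-table names #Money1, #Money2 and #Money3 are illustrative, @interval is declared as DECIMAL rather than FLOAT for simplicity, and CROSS JOIN replaces CROSS APPLY since no correlation is involved. Materializing the data gives the optimizer real row counts and statistics; it is not a documented guarantee that the zero-divisor filter is evaluated before the divisions.

```sql
-- Illustrative temp tables standing in for @Money1..@Money3 (names are assumptions).
CREATE TABLE #Money1 (ID INT IDENTITY(1,1) NOT NULL, MoneyValue1 DECIMAL(18,2) NOT NULL);
CREATE TABLE #Money2 (ID INT IDENTITY(1,1) NOT NULL, MoneyValue2 DECIMAL(18,2) NOT NULL);
CREATE TABLE #Money3 (ID INT IDENTITY(1,1) NOT NULL, MoneyValue3 DECIMAL(18,2) NOT NULL);

DECLARE @stop       DECIMAL(18,2) = 2000.00;  -- maximum test value
DECLARE @interval   DECIMAL(18,2) = 4.43;     -- step between test values
DECLARE @MoneyValue DECIMAL(18,2) = 0;

-- Same value series as the original loop: 0, 4.43, 8.86, ...
WHILE @MoneyValue < @stop
BEGIN
    INSERT INTO #Money1 (MoneyValue1) VALUES (@MoneyValue);
    SET @MoneyValue = @MoneyValue + @interval;
END;

INSERT INTO #Money2 (MoneyValue2) SELECT MoneyValue1 FROM #Money1;
INSERT INTO #Money3 (MoneyValue3) SELECT MoneyValue1 FROM #Money1;

-- The query with both use-case conditions, now against the temp tables.
SELECT TOP 10
    m1.MoneyValue1 AS TotalPmt,
    m2.MoneyValue2 AS TotalPmtChange,
    m3.MoneyValue3 AS PmtChangeAmount
FROM #Money1 AS m1
CROSS JOIN #Money2 AS m2
CROSS JOIN #Money3 AS m3
WHERE CAST(m1.MoneyValue1 AS FLOAT) - CAST(m2.MoneyValue2 AS FLOAT) != 0
  AND CAST(m2.MoneyValue2 AS FLOAT) / CAST(m1.MoneyValue1 - m2.MoneyValue2 AS FLOAT) > .1
  AND CAST(m3.MoneyValue3 AS FLOAT) / CAST(m1.MoneyValue1 - m2.MoneyValue2 AS FLOAT) > .1
ORDER BY NEWID();

DROP TABLE #Money1, #Money2, #Money3;
```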
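One interesting thing I found via the link above, which references this post: What's the difference between a temp table and table variable in SQL Server?
No column statistics
Having a more accurate table cardinality doesn't mean the estimated row count will be any more accurate, however (unless doing an operation on all rows in the table). SQL Server does not maintain column statistics for table variables at all, so it falls back on guesses based on the comparison predicate (e.g. that 10% of the table will be returned for an = against a non-unique column, or 30% for a > comparison). In contrast, column statistics are maintained for #temp tables.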
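Whatever the underlying cause, the solution I found traces the original problem (a divide-by-zero error even though my predicate explicitly excludes that possibility) back to being a byproduct of using table variables whose combinations run to millions of records.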
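Separately from the table-variable explanation, a common defensive pattern in T-SQL is to wrap the divisor in NULLIF, so that the division itself cannot raise error 8134 whichever order the predicates happen to be evaluated in. The sketch below is an illustration rather than part of the original fix; the table variable @t and its columns a and b are hypothetical stand-ins for MoneyValue1 and MoneyValue2.

```sql
-- Hypothetical three-row sample; the second row has a zero divisor (a - b = 0).
DECLARE @t TABLE (a DECIMAL(18,2) NOT NULL, b DECIMAL(18,2) NOT NULL);
INSERT INTO @t (a, b) VALUES (10.00, 4.00), (5.00, 5.00), (8.00, 1.00);

-- NULLIF(x, 0) returns NULL when x = 0, so the division yields NULL instead of
-- raising an error, and "NULL > .1" is not true, so that row is filtered out.
SELECT
    a,
    b,
    CAST(b AS FLOAT) / NULLIF(CAST(a - b AS FLOAT), 0) AS Calc
FROM @t
WHERE CAST(b AS FLOAT) / NULLIF(CAST(a - b AS FLOAT), 0) > .1;
```

With this sample, the rows (10.00, 4.00) and (8.00, 1.00) are returned and the (5.00, 5.00) row is dropped, without a separate `!= 0` predicate.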