Why doesn't my JOIN eliminate the NULL rows before the SUM?


I have the two tables referenced in the SQL below. When it runs, I get an arithmetic overflow error:

DECLARE @IssueDate DATETIME = GETDATE();
DECLARE @MaxDate DATETIME = '9999-12-31 23:59:59';

SELECT ROW_NUMBER() OVER(ORDER BY LPT.CustomerId, ISNULL(LPT.ExpiryDate, @MaxDate)) AS Id,
    LPT.CustomerId,
    LPT.ExpiryDate,
    SUM(LPT.PointsValue) AS PointsValue
FROM dbo.LoyaltyPointsTransaction AS LPT
JOIN dbo.Customer AS c ON c.CustomerId = LPT.CustomerId
WHERE (LPT.ExpiryDate IS NULL OR LPT.ExpiryDate > @IssueDate)
GROUP BY LPT.CustomerId,
        LPT.ExpiryDate
HAVING SUM(LPT.PointsValue) <> 0
ORDER BY SUM(LPT.PointsValue) DESC

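LoyaltyPointsTransaction contains rows with large values that could cause an overflow when summed, but all of those rows have a NULL CustomerId.

The Customer table only contains rows with a CustomerId (no NULLs), so shouldn't the JOIN already have filtered those rows out of LoyaltyPointsTransaction?

I don't understand why it doesn't filter out the NULL rows before performing the SUM.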

sql-server t-sql
1 Answer

First, let's assume the tables are structured as follows, and then populate them with some synthetic data to make sure the error occurs.

CREATE TABLE dbo.Customer (
    CustomerId INT PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL,
    Email NVARCHAR(100) NULL,
    CreatedAt DATETIME DEFAULT GETDATE()
);
GO

CREATE TABLE dbo.LoyaltyPointsTransaction (
    TransactionId INT PRIMARY KEY IDENTITY(1,1),
    CustomerId INT NULL,        -- NULL is allowed; the large-value rows below use it
    PointsValue INT NOT NULL,   -- INT, so SUM(PointsValue) is also typed as INT
    ExpiryDate DATETIME NULL,
    TransactionDate DATETIME DEFAULT GETDATE()
);
GO


INSERT INTO dbo.Customer (CustomerId, CustomerName, Email) VALUES 
(1, 'John Doe', '[email protected]'),
(2, 'Jane Smith', '[email protected]');
GO
INSERT INTO dbo.LoyaltyPointsTransaction (CustomerId, PointsValue, ExpiryDate) VALUES 
(1, 100, '2025-12-31'), 
(1, 50, NULL), 
(2, 200, '2023-12-31'), 
(2, -50, NULL),
(NULL, 2147483641, NULL),
(NULL, 2147483641, NULL); 
GO 5000
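
As a quick sanity check before rerunning the query, the sketch below sums only the rows with a NULL CustomerId, widening to BIGINT so the check itself cannot overflow. It assumes the tables hold just the synthetic data loaded above (note that GO 5000 repeats the preceding batch 5000 times in SSMS or sqlcmd); the column aliases are illustrative.

```sql
-- Sum only the rows that the JOIN to dbo.Customer would discard.
-- CAST to BIGINT so this diagnostic query does not overflow itself.
SELECT COUNT(*) AS NullCustomerRows,
       SUM(CAST(LPT.PointsValue AS BIGINT)) AS TotalAsBigInt,
       CAST(2147483647 AS BIGINT) AS IntMaxValue  -- largest value an INT can hold
FROM dbo.LoyaltyPointsTransaction AS LPT
WHERE LPT.CustomerId IS NULL;
```

If TotalAsBigInt is greater than IntMaxValue, an INT-typed SUM over those rows will overflow whenever it is evaluated before the join has removed them.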

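Running the original query again after loading this data reproduces the same error; it did for me. Now let's look at the estimated query plan.

[Image: estimated query plan for the original query]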

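As the plan shows, the query optimizer reads the LoyaltyPointsTransaction table first and decides that the cheapest approach is to aggregate it and only then join the result to the Customer table. SQL Server does not guarantee that a join is evaluated before an aggregate; the optimizer is free to reorder them as long as the result is logically equivalent. Here the SUM therefore runs before the join has had a chance to eliminate the rows with a NULL CustomerId, and that is what causes the arithmetic overflow error.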
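
One way to get the estimated plan as XML rather than a screenshot is SET SHOWPLAN_XML, which compiles the batch without executing it. This is a sketch for SSMS or sqlcmd, where GO separates batches; the plan you see depends on your data, statistics, and SQL Server version, so it may differ from the one described above.

```sql
SET SHOWPLAN_XML ON;   -- must be the only statement in its batch
GO

-- The original query: it is compiled but not executed while SHOWPLAN_XML is ON.
DECLARE @IssueDate DATETIME = GETDATE();
DECLARE @MaxDate DATETIME = '9999-12-31 23:59:59';

SELECT ROW_NUMBER() OVER(ORDER BY LPT.CustomerId, ISNULL(LPT.ExpiryDate, @MaxDate)) AS Id,
    LPT.CustomerId,
    LPT.ExpiryDate,
    SUM(LPT.PointsValue) AS PointsValue
FROM dbo.LoyaltyPointsTransaction AS LPT
JOIN dbo.Customer AS c ON c.CustomerId = LPT.CustomerId
WHERE (LPT.ExpiryDate IS NULL OR LPT.ExpiryDate > @IssueDate)
GROUP BY LPT.CustomerId,
        LPT.ExpiryDate
HAVING SUM(LPT.PointsValue) <> 0
ORDER BY SUM(LPT.PointsValue) DESC;
GO

SET SHOWPLAN_XML OFF;  -- restore normal execution
GO
```

In the returned plan, the thing to look for is whether the aggregate operator (a Stream Aggregate or Hash Match aggregate) sits below the join to dbo.Customer, that is, closer to the scan of dbo.LoyaltyPointsTransaction.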

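To work around this, we can give the optimizer a join hint, so that it joins the Customer and LoyaltyPointsTransaction tables first and only then performs the aggregation. The query then looks like this: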

DECLARE @IssueDate DATETIME = GETDATE();
DECLARE @MaxDate DATETIME = '9999-12-31 23:59:59';

SELECT ROW_NUMBER() OVER(ORDER BY LPT.CustomerId, ISNULL(LPT.ExpiryDate, @MaxDate)) AS Id,
    LPT.CustomerId,
    LPT.ExpiryDate,
    SUM(LPT.PointsValue) AS PointsValue
FROM dbo.LoyaltyPointsTransaction AS LPT
INNER LOOP JOIN dbo.Customer AS c ON c.CustomerId = LPT.CustomerId
WHERE (LPT.ExpiryDate IS NULL OR LPT.ExpiryDate > @IssueDate)
GROUP BY LPT.CustomerId,
        LPT.ExpiryDate
HAVING SUM(LPT.PointsValue) <> 0
ORDER BY SUM(LPT.PointsValue) DESC

Now let's look at the plan for this query:

[Image: query plan for the query with the join hint]
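
A join hint only steers the plan, so as an alternative that does not depend on the order of the operators, the SUM can be widened to BIGINT. This is a sketch and not part of the original answer; it assumes that every partial sum, including one over the NULL CustomerId rows, fits in BIGINT, and that a BIGINT PointsValue column in the result is acceptable to the caller.

```sql
DECLARE @IssueDate DATETIME = GETDATE();
DECLARE @MaxDate DATETIME = '9999-12-31 23:59:59';

SELECT ROW_NUMBER() OVER(ORDER BY LPT.CustomerId, ISNULL(LPT.ExpiryDate, @MaxDate)) AS Id,
    LPT.CustomerId,
    LPT.ExpiryDate,
    -- Widen each value before summing so the aggregate is computed as BIGINT,
    -- whether it runs before or after the join.
    SUM(CAST(LPT.PointsValue AS BIGINT)) AS PointsValue
FROM dbo.LoyaltyPointsTransaction AS LPT
JOIN dbo.Customer AS c ON c.CustomerId = LPT.CustomerId
WHERE (LPT.ExpiryDate IS NULL OR LPT.ExpiryDate > @IssueDate)
GROUP BY LPT.CustomerId,
        LPT.ExpiryDate
HAVING SUM(CAST(LPT.PointsValue AS BIGINT)) <> 0
ORDER BY SUM(CAST(LPT.PointsValue AS BIGINT)) DESC;
```

The rows with a NULL CustomerId still never appear in the result, because the inner join to dbo.Customer discards them; the cast only keeps an early aggregate over them from overflowing.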
