If this query:
SELECT CONCAT(SOURCE.OrderNo, '_', SOURCE.OrderLine),
SOURCE.ProdOrder,
SOURCE.Lvl1,
SOURCE.Lvl2,
SOURCE.Lvl3,
SOURCE.LastDate
FROM dbo.SourceTbl AS SOURCE
returns 11 records, and this query:
SELECT CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine),
TARGET.ProdOrder,
TARGET.Lvl1,
TARGET.Lvl2,
TARGET.Lvl3,
TARGET.LastDate
FROM dbo.TargetTbl AS TARGET
returns 17 records, and the INTERSECT between the two:
SELECT CONCAT(SOURCE.OrderNo, '_', SOURCE.OrderLine),
SOURCE.ProdOrder,
SOURCE.Lvl1,
SOURCE.Lvl2,
SOURCE.Lvl3,
SOURCE.LastDate
FROM dbo.SourceTbl AS SOURCE
INTERSECT
SELECT CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine),
TARGET.ProdOrder,
TARGET.Lvl1,
TARGET.Lvl2,
TARGET.Lvl3,
TARGET.LastDate
FROM dbo.TargetTbl AS TARGET
returns 9 records, then when I run a MERGE like this:
MERGE dbo.TargetTbl AS TARGET
USING (
SELECT OrderNo, OrderLine, CONCAT(OrderNo, '_', OrderLine) AS OrderNoLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3,
MAX(LastDate) AS LastDate
FROM dbo.SourceTbl
GROUP BY OrderNo, OrderLine, CONCAT(OrderNo, '_', OrderLine), SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3
) AS SOURCE
ON CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine) = OrderNoLine
AND TARGET.ProdOrder = SOURCE.ProdOrder
AND TARGET.Lvl1 = SOURCE.Lvl1
AND TARGET.Lvl2 = SOURCE.Lvl2
AND TARGET.Lvl3 = SOURCE.Lvl3
AND TARGET.LastDate = SOURCE.LastDate
WHEN MATCHED AND EXISTS (SELECT CONCAT(SOURCE.OrderNo, '_', SOURCE.OrderLine)
,SOURCE.ProdOrder
,SOURCE.Lvl1
,SOURCE.Lvl2
,SOURCE.Lvl3
,SOURCE.LastDate
INTERSECT
SELECT CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine)
,TARGET.ProdOrder
,TARGET.Lvl1
,TARGET.Lvl2
,TARGET.Lvl3
,TARGET.LastDate
)
THEN UPDATE SET TARGET.IsBlocked = 1, TARGET.BlockDate = GETDATE()
WHEN NOT MATCHED BY TARGET
THEN INSERT (LastDate, UsrID, DepID, OrderNo, OrderLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, IsBlocked, BlockDate)
VALUES (SOURCE.LastDate, 999, 999, SOURCE.OrderNo, SOURCE.OrderLine, SOURCE.SomeModel, SOURCE.ProdOrder, SOURCE.Lvl1, SOURCE.Lvl2, SOURCE.Lvl3, 1, GETDATE());
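According to this and this, it should UPDATE the 9 INTERSECT records in TargetTbl and INSERT the remaining 2 records from SourceTbl into the same table (11 in total). Instead, it updates 4 records and inserts 6 records (10 in total). Two records in SourceTbl are duplicates, which is why the total is 10 rather than 11, and which is also why I use MAX and GROUP BY.
I think the problem is in the first part of the query, the USING and ON part, which fails to handle NULLs correctly even though the INTERSECT part does its job. I have tried everything I could, without success. I am sure this is easy to achieve, so please help me. Thanks.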
Edit: here is the SourceTbl data, with irrelevant columns omitted, as returned by SELECT OrderNo, OrderLine, CONCAT(OrderNo, '_', OrderLine) AS OrderNoLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, LastDate AS LastDate FROM dbo.SourceTbl ORDER BY OrderNo, OrderLine, SomeModel, ProdOrder:
OrderNo OrderLine OrderNoLine SomeModel ProdOrder Lvl1 Lvl2 Lvl3 LastDate
123c08637 10 123c08637_10 4321525175_004321 A5C008837 Abcd Efgh Olol 04/03/2030
123c11214 10 123c11214_10 4321532622_000391 NULL NULL NULL NULL 07/07/2018
123c13039 10 123c13039_10 4321525175_002611 A5C014838 NULL NULL NULL 18/05/2018
123c16059 10 123c16059_10 4321541488_001111 A5C018611 NULL NULL NULL 18/05/2018
123c17482 10 123c17482_10 4321506480_001711 A5C019227 Asdf Ghjk Cvnm 12/12/2018
123c17482 10 123c17482_10 4321506480_001711 A5C047712 Asdf Ghjk Cvnm 12/12/2018
123c17482 20 123c17482_20 4321506480_001712 A5B072554 aaaa bbbb cccc 18/05/2018
123c17482 20 123c17482_20 4321506480_001712 A5B072554 aaaa bbbb cccc 18/05/2018
123c17482 20 123c17482_20 4321506480_001712 A5B072554 aaaa bbbb xxxx 18/05/2018
123c17482 20 123c17482_20 4321506480_001712 A5B200472 NULL NULL NULL 18/05/2018
123c32405 10 123c32405_10 8765525667_005301 NULL Qwer Uiop Tygh 12/12/2018
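The GROUP BY may reduce the number of records to just one (if the 11 records differ only in the LastDate column and SomeModel holds the same value for all 11 records), or it may return all 11 records (if SomeModel holds unique values), so the GROUP BY is not guaranteed to produce the 10 distinct rows you expect. To get those, use SELECT DISTINCT instead of grouping by a subset of the columns.
In addition, if the ON condition works as intended, the extra EXISTS condition is redundant. Evidently 4 matches were found and 6 records did not match. Of those 6, probably 2 records really have no match and 4 records fail to match because of NULL values.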
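To check that guess before changing the MERGE, a diagnostic query along the following lines may help. It is only a sketch: it reuses the table and column names from the question, while the aliases S and T and the HasNull flag are names introduced here for illustration. It lists the distinct source rows that find no partner in TargetTbl under plain equality and marks the ones that contain a NULL in a compared column.

```sql
-- Diagnostic sketch: source rows that the plain-equality ON condition cannot match.
-- HasNull = 1 marks rows where a NULL in a compared column may be the reason.
SELECT S.OrderNo, S.OrderLine, S.ProdOrder, S.Lvl1, S.Lvl2, S.Lvl3, S.LastDate,
       CASE WHEN S.ProdOrder IS NULL OR S.Lvl1 IS NULL OR S.Lvl2 IS NULL
                 OR S.Lvl3 IS NULL OR S.LastDate IS NULL
            THEN 1 ELSE 0 END AS HasNull
FROM (
    SELECT DISTINCT OrderNo, OrderLine, ProdOrder, Lvl1, Lvl2, Lvl3, LastDate
    FROM dbo.SourceTbl
) AS S
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.TargetTbl AS T
    -- Same comparisons as the ON clause in the question
    WHERE CONCAT(T.OrderNo, '_', T.OrderLine) = CONCAT(S.OrderNo, '_', S.OrderLine)
      AND T.ProdOrder = S.ProdOrder
      AND T.Lvl1 = S.Lvl1
      AND T.Lvl2 = S.Lvl2
      AND T.Lvl3 = S.Lvl3
      AND T.LastDate = S.LastDate
);
```

Rows that come back with HasNull = 1 are the candidates for "did not match only because of NULL"; rows with HasNull = 0 have no counterpart in TargetTbl at all.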
To handle the NULL values, I suggest changing the whole statement to:
MERGE dbo.TargetTbl AS TARGET
USING (
-- SomeModel is selected as well because the INSERT below reads it from SOURCE
SELECT DISTINCT OrderNo, OrderLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, LastDate
FROM dbo.SourceTbl
) AS SOURCE
ON (TARGET.OrderNo = SOURCE.OrderNo OR TARGET.OrderNo IS NULL AND SOURCE.OrderNo IS NULL)
AND (TARGET.OrderLine = SOURCE.OrderLine OR TARGET.OrderLine IS NULL AND SOURCE.OrderLine IS NULL)
AND (TARGET.ProdOrder = SOURCE.ProdOrder OR TARGET.ProdOrder IS NULL AND SOURCE.ProdOrder IS NULL)
AND (TARGET.Lvl1 = SOURCE.Lvl1 OR TARGET.Lvl1 IS NULL AND SOURCE.Lvl1 IS NULL)
AND (TARGET.Lvl2 = SOURCE.Lvl2 OR TARGET.Lvl2 IS NULL AND SOURCE.Lvl2 IS NULL)
AND (TARGET.Lvl3 = SOURCE.Lvl3 OR TARGET.Lvl3 IS NULL AND SOURCE.Lvl3 IS NULL)
AND (TARGET.LastDate = SOURCE.LastDate OR TARGET.LastDate IS NULL AND SOURCE.LastDate IS NULL)
WHEN MATCHED
THEN UPDATE SET TARGET.IsBlocked = 1, TARGET.BlockDate = GETDATE()
WHEN NOT MATCHED BY TARGET
THEN INSERT (LastDate, UsrID, DepID, OrderNo, OrderLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, IsBlocked, BlockDate)
VALUES (LastDate, 999, 999, OrderNo, OrderLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, 1, GETDATE());
Some features of the SQL language (notably DISTINCT and GROUP BY) are built on the concept of distinctness, under which NULL IS NOT DISTINCT FROM NULL is true. The same concept shows up in UNION, EXCEPT, INTERSECT and so on.
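A small self-contained illustration of that behaviour, as a sketch: the table variable @Demo and its single column are made up for the example and are not part of the question's schema.

```sql
-- Hypothetical demo data: two NULLs and one non-NULL value.
DECLARE @Demo TABLE (Lvl1 varchar(10) NULL);
INSERT INTO @Demo (Lvl1) VALUES (NULL), (NULL), ('Abcd');

-- GROUP BY puts both NULLs into a single group, so two groups come back:
-- one for NULL (2 rows) and one for 'Abcd' (1 row).
SELECT Lvl1, COUNT(*) AS RowsInGroup
FROM @Demo
GROUP BY Lvl1;

-- INTERSECT also treats NULL as not distinct from NULL:
-- this returns one row containing NULL.
SELECT CAST(NULL AS varchar(10)) AS Lvl1
INTERSECT
SELECT CAST(NULL AS varchar(10));
```

This is why the standalone INTERSECT in the question finds rows that contain NULLs.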
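Unfortunately, SQL Server itself has not yet implemented the IS (NOT) DISTINCT FROM operator from standard SQL, so you are stuck with equality comparison, where, famously in SQL, NULL = NULL is unknown (neither true nor false). You therefore have to do the NULL checks explicitly in the ON clause (until a future version of SQL Server supports the DISTINCT FROM operator).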
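The difference can be seen side by side in a short sketch. The variables @a and @b are placeholders invented for the example, and the results described in the comments assume the default SET ANSI_NULLS ON setting.

```sql
-- Two NULL values standing in for a source and a target column.
DECLARE @a varchar(10) = NULL, @b varchar(10) = NULL;

SELECT
    -- Plain equality: NULL = NULL is unknown, so the ELSE branch is taken.
    CASE WHEN @a = @b THEN 'true' ELSE 'not true' END AS EqualityResult,
    -- Explicit NULL check, the pattern used in an ON clause: true.
    CASE WHEN @a = @b OR (@a IS NULL AND @b IS NULL)
         THEN 'true' ELSE 'not true' END AS ExplicitNullCheck,
    -- INTERSECT compares by distinctness, so the NULLs match: true.
    CASE WHEN EXISTS (SELECT @a INTERSECT SELECT @b)
         THEN 'true' ELSE 'not true' END AS IntersectResult;
```

As far as I know, SQL Server 2022 did add IS [NOT] DISTINCT FROM, so on that version or later each parenthesised pair in an ON clause could presumably be shortened to the form TARGET.Lvl1 IS NOT DISTINCT FROM SOURCE.Lvl1. On older versions the explicit IS NULL checks remain the portable choice.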