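I have a dataset in which different submission IDs and submission datetimes are associated with invoice numbers. For each invoice number, I am trying to select only the row with the most recent submission datetime. Sample data is shown below; the desired result is to return only row 2.
The queries I have tried are below: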
SELECT TOP 1000 S.SubmissionID
,BookingID
,InvoiceNumber
,InvoiceDate
,NumberofServices
,SubmissionDateTime
FROM [dbo].[Submission] S
JOIN [dbo].[SubmissionInvoice] SI on S.SubmissionID = SI.SubmissionID
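INNER JOIN (SELECT MAX(SubmissionDateTime) AS maxdate FROM [dbo].[Submission]) m ON m.maxdate = S.SubmissionDateTime
This returns only the single record with the latest submission datetime overall, regardless of invoice number or submission ID.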
SELECT TOP 1000 S.SubmissionID
,BookingID
,InvoiceNumber
,InvoiceDate
,NumberofServices
,SubmissionDateTime
FROM [dbo].[Submission] S
JOIN [dbo].[SubmissionInvoice] SI on S.SubmissionID = SI.SubmissionID
and SubmissionDateTime = (
SELECT MAX (SubmissionDateTime)
FROM [dbo].[Submission] as B
WHERE S.SubmissionID = B.SubmissionID)
This max-date filter does not seem to do anything; the results look exactly as if I had not included that part of the query at all.
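A likely reason the filter has no effect: the subquery is correlated on SubmissionID. If SubmissionID uniquely identifies a row in Submission, the MAX is computed over a single row and always equals that row's own SubmissionDateTime, so nothing is filtered out. A minimal sketch of the same idea correlated on the invoice number instead is shown here. It assumes InvoiceNumber is a column of SubmissionInvoice and SubmissionDateTime is a column of Submission, which the question does not state.

```sql
-- Assumption: InvoiceNumber lives in SubmissionInvoice and
-- SubmissionDateTime lives in Submission (not confirmed by the question).
SELECT S.SubmissionID
      ,SI.InvoiceNumber
      ,S.SubmissionDateTime
FROM [dbo].[Submission] S
JOIN [dbo].[SubmissionInvoice] SI
  ON S.SubmissionID = SI.SubmissionID
WHERE S.SubmissionDateTime = (
    -- Latest submission datetime among all rows that share
    -- the outer row's invoice number.
    SELECT MAX(B.SubmissionDateTime)
    FROM [dbo].[Submission] B
    JOIN [dbo].[SubmissionInvoice] BI
      ON B.SubmissionID = BI.SubmissionID
    WHERE BI.InvoiceNumber = SI.InvoiceNumber);
```

Note that if two submissions for the same invoice share the same latest datetime, this form returns both rows.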
If your RDBMS supports it, you can use a window function such as row_number(). Here is an example:
SELECT SubmissionID
,BookingID
,InvoiceNumber
,InvoiceDate
,NumberofServices
,SubmissionDateTime
FROM (
    SELECT S.SubmissionID
    ,BookingID
    ,InvoiceNumber
    ,InvoiceDate
    ,NumberofServices
    ,SubmissionDateTime
    -- Number the rows within each invoice, newest submission first
    ,ROW_NUMBER() OVER (PARTITION BY InvoiceNumber ORDER BY SubmissionDateTime DESC) AS RN
    FROM [dbo].[Submission] S
    JOIN [dbo].[SubmissionInvoice] SI
    ON S.SubmissionID = SI.SubmissionID) t
-- RN = 1 keeps only the most recent submission per invoice
WHERE RN = 1
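To try the approach without touching the real tables, here is a small self-contained sketch. The table variables, column types, and sample rows are all hypothetical stand-ins, since the original sample data is not shown. It also adds SubmissionID as a tie-breaker in the ORDER BY, because row_number() otherwise picks an arbitrary row when two submissions for the same invoice have identical datetimes.

```sql
-- Hypothetical stand-ins for dbo.Submission and dbo.SubmissionInvoice;
-- column types and sample values are assumptions for illustration only.
DECLARE @Submission TABLE (
    SubmissionID       int PRIMARY KEY,
    SubmissionDateTime datetime2
);
DECLARE @SubmissionInvoice TABLE (
    SubmissionID  int,
    InvoiceNumber varchar(20)
);

INSERT INTO @Submission VALUES
    (1, '2020-01-01T09:00:00'),
    (2, '2020-01-02T09:00:00'),
    (3, '2020-01-03T09:00:00');

INSERT INTO @SubmissionInvoice VALUES
    (1, 'INV-1'),
    (2, 'INV-1'),
    (3, 'INV-2');

SELECT SubmissionID, InvoiceNumber, SubmissionDateTime
FROM (
    SELECT S.SubmissionID
          ,SI.InvoiceNumber
          ,S.SubmissionDateTime
          -- Newest first within each invoice; SubmissionID breaks ties
          ,ROW_NUMBER() OVER (PARTITION BY SI.InvoiceNumber
                              ORDER BY S.SubmissionDateTime DESC,
                                       S.SubmissionID DESC) AS RN
    FROM @Submission S
    JOIN @SubmissionInvoice SI
      ON S.SubmissionID = SI.SubmissionID
) t
WHERE RN = 1;
```

With this sample data, the query keeps submission 2 for INV-1 and submission 3 for INV-2. Run the whole script as one batch, since table variables only exist within the batch that declares them.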