Group transactions based on limit rules

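I need to write a SQL script that groups each client's transactions according to the following rules:

  1. Each group is limited to 500000 transactions or 365 days, whichever comes first

  2. I have three columns

    transaction_id
    transaction_date
    client_id

  3. The result must show

    group_id
    client_id
    number_of_transaction_in_group
    group_start_date
    group_end_date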
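
One way to read these rules is as a greedy, sequential pass over each client's transactions: a group stays open until it holds the maximum number of rows, or until a transaction arrives 365 or more days after the group's first transaction. The Python sketch below only illustrates that reading; it is not SQL and not a confirmed solution. The function name `group_transactions`, the tuple layout, and the choice to anchor the 365-day window at the first transaction of each group are all assumptions.

```python
from datetime import date, timedelta
from itertools import groupby


def group_transactions(rows, max_rows=500000, max_days=365):
    """Greedy grouping sketch (hypothetical helper, not from the original post).

    rows: iterable of (transaction_id, transaction_date, client_id) tuples.
    Returns one dict per group with the columns listed in rule 3.
    """
    groups = []
    # Order by client, then date, then id so that ties on the date are deterministic.
    ordered = sorted(rows, key=lambda r: (r[2], r[1], r[0]))
    for client_id, txns in groupby(ordered, key=lambda r: r[2]):
        current = None
        group_id = -1  # group ids start at 0 for every client
        for _tid, tdate, _cid in txns:
            # Assumption: the 365-day window starts at the group's first transaction.
            starts_new = (
                current is None
                or current["number_of_transaction_in_group"] >= max_rows
                or tdate >= current["group_start_date"] + timedelta(days=max_days)
            )
            if starts_new:
                group_id += 1
                current = {
                    "group_id": group_id,
                    "client_id": client_id,
                    "number_of_transaction_in_group": 0,
                    "group_start_date": tdate,
                    "group_end_date": tdate,
                }
                groups.append(current)
            current["number_of_transaction_in_group"] += 1
            current["group_end_date"] = tdate
    return groups


if __name__ == "__main__":
    # Made-up sample rows; max_rows=2 stands in for 500000 to keep the demo small.
    sample = [
        (1, date(2022, 8, 14), "8"),
        (2, date(2023, 8, 13), "8"),
        (3, date(2023, 8, 14), "8"),
    ]
    for group in group_transactions(sample, max_rows=2):
        print(group)
```

Because each group's start depends on where the previous group ended, the grouping is inherently sequential, which is why two independently computed bucket numbers cannot simply be combined.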

I tried this script from ChatGPT, but it does not return the correct result. It returns:

ClientID  GroupID  GroupTransactions  GroupStartDate  GroupEndDate
8         0        33101              2022-08-14      2023-08-13
8         1        966899             2023-08-14      2024-05-07
8         2        500000             2024-05-07      2024-08-12
8         3        417142             2024-08-12      2024-11-27

The expected result should look like this:

ClientID  GroupID  GroupTransactions  GroupStartDate  GroupEndDate
8         0        33101              2022-08-14      2023-08-13
8         1        500000             2023-08-14      2024-05-07
8         2        500000             2024-05-07      2024-08-12
8         3        417142             2024-08-12      2025-08-12
8         4        300000             2025-08-13      2026-08-12

Code:

WITH NumberedTransactions AS 
(
   -- Assign a row number for each transaction per client, ordered by date
   SELECT 
       ClientId,
       TransactionId,
       TransactionDate,
       ROW_NUMBER() OVER (PARTITION BY ClientId ORDER BY TransactionDate) AS RowNum
   FROM 
       Transactions
),
GroupsByTransactionCount AS 
(
   -- Group transactions into sets of 5 based on RowNum
   SELECT
       ClientId,
       TransactionId,
       TransactionDate,
       (RowNum - 1) / 5 AS TransactionGroup
   FROM 
       NumberedTransactions
),
GroupsByDate AS  
(
   -- Assign a start date for each 365-day window for each client
   SELECT
       ClientId,
       TransactionId,
       TransactionDate,
       DATEDIFF(DAY, MIN(TransactionDate) OVER (PARTITION BY ClientId), TransactionDate) / 365 AS DateGroup
   FROM 
       NumberedTransactions
),
FinalGroups AS 
(
   -- Combine both grouping methods into one
   SELECT
       c.ClientId,
       c.TransactionId,
       c.TransactionDate,
       c.TransactionGroup,
       d.DateGroup,
       -- Use the larger group number to ensure both conditions are met
       CASE 
           WHEN c.TransactionGroup >= d.DateGroup 
               THEN c.TransactionGroup
           ELSE d.DateGroup
       END AS FinalGroup
   FROM
       GroupsByTransactionCount AS c
   INNER JOIN 
       GroupsByDate AS d ON c.ClientId = d.ClientId
                        AND c.TransactionId = d.TransactionId
)
SELECT 
    ClientId,
    FinalGroup,
    COUNT(*) AS TransactionsInGroup,
    MIN(TransactionDate) AS GroupStartDate,
    MAX(TransactionDate) AS GroupEndDate
FROM 
    FinalGroups
GROUP BY 
    ClientId, FinalGroup
ORDER BY 
    ClientId, FinalGroup;

Tags: sql, sql-server, stored-procedures, view, sql-server-2012

1 Answer

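Here is an attempt to demonstrate 2 use cases; both clients have 6 transactions:

  • Client 1 has some transactions that are too old to be considered.
  • Client 2 has too many rows.

The row limit below is 5 (instead of 500,000; change the limit when running against real data). Note that RowNum is ordered by date descending so that the "latest" rows are retained.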

CREATE TABLE Transactions (
    transaction_id  INT,
    transaction_date    Date,
    client_id   VARCHAR(512)
);

INSERT INTO Transactions (transaction_id, transaction_date, client_id) VALUES
  -- only 3 rows within the 365 days
    ('1', '2022-12-12', '1'), -- too old
    ('2', '2023-02-02', '1'), -- too old
    ('3', '2023-04-04', '1'),  -- too old
    ('4', '2023-10-13', '1'),
    ('5', '2023-11-11', '1'),
    ('6', '2024-10-12', '1'), -- most recent

  -- all rows within 365 days, but limit to 5 rows
    ('7', '2024-01-12', '2'), 
    ('8', '2024-02-02', '2'),
    ('9', '2024-04-04', '2'),
    ('10', '2024-07-12', '2'),
    ('11', '2024-09-01', '2'),
    ('12', '2024-10-12', '2'); -- most recent


SELECT
    client_id
  , dateadd(day,-365,Max(transaction_date)) as min_date
  , Max(transaction_date) as most_recent_date
  , count(*) as rows_in_group
FROM Transactions
GROUP BY client_id

client_id  min_date    most_recent_date  rows_in_group
1          2023-10-13  2024-10-12        6
2          2023-10-13  2024-10-12        6

select
  d.*
from (
    select
         t.client_id, t.transaction_date, g.min_date
       , ROW_NUMBER() OVER (PARTITION BY t.Client_Id ORDER BY t.Transaction_Date DESC) AS RowNum
    from Transactions as t
    inner join (
           SELECT
               client_id
             , dateadd(day,-365,Max(transaction_date)) as min_date
           FROM Transactions
           GROUP BY client_id
           ) as g on t.client_id = g.client_id
      ) as d
where d.RowNum <= 5 and transaction_date >= min_date

client_id  transaction_date  min_date    RowNum
1          2024-10-12        2023-10-13  1
1          2023-11-11        2023-10-13  2
1          2023-10-13        2023-10-13  3
2          2024-10-12        2023-10-13  1
2          2024-09-01        2023-10-13  2
2          2024-07-12        2023-10-13  3
2          2024-04-04        2023-10-13  4
2          2024-02-02        2023-10-13  5
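
For readers who find the filter easier to follow procedurally, here is a rough Python sketch of the same idea: per client, number the rows from newest to oldest, then keep a row only if its number is within the limit and its date is within 365 days of that client's most recent transaction. The function name `latest_rows` and the tuple layout are hypothetical; the sample rows mirror the INSERT statement above.

```python
from collections import defaultdict
from datetime import date, timedelta

# Same rows as the INSERT above: (transaction_id, transaction_date, client_id).
SAMPLE = [
    (1, date(2022, 12, 12), "1"), (2, date(2023, 2, 2), "1"),
    (3, date(2023, 4, 4), "1"), (4, date(2023, 10, 13), "1"),
    (5, date(2023, 11, 11), "1"), (6, date(2024, 10, 12), "1"),
    (7, date(2024, 1, 12), "2"), (8, date(2024, 2, 2), "2"),
    (9, date(2024, 4, 4), "2"), (10, date(2024, 7, 12), "2"),
    (11, date(2024, 9, 1), "2"), (12, date(2024, 10, 12), "2"),
]


def latest_rows(rows, max_rows=5, max_days=365):
    """Return (client_id, transaction_date, min_date, row_num) for kept rows."""
    by_client = defaultdict(list)
    for tid, tdate, cid in rows:
        by_client[cid].append((tdate, tid))

    kept = []
    for cid in sorted(by_client):
        # Newest first, like ORDER BY transaction_date DESC in the query.
        txns = sorted(by_client[cid], reverse=True)
        # Equivalent of dateadd(day, -365, max(transaction_date)).
        min_date = txns[0][0] - timedelta(days=max_days)
        for row_num, (tdate, _tid) in enumerate(txns, start=1):
            if row_num <= max_rows and tdate >= min_date:
                kept.append((cid, tdate, min_date, row_num))
    return kept


if __name__ == "__main__":
    for row in latest_rows(SAMPLE):
        print(row)
```

As with the query, this keeps only the most recent window per client; it does not split the older rows into further groups.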

fiddle
