SQL Server recursive CTE - get all relatives of all relatives


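I have some data with IDs that I need to identify as related. I can't work out how to identify relatives of relatives (and relatives of those relatives, and so on).

*Part of the problem is that the data isn't actually hierarchical. Records are identified as related based on matches on various fields (driver's license #, SSN, etc.). I laid out the data so that the Parent is always the lower ID #, so that we don't run into any infinite loops in the recursion.

Sample data: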

drop table if exists #AllSimilarMinParent 
create table #AllSimilarMinParent (ParentId bigint, ChildId bigint)
insert into #AllSimilarMinParent (ParentId, ChildId) values (10, 20)
insert into #AllSimilarMinParent (ParentId, ChildId) values (20, 30)
insert into #AllSimilarMinParent (ParentId, ChildId) values (30, 40)
insert into #AllSimilarMinParent (ParentId, ChildId) values (40, 50)
insert into #AllSimilarMinParent (ParentId, ChildId) values (39, 40)
insert into #AllSimilarMinParent (ParentId, ChildId) values (39, 51)
insert into #AllSimilarMinParent (ParentId, ChildId) values (49, 51)
insert into #AllSimilarMinParent (ParentId, ChildId) values (49, 61)
insert into #AllSimilarMinParent (ParentId, ChildId) values (59, 61)
insert into #AllSimilarMinParent (ParentId, ChildId) values (59, 71)

I am using a recursive CTE to get the hierarchy for each ID:

WITH RelationHierarchy as (
    SELECT  ChildId,  ParentId
    FROM #AllSimilarMinParent MP
)
, RCTE AS
(
    --recursive CTE to generate hierarchy
    SELECT  ParentId, ChildId, 1 AS Lvl FROM RelationHierarchy 

    UNION ALL

    SELECT rh.ParentId, rc.ChildId, Lvl+1 AS Lvl 
    FROM RelationHierarchy rh
        INNER JOIN RCTE rc ON rh.ChildId = rc.ParentId
)
select F0.ParentId, F0.ChildId, F0.Lvl FROM RCTE F0 ORDER BY ChildId, ParentId

This returns the following:

ParentId ChildId Lvl
10 20 1
10 30 2
20 30 1
10 40 3
20 40 2
30 40 1
39 40 1
10 50 4
20 50 3
30 50 2
39 50 2
40 50 1
39 51 1
49 51 1
49 61 1
59 61 1
59 71 1
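
To see why this result stops short, here is a minimal Python sketch (not part of the original query; `ancestor_pairs` is a hypothetical helper) that mimics the recursive CTE on the sample data. It only ever moves from a row's parent to that parent's own parent, so any pair that needs a step back down to a child is never produced.

```python
# Edge list copied from the sample data: (ParentId, ChildId).
EDGES = [(10, 20), (20, 30), (30, 40), (40, 50), (39, 40),
         (39, 51), (49, 51), (49, 61), (59, 61), (59, 71)]

def ancestor_pairs(edges):
    """Mimic the recursive CTE: start from every edge, then keep
    replacing the parent with that parent's own parent."""
    rows = [(p, c, 1) for p, c in edges]            # anchor member
    frontier = rows
    while frontier:                                  # recursive member
        frontier = [(gp, c, lvl + 1)
                    for (p, c, lvl) in frontier
                    for (gp, mid) in edges if mid == p]
        rows.extend(frontier)
    return rows

rows = ancestor_pairs(EDGES)
pairs = {(p, c) for p, c, _ in rows}
print(len(rows))            # 17, same row count as the table above
print((10, 50) in pairs)    # True: reachable by walking up only
print((10, 71) in pairs)    # False: needs a mix of up and down steps
```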

I also tried joining the CTE to itself to get all relatives of relatives:

        select F0.ParentId, F0.ChildId, F0.Lvl FROM RCTE F0
        UNION
        select F1.ChildId, F2.ChildId, -1 FROM RCTE F1 JOIN RCTE F2 ON F1.ParentId = F2.ParentId AND F1.ChildId <> F2.ChildId
        UNION
        select F2.ParentId, F1.ParentId, -2 FROM RCTE F1 JOIN RCTE F2 ON F1.ChildId = F2.ChildId AND F1.ParentId <> F2.ParentId

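..which helps to a degree, but I am still not getting every relationship I am looking for. For example, I can't get 10 and 71 related. The way this particular data is related, it seems I would have to go up (to parents) and down (to children) multiple times to get from:

71 up to 59

59 down to 61

61 up to 49

49 down to 51

51 up to 39

39 down to 40

40 up to 30

30 up to 20

20 up to 10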

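I thought about using the results of the first recursive CTE in another recursive CTE, but the fact that it has to go up and down arbitrarily (i.e. up then down, rather than consistently in one direction or the other) has me stumped.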
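
One way to picture the up-and-down walk is to stop treating the rows as parent/child and treat each one as a two-way link instead. The following is a minimal Python sketch under that assumption (the names `build_neighbors` and `find_path` are hypothetical, and this is an illustration of the idea rather than a T-SQL solution). A plain breadth-first search then finds the route from 71 to 10 without caring which steps go up and which go down.

```python
from collections import deque

# Edge list copied from the sample data: (ParentId, ChildId).
EDGES = [(10, 20), (20, 30), (30, 40), (40, 50), (39, 40),
         (39, 51), (49, 51), (49, 61), (59, 61), (59, 71)]

def build_neighbors(edges):
    """Store every link in both directions."""
    neighbors = {}
    for p, c in edges:
        neighbors.setdefault(p, set()).add(c)   # a "down" step
        neighbors.setdefault(c, set()).add(p)   # an "up" step
    return neighbors

def find_path(edges, start, goal):
    """Breadth-first search; returns a list of IDs or None."""
    neighbors = build_neighbors(edges)
    came_from = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in came_from:
                came_from[nxt] = node
                queue.append(nxt)
    if goal not in came_from:
        return None
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = came_from[node]
    return path[::-1]

print(find_path(EDGES, 71, 10))
# [71, 59, 61, 49, 51, 39, 40, 30, 20, 10]
```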

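Any ideas on how to explicitly return every permutation of related records? Every ID in this particular data set should be related to every other ID.

*Edit - below is the desired output, showing every ID related to every other ID

ParentId ChildId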
10 20
10 30
10 39
10 40
10 49
10 50
10 51
10 59
10 61
10 71
20 30
20 39
20 40
20 49
20 50
20 51
20 59
20 61
20 71
30 39
30 40
30 49
30 50
30 51
30 59
30 61
30 71
39 40
39 49
39 50
39 51
39 59
39 61
39 71
40 49
40 50
40 51
40 59
40 61
40 71
49 50
49 51
49 59
49 61
49 71
50 51
50 59
50 61
50 71
51 59
51 61
51 71
59 61
59 71
61 71
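
The desired output above amounts to "every pair of IDs that end up in the same connected group". As a hedged illustration, here is a minimal Python sketch that builds the groups with a union-find structure and then lists the pairs with the lower ID first. The helpers `find` and `related_pairs` are hypothetical names, and this shows the idea only; it is not a drop-in replacement for a T-SQL query.

```python
from itertools import combinations

# Edge list copied from the sample data: (ParentId, ChildId).
EDGES = [(10, 20), (20, 30), (30, 40), (40, 50), (39, 40),
         (39, 51), (49, 51), (49, 61), (59, 61), (59, 71)]

def find(parent, x):
    """Return the root of x's group, shortening the chain as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def related_pairs(edges):
    parent = {}
    for a, b in edges:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)   # lowest ID stays the root
    groups = {}
    for node in parent:
        groups.setdefault(find(parent, node), []).append(node)
    pairs = []
    for members in groups.values():
        pairs.extend(combinations(sorted(members), 2))
    return sorted(pairs)

pairs = related_pairs(EDGES)
print(len(pairs))            # 55 rows, one per pair of the 11 IDs
print(pairs[0], pairs[-1])   # (10, 20) (61, 71)
```

Each edge is handled once, so the grouping step grows roughly linearly with the number of rows; the pair listing itself is still quadratic in the size of each group.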
1 Answer

I was able to adapt the answer that siggemannen posted above:

How can I combine group identifiers into single group

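..and it does give the expected results. However, the performance is unworkable. For the 11 records in my sample data, it took 11 seconds. I added 11 more records that had no relation to my original 11, and the query took over a minute. When I added 1 record relating one record in the first group to one record in the second group (meaning all 22 IDs are now related to each other in some way), the query had been running for over half an hour. In my real-world application, I expect to need to process tens of thousands of records.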
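
One plausible explanation for the slowdown (I have not profiled it, so treat this as a guess) is that the `rec` CTE in the query below extends a path to every ID not yet in its `visited` list, which means it enumerates paths rather than just reachable IDs. The Python sketch below counts such non-revisiting paths under the simplifying assumption that every ID in a group is linked to every other one; `simple_path_count` is a hypothetical helper. The real edge set is sparser than that, so the numbers are an upper-bound intuition only.

```python
from math import perm

def simple_path_count(n):
    """Paths with at least one edge that never revisit a node,
    in a graph where all n nodes are linked to each other."""
    # From a fixed start there are perm(n - 1, k) ways to pick and
    # order k further nodes, for k = 1 .. n - 1.
    per_start = sum(perm(n - 1, k) for k in range(1, n))
    return n * per_start

for n in (4, 11, 22):
    print(n, simple_path_count(n))
# n = 4 gives 60 and n = 11 gives 108505100
```

The count grows factorially with the group size, which would be consistent with a run time that explodes once two groups of 11 are joined into one group of 22.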

(Working but slow) solution:

--create relations
    drop table if exists #AllSimilarMinParent 
    create table #AllSimilarMinParent (ParentId bigint, ChildId bigint)
    insert into #AllSimilarMinParent (ParentId, ChildId) values (10, 20)
    insert into #AllSimilarMinParent (ParentId, ChildId) values (20, 30)
    insert into #AllSimilarMinParent (ParentId, ChildId) values (30, 40)
    insert into #AllSimilarMinParent (ParentId, ChildId) values (40, 50)
    insert into #AllSimilarMinParent (ParentId, ChildId) values (39, 40)
    insert into #AllSimilarMinParent (ParentId, ChildId) values (39, 51)
    insert into #AllSimilarMinParent (ParentId, ChildId) values (49, 51)
    insert into #AllSimilarMinParent (ParentId, ChildId) values (49, 61)
    insert into #AllSimilarMinParent (ParentId, ChildId) values (59, 61)
    insert into #AllSimilarMinParent (ParentId, ChildId) values (59, 71)

--build recursive CTE and put results into temp table
    drop table if exists #RCTE
    ;
    WITH RelationHierarchy as (
        SELECT  ChildId,  ParentId
        FROM #AllSimilarMinParent MP
    )
    , RCTE AS
    (
        --recursive CTE to generate hierarchy
        SELECT  ParentId, ChildId, 1 AS Lvl FROM RelationHierarchy 

        UNION ALL

        SELECT rh.ParentId, rc.ChildId, Lvl+1 AS Lvl 
        FROM RelationHierarchy rh
            INNER JOIN RCTE rc ON rh.ChildId = rc.ParentId
    )
        select F0.ParentId, F0.ChildId, F0.Lvl 
        INTO #RCTE 
        FROM RCTE F0
        select * FROM #RCTE

--create all combinations (both directions)
    drop table if exists #mytable2
    create table #mytable2 (id bigint, groupid bigint) --bigint to match the ParentId/ChildId columns
    insert into #mytable2 (id, groupid) 
    select ParentId, ChildId FROM #RCTE 
    UNION
    select ChildId, ParentId FROM #RCTE 
    UNION
    select ParentId, ParentId FROM #RCTE --also needed every ID to belong to itself as a group
    UNION
    select ChildId, ChildId FROM #RCTE  --also needed every ID to belong to itself as a group


--suggested solution adapted from https://stackoverflow.com/questions/76272634/how-can-i-combine-group-identifiers-into-single-group?answertab=scoredesc#tab-top
    ;with 
        uniquenodes as (select distinct id from #mytable2)
        , nodes as (
            select t.id, v.grp
            from uniquenodes t
            cross apply ( select groupid from #mytable2 t1 where t1.id = t.id ) v(grp)
        )
        ,
        edges as (
            select distinct n1.id as id1, n2.id as id2
            from nodes n1
            inner join nodes n2 on n1.grp = n2.grp
        )
        ,
        rec as (
            select id1, id2, cast(id1 as nvarchar(max)) as visited from edges
            union all
            select r.id1, e.id2, concat(r.visited, ',', e.id2)
            from rec r
            inner join edges e on e.id1 = r.id2
            where concat(',', r.visited, ',') not like concat('%,', e.id2, ',%')
        )
        ,
        fin as (
            select id1, min(value) min_id
            from rec r
            cross apply string_split(r.visited, ',')
            group by id1
        )
    select id1 as id, dense_rank() over(order by min_id) grp
    from fin f

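It puts all 11 IDs in group 1, which does meet my needs. Initial testing shows that manipulating the IDs gives the expected results (if I orphan an ID or add new IDs that are only related to each other, a second group is created), but I haven't done much testing because of the performance problems.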
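
For comparison, the same grouping can be reached without enumerating paths at all. The sketch below is a minimal Python illustration of that idea, not something I have run against SQL Server: every ID starts with its own ID as a label, and each pass copies the lower label across every link until nothing changes. `assign_groups` is a hypothetical name. In principle the same loop could be written as repeated set-based UPDATE statements against a temp table, but that translation is an assumption and is untested here.

```python
# Edge list copied from the sample data: (ParentId, ChildId).
EDGES = [(10, 20), (20, 30), (30, 40), (40, 50), (39, 40),
         (39, 51), (49, 51), (49, 61), (59, 61), (59, 71)]

def assign_groups(edges):
    """Give every ID the lowest ID reachable from it, then number
    the distinct labels 1, 2, 3, ... in ascending order."""
    label = {}
    for a, b in edges:
        label.setdefault(a, a)
        label.setdefault(b, b)
    changed = True
    while changed:                        # repeat until a pass changes nothing
        changed = False
        for a, b in edges:
            low = min(label[a], label[b])
            if label[a] != low or label[b] != low:
                label[a] = label[b] = low
                changed = True
    ranks = {m: i for i, m in enumerate(sorted(set(label.values())), start=1)}
    return {node: ranks[label[node]] for node in sorted(label)}

groups = assign_groups(EDGES)
print(set(groups.values()))               # {1}: all 11 IDs share one group

# IDs that are only related to each other form a second group.
more = assign_groups(EDGES + [(100, 110), (110, 120)])
print(more[100], more[110], more[120])    # 2 2 2
```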
