在 SQL Server 中生成排列的最优雅方法

问题描述 投票:0回答:12

给出下表:

Index | Element
---------------
  1   |    A
  2   |    B
  3   |    C
  4   |    D

我们希望使用元素生成所有可能的排列(不重复)。 最终结果(跳过一些行)将如下所示:

  Results
----------
   ABCD
   ABDC
   ACBD
   ACDB
   ADAC
   ADCA

   ...

   DABC
   DACB
   DBCA
   DBAC
   DCAB
   DCBA

  (24 Rows)

你会怎么做?

sql sql-server t-sql permutation
12个回答
12
投票

在发表了一些也许尖刻的评论之后,这个问题整个晚上都在我的脑海里盘旋,我最终想出了以下基于集合的方法。我相信它绝对符合“优雅”的标准,但我也认为它符合“有点愚蠢”的标准。 你拨打电话。

首先,设置一些表格:

--  For testing purposes
DROP TABLE Source
DROP TABLE Numbers
DROP TABLE Results


--  Add as many rows as need be processed--though note that you get N! (number of rows, factorial) results,
--  and that gets big fast. The Identity column must start at 1, or the algorithm will have to be adjusted.
--  Element could be more than char(1), though the algorithm would have to be adjusted again, and each element
--  must be the same length.
CREATE TABLE Source
 (
   SourceId  int      not null  identity(1,1)
  ,Element   char(1)  not null
 )

INSERT Source (Element) values ('A')
INSERT Source (Element) values ('B')
INSERT Source (Element) values ('C')
INSERT Source (Element) values ('D')
--INSERT Source (Element) values ('E')
--INSERT Source (Element) values ('F')


--  This is a standard Tally table (or "table of numbers")
--  It only needs to be as long as there are elements in table Source
CREATE TABLE Numbers (Number int not null)
INSERT Numbers (Number) values (1)
INSERT Numbers (Number) values (2)
INSERT Numbers (Number) values (3)
INSERT Numbers (Number) values (4)
INSERT Numbers (Number) values (5)
INSERT Numbers (Number) values (6)
INSERT Numbers (Number) values (7)
INSERT Numbers (Number) values (8)
INSERT Numbers (Number) values (9)
INSERT Numbers (Number) values (10)


--  Results are iteratively built here. This could be a temp table. An index on "Length" might make runs
--  faster for large sets.  Combo must be at least as long as there are characters to be permuted.
CREATE TABLE Results
 (
   Combo   varchar(10)  not null
  ,Length  int          not null
 )

例程如下:

SET NOCOUNT on

DECLARE
  @Loop     int
 ,@MaxLoop  int


--  How many elements there are to process
SELECT @MaxLoop = max(SourceId)
 from Source


--  Initialize first value
TRUNCATE TABLE Results
INSERT Results (Combo, Length)
 select Element, 1
  from Source
  where SourceId = 1

SET @Loop = 2

--  Iterate to add each element after the first
WHILE @Loop <= @MaxLoop
 BEGIN

    --  See comments below. Note that the "distinct" remove duplicates, if a given value
    --  is to be included more than once
    INSERT Results (Combo, Length)
     select distinct
        left(re.Combo, @Loop - nm.Number)
        + so.Element
        + right(re.Combo, nm.Number - 1)
       ,@Loop
      from Results re
       inner join Numbers nm
        on nm.Number <= @Loop
       inner join Source so
        on so.SourceId = @Loop
      where re.Length = @Loop - 1

    --  For performance, add this in if sets will be large
    --DELETE Results
    -- where Length <> @Loop

    SET @Loop = @Loop + 1
 END

--  Show results
SELECT *
 from Results
 where Length = @MaxLoop
 order by Combo

总体思路是:当向任何字符串(例如“A”)添加新元素(例如“B”)时,要捕获所有排列,您需要添加 B 到所有可能的位置(Ba,aB),从而产生一组新的字符串。然后迭代:向字符串中的每个位置添加一个新元素 (C) (AB 变为 Cab、aCb、abC),对于所有字符串(Cba、bCa、baC),您就有了一组排列。迭代每个结果集 下一个角色,直到你用完角色......或资源。 10 个元素是 360 万种排列,使用上述算法大约为 48MB,14 个(唯一)元素将达到 870 亿种排列和 1.163 TB。

我确信它最终可能会被嵌入到 CTE 中,但最终这只是一个美化的循环。逻辑 这样就更清晰了,我不禁认为 CTE 执行计划将是一场噩梦。


9
投票
DECLARE @s VARCHAR(5);
SET @s = 'ABCDE';

WITH Subsets AS (
SELECT CAST(SUBSTRING(@s, Number, 1) AS VARCHAR(5)) AS Token,
CAST('.'+CAST(Number AS CHAR(1))+'.' AS VARCHAR(11)) AS Permutation,
CAST(1 AS INT) AS Iteration
FROM dbo.Numbers WHERE Number BETWEEN 1 AND 5
UNION ALL
SELECT CAST(Token+SUBSTRING(@s, Number, 1) AS VARCHAR(5)) AS Token,
CAST(Permutation+CAST(Number AS CHAR(1))+'.' AS VARCHAR(11)) AS
Permutation,
s.Iteration + 1 AS Iteration
FROM Subsets s JOIN dbo.Numbers n ON s.Permutation NOT LIKE
'%.'+CAST(Number AS CHAR(1))+'.%' AND s.Iteration < 5 AND Number
BETWEEN 1 AND 5
--AND s.Iteration = (SELECT MAX(Iteration) FROM Subsets)
)
SELECT * FROM Subsets
WHERE Iteration = 5
ORDER BY Permutation

Token Permutation Iteration
----- ----------- -----------
ABCDE .1.2.3.4.5. 5
ABCED .1.2.3.5.4. 5
ABDCE .1.2.4.3.5. 5
(snip)
EDBCA .5.4.2.3.1. 5
EDCAB .5.4.3.1.2. 5
EDCBA .5.4.3.2.1. 5

不久前首次发布这里

但是,最好使用更好的语言,例如 C# 或 C++。


9
投票

只需使用 SQL,无需任何代码,如果您可以将另一列插入表中,您就可以做到这一点。显然,您需要为每个要排列的值创建一个连接表。

with llb as (
  select 'A' as col,1 as cnt union 
  select 'B' as col,3 as cnt union 
  select 'C' as col,9 as cnt union 
  select 'D' as col,27 as cnt
) 
select a1.col,a2.col,a3.col,a4.col
from llb a1
cross join llb a2
cross join llb a3
cross join llb a4
where a1.cnt + a2.cnt + a3.cnt + a4.cnt = 40

2
投票

我是否正确理解您构建了笛卡尔积 n x n x n x n,然后过滤掉不需要的东西?另一种方法是生成直到 n! 的所有数字!然后使用阶乘系统通过元素编码来映射它们。


1
投票

比递归 CTE 更简单:

declare @Number Table( Element varchar(MAX), Id varchar(MAX) )
Insert Into @Number Values ( 'A', '01')
Insert Into @Number Values ( 'B', '02')
Insert Into @Number Values ( 'C', '03')
Insert Into @Number Values ( 'D', '04')

select a.Element, b.Element, c.Element, d.Element
from @Number a
join @Number b on b.Element not in (a.Element)
join @Number c on c.Element not in (a.Element, b.Element)
join @Number d on d.Element not in (a.Element, b.Element, c.Element)
order by 1, 2, 3, 4

对于任意数量的元素,将其编写为脚本:

if object_id('tempdb..#number') is not null drop table #number
create table #number (Element char(1), Id int, Alias as '_'+convert(varchar,Id))
insert #number values ('A', 1)
insert #number values ('B', 2)
insert #number values ('C', 3)
insert #number values ('D', 4)
insert #number values ('E', 5)

declare @sql nvarchar(max)
set @sql = '
select '+stuff((
  select char(13)+char(10)+'+'+Alias+'.Element'
  from #number order by Id for xml path (''), type
  ).value('.','NVARCHAR(MAX)'),3,1,' ')

set @sql += '
from #number '+(select top 1 Alias from #number order by Id)

set @sql += (
  select char(13)+char(10)+'join #number '+Alias+' on '+Alias+'.Id not in ('
    +stuff((
      select ', '+Alias+'.Id'
      from #number b where a.Id > b.Id
      order by Id for xml path ('')
      ),1,2,'')
    + ')'
  from #number a where Id > (select min(Id) from #number)
  order by Element for xml path (''), type
  ).value('.','NVARCHAR(MAX)')

set @sql += '
order by 1'

print @sql
exec (@sql)

生成此:

select 
 _1.Element
+_2.Element
+_3.Element
+_4.Element
+_5.Element
from #number _1
join #number _2 on _2.Id not in (_1.Id)
join #number _3 on _3.Id not in (_1.Id, _2.Id)
join #number _4 on _4.Id not in (_1.Id, _2.Id, _3.Id)
join #number _5 on _5.Id not in (_1.Id, _2.Id, _3.Id, _4.Id)
order by 1

1
投票

此方法使用二进制掩码来选择正确的行:

;with src(t,n,p) as (
select element, index, power(2,index-1)
from table
)
select s1.t+s2.t+s3.t+s4.t
from src s1, src s2, src s3, src s4
where s1.p+s2.p+s3.p+s4.p=power(2,4)-1

我的原帖:

declare @t varchar(4) = 'ABCD'

;with src(t,n,p) as (
select substring(@t,1,1),1,power(2,0)
union all
select substring(@t,n+1,1),n+1,power(2,n)
from src
where n < len(@t)
)
select s1.t+s2.t+s3.t+s4.t
from src s1, src s2, src s3, src s4
where s1.p+s2.p+s3.p+s4.p=power(2,len(@t))-1

这是困扰您的问题之一。我喜欢我原来答案的简单性,但有一个问题,我仍在构建所有可能的解决方案,然后选择正确的解决方案。 另一种尝试是通过仅构建正确的解决方案来提高此过程的效率,得到了这个答案。 仅当字符串中不存在该字符时才将该字符添加到字符串中。 Patindex 似乎是 CTE 解决方案的完美伴侣。 在这里。

declare @t varchar(10) = 'ABCDEFGHIJ'

;with s(t,n) as (
select substring(@t,1,1),1
union all
select substring(@t,n+1,1),n+1
from s where n<len(@t)
)
,j(t) as (
select cast(t as varchar(10)) from s
union all
select cast(j.t+s.t as varchar(10))
from j,s where patindex('%'+s.t+'%',j.t)=0
)
select t from j where len(t)=len(@t)

我能够在 3 分 2 秒内构建全部 360 万个解决方案。希望这个解决方案不会因为它不是第一个而被错过。


0
投票

当前使用递归 CTE 的解决方案。

-- The base elements
Declare @Number Table( Element varchar(MAX), Id varchar(MAX) )
Insert Into @Number Values ( 'A', '01')
Insert Into @Number Values ( 'B', '02')
Insert Into @Number Values ( 'C', '03')
Insert Into @Number Values ( 'D', '04')

-- Number of elements
Declare @ElementsNumber int
Select  @ElementsNumber = COUNT(*)
From    @Number;



-- Permute!
With Permutations(   Permutation,   -- The permutation generated
                     Ids,            -- Which elements where used in the permutation
                     Depth )         -- The permutation length
As
(
    Select  Element,
            Id + ';',
            Depth = 1
    From    @Number
    Union All
    Select  Permutation + ' ' + Element,
            Ids + Id + ';',
            Depth = Depth + 1
    From    Permutations,
            @Number
    Where   Depth < @ElementsNumber And -- Generate only the required permutation number
            Ids Not like '%' + Id + ';%' -- Do not repeat elements in the permutation (this is the reason why we need the 'Ids' column) 
)
Select  Permutation
From    Permutations
Where   Depth = @ElementsNumber

0
投票

假设您的表名为 Elements 并且有 4 行,这很简单:

select e1.Element + e2.Element + e3.Element + e4.Element
from Elements e1
    join Elements e2 on e2.Element != e1.Element 
    join Elements e3 on e3.Element != e2.Element AND e3.Element != e1.Element 
    join Elements e4 on e4.Element != e3.Element AND e4.Element != e2.Element AND e4.Element != e1.Element

0
投票

我的 SQL 技能太生锈了,但我对类似的问题采取了不同的策略,并认为值得分享。

Table1 - X strings in a single field Uno
Table2 - Y strings in a single field Dos

(SELECT Uno, Dos
FROM Table1
CROSS JOIN Table2 ON 1=1)
    UNION
(SELECT  Dos, Uno
FROM Table1
CROSS JOIN Table2 ON 1=1)

添加 CROSS JOIN 的 3 个表的原理相同

(SELECT  Tres, Uno, Dos
FROM Table1
CROSS JOIN Table2 ON 1=1
    CROSS JOIN Table3 ON 1=1)

虽然联合体中需要6个交叉连接集。


0
投票

--希望这是一个快速的解决方案,只需更改 #X 中的值

IF OBJECT_ID('tempdb.dbo.#X', 'U') IS NOT NULL  DROP TABLE #X; CREATE table #X([Opt] [nvarchar](10) NOT NULL)
Insert into #X values('a'),('b'),('c'),('d')
declare @pSQL NVarChar(max)='select * from #X X1 ', @pN int =(select count(*) from #X), @pC int = 0;
while @pC<@pN begin
if @pC>0 set  @pSQL = concat(@pSQL,' cross join #X X', @pC+1);
set @pC = @pC +1;
end
execute(@pSQL)

--或作为单列结果

IF OBJECT_ID('tempdb.dbo.#X', 'U') IS NOT NULL  DROP TABLE #X; CREATE table #X([Opt] [nvarchar](10) NOT NULL)
Insert into #X values('a'),('b'),('c'),('d')
declare @pSQL NVarChar(max)=' as R from #X X1 ',@pSelect NVarChar(Max)=' ',@pJoin NVarChar(Max)='', @pN int =(select count(*) from #X), @pC int = 0;
while @pC<@pN begin
if @pC>0 set  @pJoin = concat(@pJoin ,' cross join #X X', @pC+1) set @pSelect =  concat(@pSelect ,'+ X', @pC+1,'.Opt ')
set @pC = @pC +1;
end
set @pSQL = concat ('select X1.Opt', @pSelect,@pSQL ,@pJoin)
exec(@pSQL)

0
投票
create function GeneratePermutations (@string nvarchar(4000))
RETURNS @Permutations 
TABLE(
    name nVARCHAR(500)
  )
AS
begin
     declare @SplitedString table(name nvarchar(500))
 
     insert into @SplitedString
     select *
     from string_split(@string,' ')


    declare @CountOfWords as int

    set @CountOfWords = (select count(*) from @SplitedString) 

    ;with cte_Permutations (name, level) as (
       select convert(nvarchar(500), name), 1 as level from @SplitedString
       union all
       select convert(nvarchar(500),splited.name+','+cte_Permutations.name),level+1 
       from @SplitedString splited ,cte_Permutations
       where level < @CountOfWords
    )
    insert into @Permutations
    select  name
    from cte_Permutations
    where level = @CountOfWords
    order by name

    return  
end


select * 
From (
        select 1 id,'a b c' msg
        union all 
        select 2 id,'d e' msg
    ) p  
    cross apply dbo.GeneratePermutations(p.msg)

0
投票

未显示 2 个相同的项目。我所说的项目是指它包含“索引,元素”的独特组合,因此“1,A”与“2,A”不同。我的结果返回 24 行,其中有 4 个唯一元素。

为了回答最初的问题,我在下面的示例中使用了 1、A/2、B/3、C/4、D,但如果索引不同,则如果 2 个元素相同,则它会起作用。

提供的元素不包含公共 (,) 或正斜杠 (/) 应该没问题,我已使用 varchar(max) 进行 CTE 转换。

注意:i 是索引,X 是元素(可以是任意长度),H 是层次结构,以“/”作为分隔符。

;WITH CTE AS (
    SELECT t.i, X = CONVERT(varchar(MAX), t.X), H = CONVERT(VARCHAR(MAX), CONVERT(VARCHAR, i) + ',' + t.X), COUNT(1) OVER () AS N
    FROM (VALUES(1,'A'),(2,'B'),(3,'C'),(4,'D')) t(i,X)
), CTER AS (
    SELECT a.i, a.X, a.H, Level = 1, a.N
    FROM CTE a

    UNION ALL

    SELECT b.i, CONVERT(VARCHAR(MAX), a.X + b.X), CONVERT(VARCHAR(MAX), a.H + '/' + b.H), Level + 1, a.N
    FROM CTER a
        JOIN CTE b ON '/' + a.H + '/' NOT LIKE '%/' + b.H + '/%'
    WHERE a.Level < a.N
)
SELECT X
FROM CTER
WHERE Level = N
ORDER BY X
© www.soinside.com 2019 - 2024. All rights reserved.