如何优化递归函数来查找所有排列？

Question

我编写了以下代码来查找给定向量的所有排列：

perm <- function(v, r = NULL, P = NULL) {
  l <- length(v)
  if (l == 0) {
    P <- rbind(P, r)
    rownames(P) <- NULL
    P
  } else {
    for (i in 1:l) {
      new_r <- c(r, v[i])
      new_v <- v[-i]
      P <- perm(new_v, new_r, P)
    }
    P
  }
}

P <- perm(1:9) # takes "forever" yet e.g. perm(1:7) is quite fast!?!
P

它做了它应该做的事情，但问题是，如果使用长度> 8的向量（如上所述），它会永远运行。

我的问题我没有真正看到问题，我发现一些递归实现看起来不那么不同但效率更高......那么是否有一种简单的方法来优化代码以便它运行得更快？

Answer 1

正如@akrun所说，R的递归通常效率不高。但是，如果您必须有一个递归解决方案，那么请查看gtools::permutations。这是实施：

permGtools <- function(n, r, v) {
    if (r == 1) 
        matrix(v, n, 1)
    else if (n == 1) 
        matrix(v, 1, r)
    else {
        X <- NULL
        for (i in 1:n) X <- rbind(X, cbind(v[i], permGtools(n - 1, r - 1, v[-i])))
        X
    }
}

顺便说一下，要获取完整的源代码，只需在控制台中键入gtools::permutations并按Enter键即可。有关更多信息，请参阅How can I view the source code for a function?

以下是一些时间安排：

system.time(perm(1:8))
  user  system elapsed 
34.074  10.641  44.815

system.time(permGtools(8,8,1:8))
 user  system elapsed 
0.253   0.001   0.255

只是为了好的措施：

system.time(permGtools(9, 9, 1:9))
 user  system elapsed 
2.512   0.046   2.567

为什么OP的实现更慢？

如果您不阅读详细信息，请跳至摘要。

对于初学者，我们可以简单地看到OP的实现比gtools中的实现更多的递归调用。为了说明这一点，我们将count <<- count + 1L添加到每个函数的顶部（N.B.我们使用的是<<-赋值运算符，它首先搜索父环境）。例如：

permGtoolsCount <- function(n, r, v) {
    count <<- count + 1L
    if (r == 1)
    .
    .

现在我们测试一些长度：

iterationsOP <- sapply(4:7, function(x) {
    count <<- 0L
    temp <- permCount(1:x)
    count
})

iterationsOP
[1]    65   326  1957 13700

iterationsGtools <- sapply(4:7, function(x) {
    count <<- 0L
    temp <- permGtoolsCount(x, x, 1:x)
    count
})

iterationsGtools
[1]   41  206 1237 8660

如您所见，OP的实现在每种情况下都会进行更多调用。事实上，它使1.58...倍于递归调用的数量。

iterationsOP / iterationsGtools
[1] 1.585366 1.582524 1.582053 1.581986

正如我们已经说过的那样，R的递归声誉不佳。除了R does not employ tail-recursion之外，我找不到任何确切原因。

在这一点上，似乎很难相信大约1.58倍的递归调用可以解释我们在上面看到的175倍的速度（即44.815 / 0.255 ~= 175）。

我们可以使用Rprof对代码进行分析，以便收集更多信息：

Rprof("perm.out", memory.profiling = TRUE)
a1 <- perm(1:8)
Rprof(NULL)
summaryRprof("perm.out", memory = "both")$by.total
             total.time total.pct mem.total self.time self.pct
"perm"            43.42    100.00   15172.1      0.58     1.34
"rbind"           22.50     51.82    7513.7     22.50    51.82
"rownames<-"      20.32     46.80    7388.7     20.30    46.75
"c"                0.02      0.05      23.7      0.02     0.05
"length"           0.02      0.05       0.0      0.02     0.05



Rprof("permGtools.out", memory.profiling = TRUE)
a2 <- permGtools(8, 8, 1:8)
Rprof(NULL)
summaryRprof("permGtools.out", memory = "tseries")$by.total
             total.time total.pct mem.total self.time self.pct
"rbind"            0.34    100.00     134.8      0.18    52.94
"cbind"            0.34    100.00     134.8      0.08    23.53
"permGtools"       0.34    100.00     134.8      0.06    17.65
"matrix"           0.02      5.88       0.0      0.02     5.88

立即跳出的一件事（除了时间之外）是OP实现的大量内存使用。 OP的实现使用大约15 Gb的内存，而gtools实现仅使用134 Mb。

深层发掘

在上面，我们只是通过将memory参数设置为both来查看一般视图中的内存使用情况。还有另一个名为tseries的设置，可让您查看内存使用情况。

head(summaryRprof("perm.out", memory = "tseries"))
     vsize.small vsize.large    nodes duplications       stack:2
0.02     4050448    25558992 49908432         2048 "perm":"perm"
0.04       98808    15220400  1873760          780 "perm":"perm"
0.06       61832    12024184  1173256          489 "perm":"perm"
0.08       45400           0   861728          358 "perm":"perm"
0.1            0    14253568        0          495 "perm":"perm"
0.12       75752    21412320  1436120          599 "perm":"perm"

head(summaryRprof("permGtools.out", memory = "tseries"))
     vsize.small vsize.large    nodes duplications              stack:2
0.02     4685464    39860824 43891512            0 "permGtools":"rbind"
0.04      542080      552384 12520256            0 "permGtools":"rbind"
0.06           0           0        0            0 "permGtools":"rbind"
0.08      767992     1200864 17740912            0 "permGtools":"rbind"
0.1       500208      566592 11561312            0 "permGtools":"rbind"
0.12           0      151488        0            0 "permGtools":"rbind"

这里有很多事情，但要关注的是duplications领域。从summaryRprof的文档我们有：

它还记录在时间间隔内对内部函数重复的调用次数。当需要复制参数时，C代码调用duplicate。

比较每个实现中的副本数：

sum(summaryRprof("perm.out", memory = "tseries")$duplications)
[1] 121006

sum(summaryRprof("permGtools.out", memory = "tseries")$duplications)
[1] 0

所以我们看到OP的实现需要制作许多副本。我想这并不奇怪，因为所需的对象是函数原型中的参数。也就是说，P是要返回的排列矩阵，并且随着每次迭代不断变得越来越大。每次迭代，我们都将它传递给perm。你会注意到在gtools实现中，情况并非如此，因为它只是两个数值和一个参数的向量。

摘要

所以你有它，OP的原始实现不仅会产生更多的递归调用，而且还需要许多副本，这反过来又会使内存陷入困境，从而大幅降低效率。

Answer 2

使用permGeneral的RcppAlgos可能更好

P <- perm(1:5) # OP's function

library(RcppAlgos)
P1 <- permuteGeneral(5, 5)
all.equal(P, P1, check.attributes = FALSE)
#[1] TRUE

Benchmarks

稍微长一点的序列

system.time({

   P2 <- permuteGeneral(8, 8)
  })
#user  system elapsed 
#  0.001   0.000   0.001 

system.time({

   P20 <- perm(1:8) #OP's function
})
# user  system elapsed 
# 31.254  11.045  42.226 

all.equal(P2, P20, check.attributes = FALSE)
#[1] TRUE

通常，递归函数可能需要更长的时间，因为对函数的递归调用需要更多的执行时间

如何优化递归函数来查找所有排列？

问题描述投票：2回答：2

2个回答

为什么OP的实现更慢？

深层发掘

摘要

Benchmarks

最新问题

如何优化递归函数来查找所有排列？

问题描述 投票：2回答：2

2个回答

为什么OP的实现更慢？

深层发掘

摘要

Benchmarks

最新问题

问题描述投票：2回答：2