R - Generate unique sequences from a list with duplicates

Question

I would like to generate the unique orderings of the elements of a list in R, where some of the elements are not unique:

sequence <- c(1,0,1,0)

For example:

result <- f(sequence)  # f stands for the function I am looking for
result:
  seq1 seq2 seq3 seq4 seq5 seq6
1    1    1    0    0    0    1
2    0    1    0    1    1    0
3    1    0    1    0    1    0
4    0    0    1    1    0    1

Note that every sequence contains each element of the original sequence, so each sequence always sums to 2.

gtools returns "too few different elements":

result <- gtools::permutations(4, 4, sequence)

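I did not find any SO post that addresses this directly. The ones I found allow elements to repeat instead: "Creating combination of sequences" can be done with expand.grid and sequences of different lengths.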

Edit: the above is a minimal example. Ideally the solution would work for the sequence:

 sequence = c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

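It is fairly important that the solution does not generate duplicates that are removed afterwards, because generating duplicates becomes computationally demanding for longer sequences (e.g. length 20 or 30).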

Answer 1

There are a few packages built specifically for this.

First, the arrangements package:

## sequence is a bad name as it is a base R function so we use s instead
s <- c(1,0,1,0)
arrangements::permutations(unique(s), length(s), freq = table(s))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    1    0    1    0
[3,]    1    0    0    1
[4,]    0    1    1    0
[5,]    0    1    0    1
[6,]    0    0    1    1

Next, we have RcppAlgos (I am the author):

RcppAlgos::permuteGeneral(unique(s), length(s), freqs = table(s))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    1    0    1    0
[3,]    1    0    0    1
[4,]    0    1    1    0
[5,]    0    1    0    1
[6,]    0    0    1    1

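They are also very efficient. To give you an idea, for the OP's actual use case the other approaches will either fail (I believe there is a limit on the number of rows of a matrix, 2^31 - 1, but I am not sure) or take a very long time, because they have to generate 16! ~= 2.092e+13 permutations before doing any further processing. With these two packages, however, the result is returned instantly: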

## actual example needed by OP
sBig <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

system.time(a <- arrangements::permutations(unique(sBig), length(sBig), freq = table(sBig)))
user  system elapsed 
0.001   0.001   0.002 

system.time(b <- RcppAlgos::permuteGeneral(unique(sBig), length(sBig), freqs = table(sBig)))
user  system elapsed 
0.001   0.001   0.002 

identical(a, b)
[1] TRUE

dim(a)
[1] 11440    16
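
The 11440 rows reported by dim(a) can be sanity-checked without any package. The following is a minimal sketch, assuming the number of distinct orderings of a vector with repeated values is the multinomial coefficient, computed here as a product of binomial coefficients. The helper name count_unique_perms is made up for this illustration and is not part of arrangements or RcppAlgos.

```r
# Hypothetical helper: counts the distinct orderings of x without generating them.
count_unique_perms <- function(x) {
  counts <- as.vector(table(x))   # how often each distinct value occurs
  remaining <- length(x)          # positions not yet assigned
  total <- 1
  for (k in counts) {
    # choose which of the remaining positions receive the current value
    total <- total * choose(remaining, k)
    remaining <- remaining - k
  }
  total
}

s    <- c(1, 0, 1, 0)
sBig <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

count_unique_perms(s)     # 6
count_unique_perms(sBig)  # 11440
```

Compared with 16! ~= 2.092e+13 for the naive approach, this shows how much work is saved by never generating the duplicates.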

Answer 2

m = apply(gtools::permutations(2, 4, 1:4, repeats.allowed = TRUE), 1, function(x) sequence[x])
m[,colSums(m) == 2]
#     [,1] [,2] [,3] [,4] [,5] [,6]
#[1,]    1    1    1    0    0    0
#[2,]    1    0    0    1    1    0
#[3,]    0    1    0    1    0    1
#[4,]    0    0    1    0    1    1
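
The code above builds all 2^4 = 16 sequences of 0s and 1s and keeps those that sum to 2. For a 0/1 vector the same set can probably be built without any discarded candidates, by choosing the positions of the 1s with base R's combn. This is only a sketch under the assumption that the input contains nothing but 0s and 1s and at least one 1; the function name binary_unique_perms is hypothetical.

```r
# Hypothetical helper for 0/1 vectors only: one row per unique ordering.
binary_unique_perms <- function(s) {
  n <- length(s)
  k <- sum(s == 1)                          # number of 1s to place
  pos <- combn(n, k)                        # k x choose(n, k): positions of the 1s
  out <- matrix(0, nrow = ncol(pos), ncol = n)
  rows <- rep(seq_len(ncol(pos)), each = k) # row index for every entry of pos
  out[cbind(rows, as.vector(pos))] <- 1     # set the chosen positions to 1
  out
}

small <- binary_unique_perms(c(1, 0, 1, 0))
dim(small)  # 6 4

big <- binary_unique_perms(c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1))
dim(big)    # 11440 16
```

The result stores one sequence per row, like the package output in Answer 1. Use t() to get one sequence per column as in the output above.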

Answer 3

Since you mentioned gtools::permutations, you could do the following.

First generate all permutations:

m <- apply(gtools::permutations(4, 4, 1:length(sequence)), 1, function(x) sequence[x])
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#[1,]    1    1    1    1    1    1    0    0    0     0     0     0     1     1
#[2,]    0    0    1    1    0    0    1    1    1     1     0     0     1     1
#[3,]    1    0    0    0    0    1    1    0    1     0     1     1     0     0
#[4,]    0    1    0    0    1    0    0    1    0     1     1     1     0     0
#     [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
#[1,]     1     1     1     1     0     0     0     0     0     0
#[2,]     0     0     0     0     1     1     0     0     1     1
#[3,]     1     0     1     0     0     1     1     1     1     0
#[4,]     0     1     0     1     1     0     1     1     0     1

Then remove the duplicated columns (which arise because the two 1s, and likewise the two 0s, are indistinguishable):

m[, !duplicated(apply(m, 2, paste, collapse = ""))]
#     [,1] [,2] [,3] [,4] [,5] [,6]
#[1,]    1    1    1    0    0    0
#[2,]    0    0    1    1    1    0
#[3,]    1    0    0    1    0    1
#[4,]    0    1    0    0    1    1
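
As a side note, the column de-duplication step does not need the paste trick: base R's duplicated() has a matrix method with a MARGIN argument. The small matrix below is made-up illustration data (not the output above), and the sketch only compares the two ways of dropping duplicated columns.

```r
# Made-up 4 x 4 example; column 3 repeats column 1.
m <- cbind(c(1, 0, 1, 0),
           c(1, 0, 0, 1),
           c(1, 0, 1, 0),
           c(0, 1, 1, 0))

# Approach used above: collapse each column to a string, then de-duplicate.
via_paste <- m[, !duplicated(apply(m, 2, paste, collapse = ""))]

# Alternative: let duplicated() compare the columns directly.
via_margin <- m[, !duplicated(m, MARGIN = 2)]

identical(via_paste, via_margin)  # TRUE
ncol(via_margin)                  # 3
```

Either way, all permutations are generated first, so this does not avoid the cost that the question asks about for long sequences.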
