无限生成器的Python乘积

Question

我正在尝试获得 2 个无限生成器的乘积，但是

product

中的

itertools

函数不允许这种行为。

行为示例：

from itertools import *
i = count(1)
j = count(1)
x = product(i, j)

[Killed]

我想要什么：

x = product(i, j)

((0,0), (0,1), (1,0), (1,1) ...)

组合返回的顺序并不重要，只要给定无限时间，最终都会生成所有组合。这意味着给定元素组合，返回的生成器中必须有一个有限索引。

Answer 1

tl；博士

下面提供的代码现已包含在 PyPI 上的包

infinite

中。所以现在你实际上可以在运行之前先pip install infinite

：

from itertools import count
from infinite import product

for x, y in product(count(0), count(0)):
    print(x, y)
    if (x, y) == (3, 3):
        break

懒惰的解决方案

如果你不关心顺序，因为生成器是无限的，有效的输出将是：

(a0, b1), (a0, b2), (a0, b3), ... (a0, bn), ...

因此，您可以从第一个生成器中取出第一个元素，然后循环第二个。

如果你真的想这样做，你需要一个嵌套循环，并且你需要用

tee

复制嵌套生成器，否则你将无法第二次循环它

（即使没关系）因为你永远不会耗尽生成器，所以你永远不需要循环）.

但是如果你真的真的很想这么做，这里就有：

from itertools import tee

def product(gen1, gen2):
    for elem1 in gen1:
        gen2, gen2_copy = tee(gen2)
        for elem2 in gen2_copy:
            yield (elem1, elem2)

我们的想法是始终制作

gen2

 的单个副本。首先尝试使用有限生成器。

print(list(product(range(3), range(3,5))))
[(0, 3), (0, 4), (1, 3), (1, 4), (2, 3), (2, 4)]

然后是无限的：

print(next(product(count(1), count(1))))
(1, 1)

之字形算法

正如其他人在评论中指出的那样（以及之前的解决方案中所述），这不会产生所有组合。然而，自然数和数字对之间的映射是存在的，因此必须可以以不同的方式迭代这些对，以便可以在有限的时间内找到特定的对（没有无限数），您需要 zig-锯齿扫描算法。

为了做到这一点，您需要缓存以前的值，因此我创建了一个类

GenCacher

 来缓存以前提取的值：

class GenCacher:
    def __init__(self, generator):
        self._g = generator
        self._cache = []

    def __getitem__(self, idx):
        while len(self._cache) <= idx:
            self._cache.append(next(self._g))
        return self._cache[idx]

之后你只需要实现zig-zag算法即可：

def product(gen1, gen2):
    gc1 = GenCacher(gen1)
    gc2 = GenCacher(gen2)
    idx1 = idx2 = 0
    moving_up = True

    while True:
        yield (gc1[idx1], gc2[idx2])

        if moving_up and idx1 == 0:
            idx2 += 1
            moving_up = False
        elif not moving_up and idx2 == 0:
            idx1 += 1
            moving_up = True
        elif moving_up:
            idx1, idx2 = idx1 - 1, idx2 + 1
        else:
            idx1, idx2 = idx1 + 1, idx2 - 1

示例

from itertools import count

for x, y in product(count(0), count(0)):
    print(x, y)
    if x == 2 and y == 2:
        break

这会产生以下输出：

将解决方案扩展到 2 个以上的发电机

我们可以对解决方案进行一些编辑，使其甚至适用于多个生成器。基本思想是：

迭代距

(0,0)

（索引之和）的距离。

(0,0)

 是唯一一个距离为 0 的，

(1,0)

和

(0,1)

 距离为 1，等等

生成该距离的所有索引元组
提取对应元素

我们仍然需要

GenCacher

 类，但代码变成：

def summations(sumTo, n=2):
    if n == 1:
        yield (sumTo,)
    else:
        for head in xrange(sumTo + 1):
            for tail in summations(sumTo - head, n - 1):
                yield (head,) + tail

def product(*gens):
    gens = map(GenCacher, gens)

    for dist in count(0):
        for idxs in summations(dist, len(gens)):
            yield tuple(gen[idx] for gen, idx in zip(gens, idxs))

Answer 2

 def product(a, b):
     a, a_copy = itertools.tee(a, 2)
     b, b_copy = itertools.tee(b, 2)
     yield (next(a_copy), next(b_copy))
     size = 1
     while 1:
         next_a = next(a_copy)
         next_b = next(b_copy)
         a, new_a = itertools.tee(a, 2)
         b, new_b = itertools.tee(b, 2)
         yield from ((next(new_a), next_b) for i in range(size))
         yield from ((next_a, next(new_b)) for i in range(size))
         yield (next_a, next_b)
         size += 1

带有

itertools.tee

 的自制解决方案。这会使用大量内存，因为中间状态存储在

tee

中

这有效地返回了不断扩大的正方形的边：

给定无限的时间和无限的内存，此实现

应该返回所有可能的产品。

Answer 3

无论你怎么做，内存都会增加很多，因为每个迭代器的每个值在第一次之后都会出现无限次，所以它必须保存在一些不断增长的变量中。

所以这样的事情可能会起作用：

def product(i, j):
    """Generate Cartesian product i x j; potentially uses a lot of memory."""
    earlier_values_i = []
    earlier_values_j = []

    # If either of these fails, that sequence is empty, and so is the
    # expected result. So it is correct that StopIteration is raised,
    # no need to do anything.
    next_i = next(i)
    next_j = next(j)
    found_i = found_j = True

    while True:
        if found_i and found_j:
            yield (next_i, next_j)
        elif not found_i and not found_j:
            break  # Both sequences empty

        if found_i: 
            for jj in earlier_values_j:
                yield (next_i, jj)
        if found_j:
            for ii in earlier_values_i:
                yield (ii, next_j)

        if found_i:
            earlier_values_i.append(next_i)
        if found_j:
            earlier_values_j.append(next_j)

        try:
            next_i = next(i)
            found_i = True
        except StopIteration:
            found_i = False

        try:
            next_j = next(j)
            found_j = True
        except StopIteration:
            found_j = False

这在我的脑海中是如此简单，但在打印出来后看起来非常复杂，一定有一些更简单的方法。但我认为它会起作用。

Answer 4

这是我的解决方案。它使用修改后的 zigzag 算法，其中 (x, y) 之后的下一个元素是 (x-1, y+1)，但对角线末端除外。

该算法的复杂度为

O(n+m)

，其中

mn

 是正在生成的集合的基数（在无限情况下，需要

O(n+m)

 内存来生成第 (

mn

) 个元素）。

def infinite_card_prod(i, j):
    i_reversed = []
    j_forward = []
    n = 0
    while True:
        n += 1
        i_reversed.append(next(i))
        j_forward.append(next(j))
        for k in range(n):
            yield (i_reversed[n-1-k], j_forward[k])

Answer 5

您可以简单地使用生成器表达式：

from itertools import *
i = count(1)
j = count(1)

for e in ((x, y) for x in i for y in j):
    yield r

或者在python3中：

yield from ((x, y) for x in i for y in j)

无限生成器的Python乘积

问题描述投票：0回答：5

5个回答

最新问题

无限生成器的Python乘积

问题描述 投票：0回答：5

5个回答

最新问题

问题描述投票：0回答：5