我正在尝试获得 2 个无限生成器的乘积,但是
product
中的 itertools
函数不允许这种 行为。
行为示例:
from itertools import *
i = count(1)
j = count(1)
x = product(i, j)
[Killed]
我想要什么:
x = product(i, j)
((0,0), (0,1), (1,0), (1,1) ...)
组合返回的顺序并不重要,只要给定无限时间,最终都会生成所有组合。这意味着给定元素组合,返回的生成器中必须有一个有限索引。
pip install infinite
:
from itertools import count
from infinite import product
for x, y in product(count(0), count(0)):
print(x, y)
if (x, y) == (3, 3):
break
(a0, b1), (a0, b2), (a0, b3), ... (a0, bn), ...
因此,您可以从第一个生成器中取出第一个元素,然后循环第二个。如果你真的想这样做,你需要一个嵌套循环,并且你需要用
tee
复制嵌套生成器,否则你将无法第二次循环它(即使没关系)因为你永远不会耗尽生成器,所以你永远不需要循环). 但是如果你真的真的很想这么做,这里就有:
from itertools import tee
def product(gen1, gen2):
for elem1 in gen1:
gen2, gen2_copy = tee(gen2)
for elem2 in gen2_copy:
yield (elem1, elem2)
我们的想法是始终制作 gen2
的单个副本。首先尝试使用有限生成器。
print(list(product(range(3), range(3,5))))
[(0, 3), (0, 4), (1, 3), (1, 4), (2, 3), (2, 4)]
然后是无限的:
print(next(product(count(1), count(1))))
(1, 1)
为了做到这一点,您需要缓存以前的值,因此我创建了一个类
GenCacher
来缓存以前提取的值:
class GenCacher:
def __init__(self, generator):
self._g = generator
self._cache = []
def __getitem__(self, idx):
while len(self._cache) <= idx:
self._cache.append(next(self._g))
return self._cache[idx]
之后你只需要实现zig-zag算法即可:
def product(gen1, gen2):
gc1 = GenCacher(gen1)
gc2 = GenCacher(gen2)
idx1 = idx2 = 0
moving_up = True
while True:
yield (gc1[idx1], gc2[idx2])
if moving_up and idx1 == 0:
idx2 += 1
moving_up = False
elif not moving_up and idx2 == 0:
idx1 += 1
moving_up = True
elif moving_up:
idx1, idx2 = idx1 - 1, idx2 + 1
else:
idx1, idx2 = idx1 + 1, idx2 - 1
from itertools import count
for x, y in product(count(0), count(0)):
print(x, y)
if x == 2 and y == 2:
break
这会产生以下输出:
0 0
0 1
1 0
2 0
1 1
0 2
0 3
1 2
2 1
3 0
4 0
3 1
2 2
(0,0)
(索引之和)的距离。
(0,0)
是唯一一个距离为 0 的,
(1,0)
和
(0,1)
距离为 1,等等
GenCacher
类,但代码变成:
def summations(sumTo, n=2):
if n == 1:
yield (sumTo,)
else:
for head in xrange(sumTo + 1):
for tail in summations(sumTo - head, n - 1):
yield (head,) + tail
def product(*gens):
gens = map(GenCacher, gens)
for dist in count(0):
for idxs in summations(dist, len(gens)):
yield tuple(gen[idx] for gen, idx in zip(gens, idxs))
def product(a, b):
a, a_copy = itertools.tee(a, 2)
b, b_copy = itertools.tee(b, 2)
yield (next(a_copy), next(b_copy))
size = 1
while 1:
next_a = next(a_copy)
next_b = next(b_copy)
a, new_a = itertools.tee(a, 2)
b, new_b = itertools.tee(b, 2)
yield from ((next(new_a), next_b) for i in range(size))
yield from ((next_a, next(new_b)) for i in range(size))
yield (next_a, next_b)
size += 1
带有
itertools.tee
的自制解决方案。这会使用大量内存,因为中间状态存储在
tee
中
这有效地返回了不断扩大的正方形的边:
0 1 4 9
2 3 5 a
6 7 8 b
c d e f
给定无限的时间和无限的内存,此实现
应该返回所有可能的产品。
所以这样的事情可能会起作用:
def product(i, j):
"""Generate Cartesian product i x j; potentially uses a lot of memory."""
earlier_values_i = []
earlier_values_j = []
# If either of these fails, that sequence is empty, and so is the
# expected result. So it is correct that StopIteration is raised,
# no need to do anything.
next_i = next(i)
next_j = next(j)
found_i = found_j = True
while True:
if found_i and found_j:
yield (next_i, next_j)
elif not found_i and not found_j:
break # Both sequences empty
if found_i:
for jj in earlier_values_j:
yield (next_i, jj)
if found_j:
for ii in earlier_values_i:
yield (ii, next_j)
if found_i:
earlier_values_i.append(next_i)
if found_j:
earlier_values_j.append(next_j)
try:
next_i = next(i)
found_i = True
except StopIteration:
found_i = False
try:
next_j = next(j)
found_j = True
except StopIteration:
found_j = False
这在我的脑海中是如此简单,但在打印出来后看起来非常复杂,一定有一些更简单的方法。但我认为它会起作用。
该算法的复杂度为
O(n+m)
,其中
mn
是正在生成的集合的基数(在无限情况下,需要
O(n+m)
内存来生成第 (
mn
) 个元素)。
def infinite_card_prod(i, j):
i_reversed = []
j_forward = []
n = 0
while True:
n += 1
i_reversed.append(next(i))
j_forward.append(next(j))
for k in range(n):
yield (i_reversed[n-1-k], j_forward[k])
from itertools import *
i = count(1)
j = count(1)
for e in ((x, y) for x in i for y in j):
yield r
或者在python3中:
yield from ((x, y) for x in i for y in j)