我有一种情况,我生成了一些模板嵌套列表,其中包含n个有组织的元素,其中模板中的每个数字对应于n个值的平面列表中的索引:
S =[[[2,4],[0,3]], [[1,5],[6,7]],[[10,9],[8,11],[13,12]]]
对于这些模板中的每一个,它们内部的值对应于平面列表中的索引值,如下所示:
A = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]
要得到;
B = [[["c","e"],["a","d"]], [["b","f"],["g","h"]],[["k","j"],["i","l"],["n","m"]]]
我如何使用列表A中的值填充结构S以获得B,考虑到: - 列表A的值可以更改值而不是数字 - 模板可以具有任何深度的嵌套结构但仅使用来自A的索引一次如上所示。
如果模板的深度不超过3个级别,我使用下面非常难看的附加unflatten函数执行此操作。是否有更好的方法使用生成器完成它,产量因此它适用于任何任意深度的模板。
我认为但无法实现的另一个解决方案是将模板设置为带有生成变量的字符串,然后使用eval()为变量分配新值
def unflatten(item, template):
# works up to 3 levels of nested lists
tree = []
for el in template:
if isinstance(el, collections.Iterable) and not isinstance(el, str):
tree.append([])
for j, el2 in enumerate(el):
if isinstance(el2, collections.Iterable) and not isinstance(el2, str):
tree[-1].append([])
for k, el3 in enumerate(el2):
if isinstance(el3, collections.Iterable) and not isinstance(el3, str):
tree[-1][-1].append([])
else:
tree[-1][-1].append(item[el3])
else:
tree[-1].append(item[el2])
else:
tree.append(item[el])
return tree
我需要一个更好的解决方案,可以在递归地执行上述操作以及n = 100的有组织元素时实现此目的。
我正在使用的计时功能是这个:
def timethis(func):
'''
Decorator that reports the execution time.
'''
@wraps(func)
def wrapper(*args, **kwargs):
start = time.time()
result = func(*args, **kwargs)
end = time.time()
print(func.__name__, end-start)
return result
return wrapper
我正在将@DocDrivin建议的功能包装在另一个内部,用一个单行调用它。下面是我丑陋的追加功能。
@timethis
def unflatten(A, S):
for i in range(100000):
# making sure that you don't modify S
rebuilt_list = copy.deepcopy(S)
# create the mapping dict
adict = {key: val for key, val in enumerate(A)}
# the recursive worker function
def worker(alist):
for idx, entry in enumerate(alist):
if isinstance(entry, list):
worker(entry)
else:
# might be a good idea to catch key errors here
alist[idx] = adict[entry]
#build list
worker(rebuilt_list)
return rebuilt_list
@timethis
def unflatten2(A, S):
for i in range (100000):
#up to level 3
temp_tree = []
for i, el in enumerate(S):
if isinstance(el, collections.Iterable) and not isinstance(el, str):
temp_tree.append([])
for j, el2 in enumerate(el):
if isinstance(el2, collections.Iterable) and not isinstance(el2, str):
temp_tree[-1].append([])
for k, el3 in enumerate(el2):
if isinstance(el3, collections.Iterable) and not isinstance(el3, str):
temp_tree[-1][-1].append([])
else:
temp_tree[-1][-1].append(A[el3])
else:
temp_tree[-1].append(A[el2])
else:
temp_tree.append(A[el])
return temp_tree
递归方法是更好的语法,但是,它比使用append方法慢得多。
A = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]
S = [[[2,4],[0,3]], [[1,5],[6,7]],[[10,9],[8,11],[13,12]]]
A = {k: v for k, v in enumerate(A)}
def worker(alist):
for e in alist:
if isinstance(e, list):
yield list(worker(e))
else:
yield A[e]
def do(alist):
return list(worker(alist))
这也是一种递归方法,只是避免单个项目分配,让list
通过从生成器中读取“热CPU”值来完成工作。如果你愿意,你可以从@DocDriven的答案中复制Try it online!-- setup1
和setup2
(但我建议你不要夸大数字,如果你想玩的话就在本地做)。
以下是示例时间数字:
My result: [0.11194685893133283, 0.11086182110011578, 0.11299032904207706]
result1: [1.0810202199500054, 1.046933784848079, 0.9381260159425437]
result2: [0.23467918601818383, 0.236218704842031, 0.22498539905063808]
您可以使用递归来完成此操作:
import copy
S =[[[2,4],[0,3]], [[1,5],[6,7]],[[10,9],[8,11],[13,12]]]
A = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]
# making sure that you don't modify S
B = copy.deepcopy(S)
# create the mapping dict
adict = {key: val for key, val in enumerate(A)}
# the recursive worker function
def worker(alist):
for idx, entry in enumerate(alist):
if isinstance(entry, list):
worker(entry)
else:
# might be a good idea to catch key errors here
alist[idx] = adict[entry]
worker(B)
print(B)
这为B产生以下输出:
[[['c', 'e'], ['a', 'd']], [['b', 'f'], ['g', 'h']], [['k', 'j'], ['i', 'l'], ['n', 'm']]]
我没有检查列表条目是否实际上可以与dict映射,因此您可能想要添加一个检查(在代码中标记了该点)。
小编辑:只是看到你想要的输出(可能)有一个错字。索引3映射到“d”,而不是“c”。您可能想要编辑它。
大编辑:为了证明我的提案不像乍一看似乎是灾难性的,我决定包含一些代码来测试它的运行时。看一下这个:
import timeit
setup1 = '''
import copy
S =[[[2,4],[0,3]], [[1,5],[6,7]],[[10,9],[8,11],[13,12]]]
A = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]
adict = {key: val for key, val in enumerate(A)}
# the recursive worker function
def worker(olist):
alist = copy.deepcopy(olist)
for idx, entry in enumerate(alist):
if isinstance(entry, list):
worker(entry)
else:
alist[idx] = adict[entry]
return alist
'''
code1 = '''
worker(S)
'''
setup2 = '''
import collections
S =[[[2,4],[0,3]], [[1,5],[6,7]],[[10,9],[8,11],[13,12]]]
A = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]
def unflatten2(A, S):
#up to level 3
temp_tree = []
for i, el in enumerate(S):
if isinstance(el, collections.Iterable) and not isinstance(el, str):
temp_tree.append([])
for j, el2 in enumerate(el):
if isinstance(el2, collections.Iterable) and not isinstance(el2, str):
temp_tree[-1].append([])
for k, el3 in enumerate(el2):
if isinstance(el3, collections.Iterable) and not isinstance(el3, str):
temp_tree[-1][-1].append([])
else:
temp_tree[-1][-1].append(A[el3])
else:
temp_tree[-1].append(A[el2])
else:
temp_tree.append(A[el])
return temp_tree
'''
code2 = '''
unflatten2(A, S)
'''
print(f'Recursive func: { [i/10000 for i in timeit.repeat(setup = setup1, stmt = code1, repeat = 3, number = 10000)] }')
print(f'Original func: { [i/10000 for i in timeit.repeat(setup = setup2, stmt = code2, repeat = 3, number = 10000)] }')
我正在使用timeit模块进行测试。运行此代码段时,您将获得与此类似的输出:
Recursive func: [8.74395573977381e-05, 7.868373290111777e-05, 7.9051584698027e-05]
Original func: [3.548609419958666e-05, 3.537480780214537e-05, 3.501355930056888e-05]
这些是10000次迭代的平均次数,我决定运行3次以显示波动。正如您所看到的,在这种特殊情况下,我的功能比原始功能慢2.22到2.50倍,但仍然可以接受。减速可能是由于使用deepcopy
。
你的测试有一些缺陷,例如您在每次迭代时重新定义映射字典。你不会这样做,而是在定义一次后将它作为函数的参数给出。