Time-efficient way to remove reversed duplicates from a nested list in Python?

Question

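I have a nested list containing millions of other lists (tuples at the moment). Within each list, an element can appear only once. I assumed every list was unique and that I therefore needed all of them, but I recently realized that my nested list contains pairs like this: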

listA = ('77', '131', '212', '69')
listB = ('69', '212', '131', '77')

Although listA and listB are distinct, one is just a reversed copy of the other. I still need to keep every unique ordering, because order matters.

listC = ('131', '69', '77', '212')

So listC, although it uses the same elements, counts as unique because of its order and must be kept.

If I removed all the reversed duplicates, I could shrink my nested list considerably (by about half), but I cannot find a time-efficient way to do it.

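Since it would be best to eliminate these reversed duplicates before they are ever added to my nested list, I have included below the class I use to build the list.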

class Graph(object):

    def __init__(self, graph_dict=None):
        """ Initializes a graph object.
            If no dictionary or None is given,
            an empty dictionary will be used. """
        if graph_dict is None:
            graph_dict = {}
        self.__graph_dict = graph_dict

    def find_all_paths(self, start_vertex, end_vertex, path=[]):
        """ Find all paths from start_vertex to end_vertex in graph """
        graph = self.__graph_dict
        path = path + [start_vertex]        
        if start_vertex == end_vertex:
            return [path]
        if start_vertex not in graph:
            return []
        paths = []
        for vertex in graph[start_vertex]:
            if vertex not in path:
                extended_paths = self.find_all_paths(vertex, end_vertex, path)
                for p in extended_paths:
                    if len(p) >= 2:
                        p = tuple(p)
                        paths.append(p)
        return paths

graph = Graph(vertexGraphDict)
nestedLists = graph.find_all_paths(begin, end)

vertexGraphDict is just a dictionary with vertices as keys, where each value is a list of the other vertices that vertex is connected to.
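To make the data structure concrete, here is a small sketch of such a dictionary and of how reversed pairs can arise. The four-vertex graph, the standalone function all_paths (a stand-in for Graph.find_all_paths), and the assumption that the graph is undirected and searched in both directions are all illustrative and not taken from the question.

```python
# Hypothetical undirected graph: every edge is listed in both directions.
vertexGraphDict = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
}


def all_paths(graph, start, end, path=()):
    """Standalone stand-in for Graph.find_all_paths that returns tuples."""
    path = path + (start,)
    if start == end:
        return [path]
    paths = []
    for vertex in graph.get(start, []):
        if vertex not in path:  # never visit a vertex twice in one path
            paths.extend(all_paths(graph, vertex, end, path))
    return paths


forward = all_paths(vertexGraphDict, "a", "d")
backward = all_paths(vertexGraphDict, "d", "a")
print(forward)
print(backward)
```

Under these assumptions, every path found from "d" to "a" is the reverse of a path found from "a" to "d", which is the kind of pairing described above.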

I tried to eliminate the reversed duplicates with the following approach:

reversedLists = []
for item in nestedLists:
    if item in reversedLists:
        nestedLists.remove(item)
    else:
        revItem = item[::-1] 
        reversedLists.append(revItem)

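This approach is slow. After removing the line p = tuple(p) from my class, I also tried revItem = list(reversed(item)); that was slow as well. Trying these approaches during list generation saves time overall, but it does not speed up the elimination step itself, which is what matters.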

Answer 1

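You can build an OrderedDict in which each key is the tuple itself, reversed only when its last item is lower than its first, and each value is the tuple as given. Then take the list of the OrderedDict's values: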

from collections import OrderedDict
l = [
    ('77', '131', '212', '69'),
    ('69', '212', '131', '77'),
    ('1', '2', '3', '4'),
    ('4', '1', '2', '3'),
    ('4', '3', '2', '1')
]
list(OrderedDict((t[::-1] if t[-1] < t[0] else t, t) for t in l).values())

Or, if you are using Python 3.7 or later, where dicts preserve insertion order, you can use a dict in place of the OrderedDict:

list({t[::-1] if t[-1] < t[0] else t: t for t in l}.values())

This returns:

[('69', '212', '131', '77'), ('4', '3', '2', '1'), ('4', '1', '2', '3')]
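Note that the dict keeps the last tuple seen for each key, which is why the result contains ('69', '212', '131', '77') rather than ('77', '131', '212', '69'). If keeping the first occurrence is preferable, a minimal sketch could look like this. The names canonical and first_seen are illustrative and not part of the answer, and the result order relies on Python 3.7+ dict ordering.

```python
def canonical(t):
    """Return the same key for a tuple and for its reverse."""
    return t[::-1] if t[-1] < t[0] else t


l = [
    ('77', '131', '212', '69'),
    ('69', '212', '131', '77'),
    ('1', '2', '3', '4'),
    ('4', '1', '2', '3'),
    ('4', '3', '2', '1'),
]

first_seen = {}
for t in l:
    # setdefault stores t only when its canonical key is not present yet,
    # so the first tuple of each reversed pair is the one that survives.
    first_seen.setdefault(canonical(t), t)

result = list(first_seen.values())
print(result)
```

The comparison here is between strings, so '69' < '77' is evaluated lexicographically. That is fine for this purpose, because the key only has to be chosen consistently for a tuple and its reverse.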

Answer 2

My approach is to convert each tuple to a list, reverse it, convert it back to a tuple, and remove that reversed tuple from the list if it is present.

l = [
    ('77', '131', '212', '69'),
    ('69', '212', '131', '77'),
    ('1', '2', '3', '4'),
    ('4', '1', '2', '3'),
    ('4', '3', '2', '1')
]

for t in l:
    lr = list(t)
    lr.reverse()
    tr = tuple(lr)
    if tr in l:
        l.remove(tr)

print(l)

I don't know how fast this will be, but here is the output.

[('77', '131', '212', '69'), ('1', '2', '3', '4'), ('4', '1', '2', '3')]
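On speed: both the test tr in l and the call l.remove(tr) scan the list, so the loop above does work proportional to the square of the list length, which is likely to hurt with millions of tuples. A possible variant with the same result tracks the tuples already kept in a set, where lookups take constant time on average. This is a sketch under the assumption that the tuples are hashable (tuples of strings are); the generator name drop_reversed is illustrative.

```python
def drop_reversed(paths):
    """Yield each tuple unless its reverse has already been yielded."""
    seen = set()
    for t in paths:
        if t[::-1] in seen:
            continue  # the reverse was kept earlier, so skip this one
        seen.add(t)
        yield t


l = [
    ('77', '131', '212', '69'),
    ('69', '212', '131', '77'),
    ('1', '2', '3', '4'),
    ('4', '1', '2', '3'),
    ('4', '3', '2', '1'),
]

result = list(drop_reversed(l))
print(result)
```

Because it is a generator, it can also wrap the path generation directly, so reversed duplicates are dropped before they are stored. Like the loop above, it does not remove exact (non-reversed) duplicates.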
