Python: How can I remove duplicate tuples from a list based on the value at a tuple index while maintaining the order of the tuples?

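I want to remove every tuple whose value at index 0 has already appeared, keeping only the first occurrence. I have looked at other similar questions but did not find the specific answer I need. Can someone help me? My attempt is below.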

from itertools import groupby
import random
Newlist = []

abc = [(1,2,3), (2,3,4), (1,0,3),(0,2,0), (2,4,5),(5,4,3), (0,4,1)]

Newlist = [random.choice(tuple(g)) for _, g in groupby(abc, key=lambda x: x[0])]
print(Newlist)

My expected output: [(1,2,3), (2,3,4), (0,2,0), (5,4,3)].

A simple approach is to loop over the list and keep track of the elements that have already been found:

abc = [(1,2,3), (2,3,4), (1,0,3),(0,2,0), (2,4,5),(5,4,3), (0,4,1)]
found = set()
NewList = []
for a in abc:
    if a[0] not in found:
        NewList.append(a)
    found.add(a[0])
print(NewList)
#[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]

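found is a set. On each iteration we check whether the first element of the tuple is already in found. If it is not, we append the whole tuple to NewList. At the end of each iteration we add the first element of the tuple to found.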
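
The same loop can be wrapped in a small reusable function. This is a minimal sketch rather than part of the original answer: the name `dedupe_by_index` and its `index` parameter are hypothetical, and it assumes the value at that index is hashable.

```python
def dedupe_by_index(items, index=0):
    """Keep the first tuple seen for each value at position `index`."""
    found = set()   # values already seen at position `index`
    result = []
    for item in items:
        if item[index] not in found:
            found.add(item[index])
            result.append(item)
    return result


abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]
print(dedupe_by_index(abc))
# [(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
```

Passing a different `index` deduplicates on another position of the tuple without changing the loop.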

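The itertools recipes (Python 2: itertools recipes, though in this case there is essentially no difference) include a recipe for this that is more general than @pault's implementation. It also uses a set: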

Python 2:

from itertools import ifilterfalse as filterfalse

Python 3:

from itertools import filterfalse

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

Using it:

abc = [(1,2,3), (2,3,4), (1,0,3),(0,2,0), (2,4,5),(5,4,3), (0,4,1)]
Newlist = list(unique_everseen(abc, key=lambda x: x[0]))
print(Newlist)
# [(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]

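This should be slightly faster because the set.add method is cached in a local name (which only really matters when abc is large), and it is more general because the key function is a parameter.

Apart from that, the same limitation I already mentioned in the comments applies: this only works if the first element of each tuple is actually hashable (numbers, as in the given example, of course are).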
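
If the key values might not be hashable (for example, when the first element of each tuple is a list), one fallback is to track the seen keys in a list and compare by equality. This is a sketch under that assumption; the name `unique_by_key_unhashable` is made up for illustration, and because each lookup is a linear scan the whole pass is quadratic in the worst case.

```python
def unique_by_key_unhashable(iterable, key):
    """Yield the first element for each distinct key, comparing keys with ==."""
    seen = []  # a list, because the keys may not be hashable
    for element in iterable:
        k = key(element)
        if k not in seen:   # linear scan using equality
            seen.append(k)
            yield element


data = [([1], 'a'), ([2], 'b'), ([1], 'c')]
print(list(unique_by_key_unhashable(data, key=lambda x: x[0])))
# [([1], 'a'), ([2], 'b')]
```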

@Patrickhaugh claims:

but the question is explicitly about maintaining the order of the
tuples. I don't think there's a solution using groupby

I have never been one to pass up an opportunity to use groupby(). Here is my solution, without sorting (once or twice):

from itertools import groupby, chain

abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]

Newlist = list((lambda s: chain.from_iterable(g for f, g in groupby(abc, lambda k: s.get(k[0]) != s.setdefault(k[0], True)) if f))({}))

print(Newlist)

Output:

% python3 test.py
[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
%
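
The one-liner is dense, so here is one way to unpack it into named steps. This is only an illustrative rewrite, and the helper name `is_first_occurrence` is hypothetical. The key function returns True the first time a leading value is seen and False afterwards, so groupby splits the list into alternating runs of first occurrences and repeats, and only the True runs are kept.

```python
from itertools import chain, groupby

abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]

seen = {}

def is_first_occurrence(t):
    # seen.get() returns None for a new key, then setdefault() stores True,
    # so the comparison is True only the first time t[0] appears.
    return seen.get(t[0]) != seen.setdefault(t[0], True)

# groupby yields consecutive runs sharing the same flag; keep the True runs.
runs = groupby(abc, is_first_occurrence)
Newlist = list(chain.from_iterable(group for flag, group in runs if flag))
print(Newlist)
# [(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
```

Note that the key function has a side effect (it mutates `seen`), which is what makes this work on unsorted input and also why it is more of a curiosity than a recommended pattern.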

A better option, using OrderedDict:

from collections import OrderedDict

abc = [(1,2,3), (2,3,4), (1,0,3), (0,2,0), (2,4,5),(5,4,3), (0,4,1)]
d = OrderedDict()
for t in abc:
    d.setdefault(t[0], t)
abc_unique = list(d.values())
print(abc_unique)

Output:

[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
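
On Python 3.7 and later, the built-in dict is guaranteed to preserve insertion order, so the same idea works without importing OrderedDict. A minimal sketch, assuming such an interpreter (the name `first_by_key` is chosen here for illustration):

```python
abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]

first_by_key = {}
for t in abc:
    # setdefault stores t only if t[0] is not already a key
    first_by_key.setdefault(t[0], t)

abc_unique = list(first_by_key.values())
print(abc_unique)
# [(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
```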

Simple but not very efficient:

abc = [(1,2,3), (2,3,4), (1,0,3), (0,2,0), (2,4,5),(5,4,3), (0,4,1)]
abc_unique = [t for i, t in enumerate(abc) if not any(t[0] == p[0] for p in abc[:i])]
print(abc_unique)

Output:

[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]

To use groupby correctly, the sequence must be sorted:

>>> [next(g) for k,g in groupby(sorted(abc, key=lambda x:x[0]), key=lambda x:x[0])]
[(0, 2, 0), (1, 2, 3), (2, 3, 4), (5, 4, 3)]
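
Sorting is needed because groupby only merges consecutive items that share a key; it starts a new group whenever the key changes. A small demonstration on the list from the question (the helper `first` and the variable names are just for illustration):

```python
from itertools import groupby

abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]

def first(t):
    return t[0]

# Unsorted input: no two neighbours share a first element, so every key
# change opens a new group and repeated keys show up again.
unsorted_keys = [k for k, _ in groupby(abc, key=first)]
print(unsorted_keys)
# [1, 2, 1, 0, 2, 5, 0]

# Sorted input: equal keys are adjacent, so each key appears exactly once.
sorted_keys = [k for k, _ in groupby(sorted(abc, key=first), key=first)]
print(sorted_keys)
# [0, 1, 2, 5]
```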

Or, if you need the exact order of your example (that is, the original order preserved):

>>> [t[2:] for t in sorted([next(g) for k,g in groupby(sorted([(t[0], i)+t for i,t in enumerate(abc)]), lambda x:x[0])], key=lambda x:x[1])]
[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]

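The trick here is to add a field that records each tuple's original position, so the original order can be restored after the groupby() step.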

Edit: slightly shorter:

>>> [t[1:] for t in sorted([next(g)[1:] for k,g in groupby(sorted([(t[0], i)+t for i,t in enumerate(abc)]), lambda x:x[0])])]
[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
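
The nested one-liners above can be hard to follow. The following is a step-by-step sketch of the same idea; the intermediate names (`decorated`, `firsts`, `result`) are chosen here for illustration and are not from the original answer.

```python
from itertools import groupby

abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]

# 1. Decorate: prefix each tuple with (key, original position).
decorated = [(t[0], i) + t for i, t in enumerate(abc)]

# 2. Sort so equal keys are adjacent; ties are ordered by original position.
decorated.sort()

# 3. Take the first (earliest) entry of each key group.
firsts = [next(g) for _, g in groupby(decorated, key=lambda x: x[0])]

# 4. Restore the original order by position, then strip the two added fields.
firsts.sort(key=lambda x: x[1])
result = [t[2:] for t in firsts]
print(result)
# [(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
```

This costs two sorts, so it is slower than the set-based loop, but it shows how groupby can be made to respect the original order.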
