Python: Find intersection of two nested lists?

I know how to get the intersection of two flat lists:

b1 = [1,2,3,4,5,9,11,15]
b2 = [4,5,6,7,8]
b3 = [val for val in b1 if val in b2]

def intersect(a, b):
    return list(set(a) & set(b))

print(intersect(b1, b2))

But my problems start when I have to find the intersection of nested lists:

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

In the end I would like to receive:

c3 = [[13,32],[7,13,28],[1,6]]

Can you guys give me a hand with this?

Related

  • Flattening a shallow list in Python


You don't need to define intersection. It's already a first-class part of set.

>>> b1 = [1,2,3,4,5,9,11,15]
>>> b2 = [4,5,6,7,8]
>>> set(b1).intersection(b2)
set([4, 5])


If you want:

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
c3 = [[13, 32], [7, 13, 28], [1,6]]

Then here is a solution for Python 2:

c3 = [filter(lambda x: x in c1, sublist) for sublist in c2]

In Python 3, filter returns an iterable instead of a list, so you need to wrap the filter call with list():

c3 = [list(filter(lambda x: x in c1, sublist)) for sublist in c2]

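Explanation:

The filter part takes each item of a sublist and checks whether it is in the source list c1. The list comprehension runs that filter once for each sublist in c2.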
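
As a small sketch of those two steps, the inner filter can be run on one sublist by itself before wrapping it in the comprehension. This uses Python 3 syntax, and the variable name first is only there for illustration:

```python
c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

# The inner step on its own: keep the items of one sublist that occur in c1.
first = list(filter(lambda x: x in c1, c2[0]))
print(first)  # [13, 32]

# The comprehension repeats that step for every sublist, in sublist order.
c3 = [list(filter(lambda x: x in c1, sublist)) for sublist in c2]
print(c3)  # [[13, 32], [7, 13, 28], [1, 6]]
```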


For people just looking to find the intersection of two lists, the asker provided two methods:

b1 = [1,2,3,4,5,9,11,15]
b2 = [4,5,6,7,8]
b3 = [val for val in b1 if val in b2]

and

def intersect(a, b):
    return list(set(a) & set(b))

print(intersect(b1, b2))

But there is a hybrid method that is more efficient, because you only have to do one conversion between list and set, as opposed to three:

b1 = [1,2,3,4,5]
b2 = [3,4,5,6]
s2 = set(b2)
b3 = [val for val in b1 if val in s2]

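This runs in O(n) on average, whereas the asker's original list-comprehension method runs in O(n^2), because each "val in b2" test scans the whole list.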
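
A minimal sketch of the hybrid idea wrapped in a function. The name intersect_ordered is hypothetical (it is not from the question), and the sample data is made up to show what happens to order and duplicates:

```python
def intersect_ordered(a, b):
    """Hypothetical helper: items of a that also occur in b, in a's order."""
    lookup = set(b)  # the single list -> set conversion
    # Each membership test against a set is O(1) on average.
    return [val for val in a if val in lookup]

b1 = [5, 4, 4, 3, 2, 1]
b2 = [3, 4, 5, 6]
print(intersect_ordered(b1, b2))  # [5, 4, 4, 3]
```

Unlike list(set(a) & set(b)), this keeps the order and any duplicates of the first list, which may or may not be what you want.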


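The functional approach:

from functools import reduce  # required on Python 3, where reduce is no longer a builtin

input_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]

result = reduce(set.intersection, map(set, input_list))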

It can be applied to the more general case of one or more lists.
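
Here is a sketch of what that gives on Python 3, including the single-list case. The variable names single and same are just for illustration:

```python
from functools import reduce  # reduce is not a builtin on Python 3

input_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]

# Folds pairwise: (set0 & set1) & set2
result = reduce(set.intersection, map(set, input_list))
print(sorted(result))  # [3, 4, 5]

# With a single list, reduce simply returns that list's set.
single = reduce(set.intersection, map(set, [[1, 2, 2]]))
print(sorted(single))  # [1, 2]

# Same result without reduce: set.intersection accepts several sets at once.
same = set.intersection(*map(set, input_list))
print(same == result)  # True
```

Note that reduce raises a TypeError on an empty input_list, since no initial value is given.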


Pure list comprehension version

>>> c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
>>> c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
>>> c1set = frozenset(c1)

Flatten variant:

>>> [n for lst in c2 for n in lst if n in c1set]
[13, 32, 7, 13, 28, 1, 6]

Nested variant:

>>> [[n for n in lst if n in c1set] for lst in c2]
[[13, 32], [7, 13, 28], [1, 6]]

The & operator takes the intersection of two sets.

{1,2,3} & {2,3,4}
Out[1]: {2, 3}
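
One difference worth knowing: the & operator needs a set on both sides, while the intersection() method accepts any iterable. A small sketch (the variable names are only for illustration):

```python
a = {1, 2, 3}
via_operator = a & {2, 3, 4}
via_method = a.intersection([2, 3, 4])  # any iterable is accepted here
print(sorted(via_operator), sorted(via_method))  # [2, 3] [2, 3]

try:
    a & [2, 3, 4]  # the operator form rejects a plain list
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)  # False
```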


One way to take the intersection of two lists is:

[x for x in list1 if x in list2]


Since intersect is defined, a basic list comprehension is enough:

>>> c3 = [intersect(c1, i) for i in c2]
>>> c3
[[32, 13], [28, 13, 7], [1, 6]]

Improved thanks to S.Lott's comment and TM's related comment:

>>> c3 = [list(set(c1).intersection(i)) for i in c2]
>>> c3
[[32, 13], [28, 13, 7], [1, 6]]

You should use this code to flatten the lists (taken from http://kogs-www.informatik.uni-hamburg.de/~meine/python-tricks). The code is untested, but I'm pretty sure it works:

def flatten(x):
    """flatten(sequence) -> list

    Returns a single, flat list which contains all elements retrieved
    from the sequence and all recursively contained sub-sequences
    (iterables).

    Examples:
    >>> [1, 2, [3,4], (5,6)]
    [1, 2, [3, 4], (5, 6)]
    >>> flatten([[[1,2,3], (42,None)], [4,5], [6], 7, MyVector(8,9,10)])
    [1, 2, 3, 42, None, 4, 5, 6, 7, 8, 9, 10]"""


    result = []
    for el in x:
        #if isinstance(el, (list, tuple)):
        # str/bytes are iterable too, but must be kept whole (basestring exists only on Python 2)
        if hasattr(el, "__iter__") and not isinstance(el, (str, bytes)):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

After you have flattened the lists, you perform the intersection in the usual way:

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

def intersect(a, b):
    return list(set(a) & set(b))

print(intersect(flatten(c1), flatten(c2)))


Given:

> c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]

> c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

I find the following code works well, and it may be more concise if you use set operations:

> c3 = [list(set(f)&set(c1)) for f in c2]

It gives:

> [[32, 13], [28, 13, 7], [1, 6]]

If order is needed:

> c3 = [sorted(list(set(f)&set(c1))) for f in c2]

we get:

> [[13, 32], [7, 13, 28], [1, 6]]

By the way, for a more Pythonic style, this one is fine too:

> c3 = [ [i for i in set(f) if i in c1] for f in c2]

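Do you consider [1,2] to intersect with [1, [2]]? That is, is it only the numbers you care about, or the list structure as well?

If it is only the numbers, look into how to "flatten" the lists, then use the set() method.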
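
A minimal sketch of that suggestion, assuming the nesting is only one level deep (as in the question's c2) so that itertools.chain.from_iterable is enough. The names flat_c2 and common are made up for illustration:

```python
from itertools import chain

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

# Flatten exactly one level of nesting, then intersect as sets.
flat_c2 = list(chain.from_iterable(c2))
common = sorted(set(c1) & set(flat_c2))
print(common)  # [1, 6, 7, 13, 28, 32]
```

This throws away the sublist structure, so it does not produce the c3 the asker wants. It only answers the "numbers only" reading of the question.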


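I don't know if I am late in answering your question. After reading it, I came up with a function intersect() that works on both flat lists and nested lists. I used recursion to define it, which is quite intuitive. I hope it is what you are looking for:

def intersect(a, b):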
    result=[]
    for i in b:
        if isinstance(i,list):
            result.append(intersect(a,i))
        else:
            if i in a:
                result.append(i)
    return result

Example:

>>> c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
>>> c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
>>> print(intersect(c1, c2))
[[13, 32], [7, 13, 28], [1, 6]]

>>> b1 = [1,2,3,4,5,9,11,15]
>>> b2 = [4,5,6,7,8]
>>> print(intersect(b1, b2))
[4, 5]

I was also looking for a way to do this, and eventually ended up with this:

def compareLists(a,b):
    removed = [x for x in a if x not in b]
    added = [x for x in b if x not in a]
    overlap = [x for x in a if x in b]
    return [removed,added,overlap]
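
To illustrate what the three returned lists contain, here is a small usage sketch. The function is repeated so the snippet runs on its own, and the input values are made up:

```python
def compareLists(a, b):
    removed = [x for x in a if x not in b]  # only in a
    added = [x for x in b if x not in a]    # only in b
    overlap = [x for x in a if x in b]      # in both, in a's order
    return [removed, added, overlap]

removed, added, overlap = compareLists([1, 2, 3, 4], [3, 4, 5])
print(removed)  # [1, 2]
print(added)    # [5]
print(overlap)  # [3, 4]
```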


To define an intersection that correctly takes the cardinality of the elements into account, use Counter:

>>> from collections import Counter
>>> c1 = [1, 2, 2, 3, 4, 4, 4]
>>> c2 = [1, 2, 4, 4, 4, 4, 5]
>>> list((Counter(c1) & Counter(c2)).elements())
[1, 2, 4, 4, 4]
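
The same idea could be applied per sublist for the nested case. A rough sketch, where the nested data and the names c1_counts and result are made up for illustration:

```python
from collections import Counter

c1 = [1, 2, 2, 3, 4, 4, 4]
nested = [[1, 2, 2, 2], [4, 4, 5], [6]]  # made-up example data

c1_counts = Counter(c1)  # count c1 once, reuse it for every sublist
# Counter & Counter keeps each element with the smaller of its two counts.
result = [sorted((Counter(sub) & c1_counts).elements()) for sub in nested]
print(result)  # [[1, 2, 2], [4, 4], []]
```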

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
c3 = [list(set(i) & set(c1)) for i in c2]
c3
[[32, 13], [28, 13, 7], [1, 6]]

For me this is a very elegant and quick way to do it :)


c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]

c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

c3 = [list(set(c2[i]).intersection(set(c1))) for i in range(len(c2))]

c3
->[[32, 13], [28, 13, 7], [1, 6]]


We can use set methods for this:

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

result = []
for li in c2:
    res = set(li) & set(c1)
    result.append(list(res))

print(result)

# Problem:  Given c1 and c2:
c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
# how do you get c3 to be [[13, 32], [7, 13, 28], [1, 6]] ?

Here's one way to set c3 that doesn't involve sets:

c3 = []
for sublist in c2:
    c3.append([val for val in c1 if val in sublist])

But if you prefer to use just one line, you can do this:

c3 = [[val for val in c1 if val in sublist] for sublist in c2]

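It's a list comprehension inside a list comprehension, which is a little unusual, but I think you shouldn't have too much trouble following it.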