Python: idiom for flattening a shallow nested list: how does it work?

I found this bit of code in a module I'm working on:

l = opaque_function()
thingys = [x for y in l for x in y]

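I can't make sense of this. Through experimentation I've worked out that it flattens a 2-level nested list, but the syntax is still opaque to me. It has obviously omitted some optional brackets.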

>>> l = [[1,2],[3,4]]
>>> [x for y in l for x in y]
[1, 2, 3, 4]

My eyes want to parse it as either [x for y in [l for x in y] ] or [ [x for y in l] for x in y ], but both of those fail because y isn't defined.

How should I be reading this?

(I suspect I'll feel rather embarrassed once this is explained.)


This used to really confuse me. You should read it like a nested loop:

new_list = []
for y in l:
    for x in y:
        new_list.append(x)

becomes

for y in l for x in y [do] new_list.append(x)

becomes

[x for y in l for x in y]
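
The nested-loop reading also explains why the two bracketings tried in the question fail: in both of them, y is looked up before any `for y in ...` clause has bound it. Here is a minimal sketch of that, assuming no variable named y already exists in the surrounding scope. The helper `raises_name_error` is a hypothetical name introduced only for this illustration.

```python
l = [[1, 2], [3, 4]]

def raises_name_error(thunk):
    """Return True if calling thunk() raises NameError (hypothetical helper)."""
    try:
        thunk()
    except NameError:
        return True
    return False

# Nested-loop reading: `for y in l` binds y before `for x in y` uses it.
flat = [x for y in l for x in y]

# Bracketing 1: the inner list is built first, and it needs y, which is not bound yet.
bracketing_1_fails = raises_name_error(lambda: [x for y in [l for x in y]])

# Bracketing 2: the outer loop iterates over y, which is never bound anywhere.
bracketing_2_fails = raises_name_error(lambda: [[x for y in l] for x in y])

print(flat, bracketing_1_fails, bracketing_2_fails)
```

Both bracketings turn the single comprehension into two separate ones, so the inner and outer loops no longer share the variable y.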

From the list displays documentation:

When a list comprehension is supplied, it consists of a single expression followed by at least one for clause and zero or more for or if clauses. In this case, the elements of the new list are those that would be produced by considering each of the for or if clauses a block, nesting from left to right, and evaluating the expression to produce a list element each time the innermost block is reached.

So your expression can be rewritten as:

thingys = []
for y in l:
    for x in y:
        thingys.append(x)
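
To illustrate the "nesting from left to right" rule quoted above, here is a small sketch that extends the idiom by one level. The three-level list `cube` and the names `plane`, `row` and `item` are made up for this example; each extra for clause simply becomes one more nested block.

```python
# A hypothetical three-level nested list, used only for illustration.
cube = [[[1, 2], [3]], [[4], [5, 6]]]

# Clauses are read left to right, outermost loop first.
flat = [item for plane in cube for row in plane for item in row]

# The same thing written as explicit blocks, in the same order.
expanded = []
for plane in cube:
    for row in plane:
        for item in row:
            expanded.append(item)

assert flat == expanded
print(flat)  # [1, 2, 3, 4, 5, 6]
```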


You should read this as:

for y in l:
    for x in y:
        yield x

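This is the generator version, but all comprehensions share the same basic syntax: although the x is put up front, the rest of the expression is still read left to right. I was confused by this at first too, expecting it to be the other way around, but it makes sense once you add filtering expressions: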

>>> l = [[1,2,3,4,5], [1,"foo","bar"], [2,3]]
>>> [x for y in l
...    if len(y) < 4
...    for x in y
...    if isinstance(x, int)]
[1, 2, 3]

Now imagine having to write the whole thing backwards:

[x if isinstance(x, int)
   for x in y
   if len(y) < 4
   for y in l]

That would be confusing even to veteran Prolog programmers, not to mention the people maintaining Python parsers :)

The current syntax also matches the one in Haskell, which inspired list comprehensions in the first place.
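
The filtered example can be expanded the same way. The sketch below turns each for and if clause into a block in the order it was written, using yield as in the generator reading at the top of this answer. The function name `short_list_ints` is hypothetical and chosen only for illustration.

```python
def short_list_ints(nested):
    """Yield the ints found in sublists shorter than 4 items (illustrative name)."""
    for y in nested:                    # for y in l
        if len(y) < 4:                  #    if len(y) < 4
            for x in y:                 #    for x in y
                if isinstance(x, int):  #    if isinstance(x, int)
                    yield x

l = [[1, 2, 3, 4, 5], [1, "foo", "bar"], [2, 3]]

from_generator = list(short_list_ints(l))
from_comprehension = [x for y in l if len(y) < 4 for x in y if isinstance(x, int)]

assert from_generator == from_comprehension
print(from_generator)  # [1, 2, 3]
```

Each if clause filters at the level where it appears: the length test skips whole sublists, while the isinstance test skips individual items.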


lis = [x for y in l for x in y] is equivalent to:


lis = []
for y in l:
    for x in y:
        lis.append(x)