Python: List comprehension rebinds names even after scope of comprehension. Is this right?

Comprehensions are having some unexpected interactions with scoping. Is this the expected behavior?

I've got a method:

def leave_room(self, uid):
  u = self.user_by_id(uid)
  r = self.rooms[u.rid]

  other_uids = [ouid for ouid in r.users_by_id.keys() if ouid != u.uid]
  other_us = [self.user_by_id(uid) for uid in other_uids]

  r.remove_user(uid) # OOPS! uid has been re-bound by the list comprehension above

  # Interestingly, it's rebound to the last uid in the list, so the error only shows
  # up when len > 1

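At the risk of whining, this is a brutal source of errors. As I write new code, I still occasionally hit very weird errors due to rebinding, even now that I know it's a problem. I need to make a rule like "always preface temp vars in list comprehensions with an underscore", but even that is not foolproof.

The fact that there's this random time bomb waiting kind of negates all the nice "ease of use" of list comprehensions.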


List comprehensions leak the loop control variable in Python 2 but not in Python 3. Here's Guido van Rossum (creator of Python) explaining the history behind this:

We also made another change in Python 3, to improve equivalence between list comprehensions and generator expressions. In Python 2, the list comprehension "leaks" the loop control variable into the surrounding scope:

x = 'before'
a = [x for x in 1, 2, 3]
print x # this prints '3', not 'before'

This was an artifact of the original implementation of list comprehensions; it was one of Python's "dirty little secrets" for years. It started out as an intentional compromise to make list comprehensions blindingly fast, and while it was not a common pitfall for beginners, it definitely stung people occasionally. For generator expressions we could not do this. Generator expressions are implemented using generators, whose execution requires a separate execution frame. Thus, generator expressions (especially if they iterate over a short sequence) were less efficient than list comprehensions.

However, in Python 3, we decided to fix the "dirty little secret" of list comprehensions by using the same implementation strategy as for generator expressions. Thus, in Python 3, the above example (after modification to use print(x) :-) will print 'before', proving that the 'x' in the list comprehension temporarily shadows but does not override the 'x' in the surrounding scope.
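
A quick way to check which behavior a given interpreter has is to rerun the quoted example. This is a minimal sketch that assumes Python 3; the tuple is parenthesized because the bare `1, 2, 3` form in the quoted snippet is Python 2 only syntax.

```python
x = 'before'
a = [x for x in (1, 2, 3)]  # parentheses around the tuple are required in Python 3

# In Python 3 the comprehension runs in its own scope, so the outer x is untouched.
# Under Python 2 the same lines would leave x bound to 3.
print(x)
```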


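Yes, list comprehensions "leak" their variable in Python 2.x, just like for loops.

In retrospect, this was recognized to be a mistake, and it was avoided with generator expressions. EDIT: As Matt B. notes, it was also avoided when the set and dictionary comprehension syntaxes were backported from Python 3.

The behavior of list comprehensions had to be left as it is in Python 2, but it is fully fixed in Python 3.

This means that in all of: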

list(x for x in a if x>32)
set(x//4 for x in a if x>32)         # just another generator exp.
dict((x, x//16) for x in a if x>32)  # yet another generator exp.
{x//4 for x in a if x>32}            # 2.7+ syntax
{x: x//16 for x in a if x>32}        # 2.7+ syntax

the x is always local to the expression, while these:

[x for x in a if x>32]
set([x//4 for x in a if x>32])         # just another list comp.
dict([(x, x//16) for x in a if x>32])  # yet another list comp.

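in Python 2.x all leak the x variable to the surrounding scope.

UPDATE for Python 3.8: PEP 572 introduced the := assignment expression operator, which deliberately leaks out of comprehensions and generator expressions! It was motivated by essentially two use cases: capturing a "witness" from early-terminating functions like any() and all():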

if any((comment := line).startswith('#') for line in lines):
    print("First comment:", comment)
else:
    print("There are no comments")

and updating mutable state:

total = 0
partial_sums = [total := total + v for v in values]

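See Appendix B of PEP 572 for the exact scoping rules. The variable is assigned in the closest surrounding def or lambda, unless that function declares it nonlocal or global.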
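
To make that scoping rule concrete, here is a small sketch that wraps both PEP 572 use cases in functions. It assumes Python 3.8 or later, and the function names `running_totals` and `first_comment` are made up for this illustration.

```python
def running_totals(values):
    total = 0
    # total is rebound in running_totals' own scope by :=,
    # while v stays private to the comprehension.
    sums = [total := total + v for v in values]
    return sums, total


def first_comment(lines):
    # comment is bound in first_comment's scope, so it is still
    # available after the generator expression has finished.
    if any((comment := line).startswith('#') for line in lines):
        return comment
    return None


print(running_totals([1, 2, 3]))
print(first_comment(['x = 1', '# note', '# later']))
```

Because any() stops at the first match, `comment` ends up holding the first line that starts with '#', not the last one.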


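Yes, assignment occurs there, just like it would in a for loop. No new scope is being created.

This is definitely the expected behavior: on each cycle, the value is bound to the name you specify. For instance,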

>>> x=0
>>> a=[1,54,4,2,32,234,5234,]
>>> [x for x in a if x>32]
[54, 234, 5234]
>>> x
5234

Once that is recognized, it seems easy enough to avoid: don't reuse existing names for the variables within comprehensions.
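
Applied to the method from the question, that advice could look like the sketch below. The `User`, `Room`, and `Server` classes are hypothetical stand-ins, since the question does not show them, and returning `other_us` is also an assumption made so the result can be inspected. The only real change is that the comprehensions use `other_uid`, which cannot collide with the `uid` parameter on either Python 2 or Python 3.

```python
class User(object):
    # Hypothetical stand-in for the question's user objects.
    def __init__(self, uid, rid):
        self.uid = uid
        self.rid = rid


class Room(object):
    # Hypothetical stand-in for the question's room objects.
    def __init__(self):
        self.users_by_id = {}

    def remove_user(self, uid):
        del self.users_by_id[uid]


class Server(object):
    # Hypothetical container holding the users and rooms.
    def __init__(self):
        self.users = {}
        self.rooms = {}

    def user_by_id(self, uid):
        return self.users[uid]

    def leave_room(self, uid):
        u = self.user_by_id(uid)
        r = self.rooms[u.rid]

        # other_uid is distinct from the uid parameter, so even a leaking
        # Python 2 list comprehension cannot rebind uid.
        other_uids = [other_uid for other_uid in r.users_by_id if other_uid != u.uid]
        other_us = [self.user_by_id(other_uid) for other_uid in other_uids]

        r.remove_user(uid)  # uid still refers to the caller's argument
        return other_us
```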


Interestingly, this doesn't affect dictionary or set comprehensions.

>>> [x for x in range(1, 10)]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> x
9
>>> {x for x in range(1, 5)}
set([1, 2, 3, 4])
>>> x
9
>>> {x:x for x in range(1, 100)}
{1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10, 11: 11, 12: 12, 13: 13, 14: 14, 15: 15, 16: 16, 17: 17, 18: 18, 19: 19, 20: 20, 21: 21, 22: 22, 23: 23, 24: 24, 25: 25, 26: 26, 27: 27, 28: 28, 29: 29, 30: 30, 31: 31, 32: 32, 33: 33, 34: 34, 35: 35, 36: 36, 37: 37, 38: 38, 39: 39, 40: 40, 41: 41, 42: 42, 43: 43, 44: 44, 45: 45, 46: 46, 47: 47, 48: 48, 49: 49, 50: 50, 51: 51, 52: 52, 53: 53, 54: 54, 55: 55, 56: 56, 57: 57, 58: 58, 59: 59, 60: 60, 61: 61, 62: 62, 63: 63, 64: 64, 65: 65, 66: 66, 67: 67, 68: 68, 69: 69, 70: 70, 71: 71, 72: 72, 73: 73, 74: 74, 75: 75, 76: 76, 77: 77, 78: 78, 79: 79, 80: 80, 81: 81, 82: 82, 83: 83, 84: 84, 85: 85, 86: 86, 87: 87, 88: 88, 89: 89, 90: 90, 91: 91, 92: 92, 93: 93, 94: 94, 95: 95, 96: 96, 97: 97, 98: 98, 99: 99}
>>> x
9

However, it has been fixed in Python 3, as noted above.


A workaround for Python 2.6, when this behaviour is not desirable:

# python
Python 2.6.6 (r266:84292, Aug  9 2016, 06:11:56)
Type "help", "copyright", "credits" or "license" for more information.
>>> x=0
>>> a=list(x for x in xrange(9))
>>> x
0
>>> a=[x for x in xrange(9)]
>>> x
8