Python regex behaves unexpectedly on Cygwin

Question:

I am trying to extract the body of an HTML page using Python.
On the face of it, this is almost trivial, i.e.:

    re.compile('<body.*?>(.*)</body>', re.IGNORECASE|re.DOTALL)

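Indeed, several online regex validators confirm that the expression above works.
However, when I run the following script in my environment, the match object is None (NoneType). Any ideas?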

Test script:

    #!/bin/env python

    import re
    import urllib2

    def display_html(f):
        print f.read()

    def get_body(text):
        p = re.compile('<body.*?>(.*)</body>', re.IGNORECASE|re.DOTALL)
        print p, type(p)
        m = p.match(text)
        print m, type(m)

    def get_html_text(url):
        f = urllib2.urlopen(url)
        return f

    def to_text(f):
        return f.read()

    if __name__ == "__main__":
        url = "http://www.ibm.com/us/en/"  # A nicely formatted known page
        f = get_html_text(url)
        html_text = to_text(f)
        body = get_body(html_text)

Output:

    <_sre.SRE_Pattern object at 0xffe245c0> <type '_sre.SRE_Pattern'>
    None <type 'NoneType'>

My environment:

Python 2.7.3, CYGWIN_NT-6.1-WOW64 1.7.22(0.268/5/3) 2013-07-22 17:06 i686 Cygwin, Windows 7 x86-64.
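
A possible explanation, sketched below, is unrelated to Cygwin: in Python's re module, match() only attempts a match at the very start of the string, whereas search() scans the whole string, which is what most online validators do. Assuming the fetched page begins with a doctype or an <html> tag before <body>, match() would return None with the same pattern. The sample document and the helper names body_with_match and body_with_search are hypothetical and used only for illustration; the snippet is written so that it also runs on Python 3.

```python
import re

# Hypothetical sample page (an assumption); the script in the question
# fetches a real page over HTTP instead.
SAMPLE_HTML = """<!DOCTYPE html>
<html>
<head><title>Sample</title></head>
<body class="home">
<p>Hello</p>
</body>
</html>"""

# Same pattern and flags as in the question.
BODY_RE = re.compile(r'<body.*?>(.*)</body>', re.IGNORECASE | re.DOTALL)


def body_with_match(text):
    """Return the captured body using match(), which only tries position 0."""
    m = BODY_RE.match(text)
    return m.group(1) if m else None


def body_with_search(text):
    """Return the captured body using search(), which scans the whole string."""
    m = BODY_RE.search(text)
    return m.group(1) if m else None


if __name__ == "__main__":
    # match() fails because the text does not start with "<body".
    print(repr(body_with_match(SAMPLE_HTML)))
    # search() finds the <body> element further into the text.
    print(repr(body_with_search(SAMPLE_HTML)))
```

If this is indeed the cause, replacing p.match(text) with p.search(text) in get_body should yield a match object on any platform. For anything beyond a quick experiment, an HTML parser is usually more robust than a regular expression.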