Python: Regular expressions - check for a word within a string that starts with "Something"

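I have the following user agent header as a string:

"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/603.2.4 (KHTML, like Gecko) Version/10.1.1 Safari/603.2.4"

I want to search this string for any word that starts with "Version", but I also want to get the entire "word" it is part of, so in this example that would be "Version/10.1.1". My current regex only returns "Version"... so any pro regex tips would be great.

Here is the code I tried: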

import re

http_user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/603.2.4 (KHTML, like Gecko) Version/10.1.1 Safari/603.2.4"

if 'Safari' in http_user_agent and 'Mobile' not in http_user_agent:
    version = re.compile(r'\b({0}).*?'.format('Version'), flags=re.IGNORECASE).search(http_user_agent)
    print(version.group(0))

version.group(0) currently prints only "Version"... help!

This regex works, but it feels a bit lazy:

(Version.*? )

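The lazy .*? at the end of your pattern matches as few characters as possible, which here means the empty string. You can safely remove it from the pattern and add \S* in its place.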

version = re.compile(r'\b{0}\S*'.format('Version'), flags=re.IGNORECASE).search(http_user_agent)
                            ^^^

See the Python demo, which yields Version/10.1.1 as output.

Note: you do not need a capturing group around Version, so I also suggest removing the parentheses from the pattern.
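
To see why the original pattern stops at "Version", a short sketch comparing the two patterns on the user agent string from the question may help. The variable names ua, lazy and greedy are illustrative only.

```python
import re

ua = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/603.2.4 "
      "(KHTML, like Gecko) Version/10.1.1 Safari/603.2.4")

# Original pattern: the lazy .*? is the last token, so it matches zero characters.
lazy = re.search(r'\b(Version).*?', ua, flags=re.IGNORECASE)
print(lazy.group(0))    # Version

# Fixed pattern: \S* greedily consumes the following non-whitespace characters.
greedy = re.search(r'\bVersion\S*', ua, flags=re.IGNORECASE)
print(greedy.group(0))  # Version/10.1.1
```

Since only the whole match is needed, group(0) is enough and no capturing group is required.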

Note: you can also match the / explicitly to make the pattern more precise, and use

ZZU1

See another Python demo and a regex demo.

Details

- a word boundary
- a literal substring
- a / character
- digits
- a non-capturing group matching 0 or more times (due to the * quantifier):
  - a dot
  - digits
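
One pattern consistent with the components listed above is sketched below. It is an assumption reconstructed from that description, not necessarily the exact pattern the answer used, and the names ua and precise are illustrative.

```python
import re

ua = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/603.2.4 "
      "(KHTML, like Gecko) Version/10.1.1 Safari/603.2.4")

# Assumed pattern: word boundary, "Version", "/", digits,
# then zero or more groups of a dot followed by digits.
precise = re.compile(r'\bVersion/\d+(?:\.\d+)*', flags=re.IGNORECASE)

match = precise.search(ua)
print(match.group(0))  # Version/10.1.1

# Unlike \S*, this stops at the first character that is not part of the version number.
print(precise.search("Version/10.1.1;beta").group(0))  # Version/10.1.1
```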

Use the following:

import re

http_user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/603.2.4 (KHTML, like Gecko) Version/10.1.1 Safari/603.2.4"

if 'Safari' in http_user_agent and 'Mobile' not in http_user_agent:
    # Raw string for the inserted fragment so \d and \s reach the regex engine unchanged
    version = re.compile(r'\b({0}).*?'.format(r'Version[/\.\d]*\s'), flags=re.IGNORECASE).search(http_user_agent)
    print(version.group(0))

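Here we changed the regex from Version to Version[/\.\d]*\s so that it includes the digits, the dots and the / up to the space character. Note that the trailing \s makes the match include that space as well.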
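
A quick check of what this pattern returns, as a sketch: because \s is part of the pattern, the match carries the trailing space and only succeeds when whitespace follows the version. The helper name find_version and the constant PATTERN are hypothetical, introduced here for illustration.

```python
import re

# Same overall match as the pattern above, written without the capturing group
# and the trailing .*? (neither changes group(0)).
PATTERN = re.compile(r'\bVersion[/.\d]*\s', flags=re.IGNORECASE)

def find_version(user_agent):
    """Return the Version token without surrounding whitespace, or None."""
    match = PATTERN.search(user_agent)
    return match.group(0).strip() if match else None

ua = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/603.2.4 "
      "(KHTML, like Gecko) Version/10.1.1 Safari/603.2.4")

print(repr(PATTERN.search(ua).group(0)))  # 'Version/10.1.1 '
print(find_version(ua))                   # Version/10.1.1
print(find_version("Version/10.1.1"))     # None, no whitespace follows the token
```

If the version token can appear at the very end of the string, a pattern ending in \S* avoids the dependency on a trailing space.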