Searching for a pattern in a sentence with regex in Python

Question

I want to capture the number that follows a certain phrase, and I also want the start and end indices of that number.

Here is an example.

text = 'The special code is 034567 in this particular case and not 98675'

In this example, I want to capture the number 034567 that comes after the phrase special code, along with the start and end indices of 034567.

My code is:

p = re.compile('special code \s\w.\s (\d+)')
re.search(p, text)

But this does not match anything. Can you explain why, and how I should fix it?

Answer 1

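Your expression matches a literal space followed by any whitespace character (the \s pattern), then \w. matches any word character followed by any character other than a line break, and then \s followed by a space again requires two whitespace characters: any whitespace plus a literal space. The sample text has only a single space in each of those places, so nothing matches.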
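To make the failure concrete, the sketch below runs the original pattern against the sample sentence and against a second string with doubled spaces. The name padded_text and its contents are hypothetical, made up only to show the kind of input the original pattern would accept.

```python
import re

text = 'The special code is 034567 in this particular case and not 98675'
original = re.compile(r'special code \s\w.\s (\d+)')

# The sentence has one space after "code" and one after "is",
# but the pattern asks for two whitespace characters in each place.
no_match = original.search(text)
print(no_match)  # None

# Hypothetical input with doubled spaces, only to show what the pattern accepts.
padded_text = 'The special code  is  034567'
padded_match = original.search(padded_text)
print(padded_match.group(1))  # 034567
```

Note that \w. also pins the word between the phrase and the number to exactly two characters, which is another reason the pattern is fragile.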

You can match one or more whitespace characters between words with \s+, and instead of \w. you can use \S+ to match any run of non-whitespace characters.

Use

import re
text = 'The special code is 034567 in this particular case and not 98675'
p = re.compile(r'special code\s+\S+\s+(\d+)')
m = p.search(text)
if m:
    print(m.group(1)) # 034567
    print(m.span(1))  # (20, 26)

See the Python demo and the regex demo.
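If the start and end indices are needed as separate values rather than as a tuple, the match object's start() and end() methods accept a group number. A minimal sketch, with the variable names start and end chosen here for illustration:

```python
import re

text = 'The special code is 034567 in this particular case and not 98675'
m = re.search(r'special code\s+\S+\s+(\d+)', text)
if m:
    # Indices of capture group 1; end is exclusive, like a slice bound.
    start, end = m.start(1), m.end(1)
    print(start, end)       # 20 26
    print(text[start:end])  # 034567
```

Because the end index is exclusive, text[start:end] gives back exactly the captured digits.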

Answer 2

Use re.findall with a capturing group:

import re

text = "The special code is 034567 in this particular case and not 98675"
# Optionally skip one word (such as "is") between the phrase and the digits.
matches = re.findall(r'\bspecial code (?:\S+\s+)?(\d+)', text)
print(matches)

This prints:

['034567']
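re.findall returns only the captured strings, so it does not give the indices the question asks for. One possible way to keep the same pattern and still get positions is re.finditer, which yields match objects. This is a sketch, and the names pattern and results are illustrative:

```python
import re

text = "The special code is 034567 in this particular case and not 98675"
pattern = re.compile(r'\bspecial code (?:\S+\s+)?(\d+)')

# Pair each captured number with its (start, end) span in the text.
results = [(m.group(1), m.span(1)) for m in pattern.finditer(text)]
print(results)  # [('034567', (20, 26))]
```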
