Regex: find all substrings in a text

Question

I have a long text; here is part of it:

C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, 
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B].

I need to find all substrings like these:

[03_SNYuLOOO IC "Story Group".]
[04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, 
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B]

I tried

re.findall(r'^\[\d{2}_[\s\S]+\]$', text)

but it returns an empty list. What am I doing wrong?

Answer 1

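The ^ and $ anchors require the pattern to match the entire string, and the sample text neither starts with [ nor ends with ], so nothing matches. In addition, [\s\S]+ matches one or more of any character as greedily as possible, swallowing every [ and ] on its way to the end of the string, so the final \] can only match the rightmost ] in the string.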
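To make the explanation above concrete, here is a minimal sketch. The short `text` value is a made-up stand-in for the question's input (an assumption, chosen only to keep the example small); it compares the anchored pattern with the same pattern after the anchors are dropped.

```python
import re

# Hypothetical, shortened stand-in for the text in the question.
text = 'Name: [03_first item.]\n). - [04_second\nitem].'

# Anchored: ^ must match at the very start, where the text has 'N', not '['.
anchored = re.findall(r'^\[\d{2}_[\s\S]+\]$', text)

# Unanchored but still greedy: [\s\S]+ runs to the end of the string and
# backtracks only as far as the rightmost ']', merging both items.
greedy = re.findall(r'\[\d{2}_[\s\S]+\]', text)

print(anchored)     # []
print(len(greedy))  # 1
```

Dropping the anchors alone is therefore not enough: the greedy [\s\S]+ still joins the bracketed items into a single match.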

You can use the following regex:

r'\[\d{2}_[^]]+]'

See the regex demo.

Details

\[ - a literal [
\d{2} - two digits
_ - an underscore
[^]]+ - one or more characters other than ]
] - a literal ].
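The breakdown above can also be written directly into the pattern with re.VERBOSE, which lets each piece carry its own comment. This is only a sketch of an alternative layout; the names `pattern` and `sample` and the sample string are assumptions made for illustration.

```python
import re

# Same regex as above, spread over several lines with comments.
pattern = re.compile(r'''
    \[        # a literal [
    \d{2}     # two digits
    _         # an underscore
    [^]]+     # one or more characters other than ]
    ]         # a literal ]
''', re.VERBOSE)

# Hypothetical sample: '[2_y]' is skipped because it has only one digit.
sample = 'a [01_x] b [2_y] c [10_multi\nline] d'
matches = pattern.findall(sample)
print(matches)  # ['[01_x]', '[10_multi\nline]']
```

Because the negated class [^]] also matches newlines, items that span several lines are found without any extra flags.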

See the Python demo:

import re
s='''C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, 
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B].'''
print(re.findall(r'\[\d{2}_[^]]+]', s))
# => ['[03_SNYuLOOO IC "Story Group".]', '[04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, \nul. Krasnobogatyrskaya, 2, is built.\n2, floor 3. com. 11. Office B]']
