Regex: find all substrings in a text

Question  Votes: 1  Answers: 1

I have a long text; here is part of it:

C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, 
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B].

I need to find all substrings like these:

[03_SNYuLOOO IC "Story Group".]
[04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, 
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B]

I tried

re.findall(r'^\[\d{2}_[\s\S]+\]$', text)

but it returns an empty list. What am I doing wrong?

python regex
1 Answer
2 votes

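The ^ and $ anchors require the entire string to match the pattern. The sample text starts with "C:" and ends with ".", so the anchored pattern cannot match at all. In addition, [\s\S]+ is greedy: it matches one or more of any character, as many as possible, running past every intermediate [ and ] toward the end of the string, so the final \] would match the rightmost ] in the string.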
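To see both effects in isolation, here is a minimal sketch. The `text` value is the sample from the question; `bracketed` is a hypothetical input made up for illustration, chosen so that it does start with [ and end with ].

```python
import re

text = '''C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, 
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B].'''

anchored = r'^\[\d{2}_[\s\S]+\]$'

# The text starts with "C:" and ends with ".", so ^ and $ cannot both hold.
print(re.findall(anchored, text))
# => []

# Hypothetical input that starts with "[" and ends with "]":
# the greedy [\s\S]+ runs to the rightmost "]", giving one big match.
bracketed = '[01_first] and [02_second]'
print(re.findall(anchored, bracketed))
# => ['[01_first] and [02_second]']
```

So even when the anchors can be satisfied, the anchored greedy pattern returns a single match spanning everything rather than one match per bracketed item.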

You can use the following regex:

r'\[\d{2}_[^]]+]'

regex demo

Details

  • \[ - a literal [
  • \d{2} - two digits
  • _ - an underscore
  • [^]]+ - one or more characters other than ] (newlines included)
  • ] - a literal ]
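The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE. This is only a sketch of an equivalent spelling, not a different regex; the names `pattern` and `sample` are made up for illustration.

```python
import re

# Same pattern as r'\[\d{2}_[^]]+]', written with re.VERBOSE so that
# each part can carry its own comment.
pattern = re.compile(r'''
    \[        # a literal [
    \d{2}     # two digits
    _         # an underscore
    [^]]+     # one or more characters other than ], newlines included
    ]         # a literal ]
''', re.VERBOSE)

# Hypothetical sample: the middle item is skipped because "ab" is not two digits.
sample = 'a [12_x y] b [ab_no] c [34_multi\nline] d'
print(pattern.findall(sample))
# => ['[12_x y]', '[34_multi\nline]']
```

Because the negated class [^]] also matches newlines, no re.DOTALL flag is needed for items that span several lines.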

Python demo

import re
s='''C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, 
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B].'''
print(re.findall(r'\[\d{2}_[^]]+]', s))
# => ['[03_SNYuLOOO IC "Story Group".]', '[04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, \nul. Krasnobogatyrskaya, 2, is built.\n2, floor 3. com. 11. Office B]']