I have some text; here is an example:
[04.1_Filialy 680031, Khabarovsk Territory, Khabarovsk, ul. District, 6, building b, office 3.] and [04.1_OGRN660050463454]
Desired output:
<address> 680031, Khabarovsk Territory, Khabarovsk, ul. District, 6, building b, office 3.<\address> and [04.1_OGRN660050463454]
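I need to take the matches that re.findall(r'\[[\d\.]+_(?:Filialy|MN)[^]]+]', str) finds in str and replace the leading r'\[[\d\.]+_(?:Filialy|MN)' and the trailing r']' with <address> and <\address>, respectively.
How can I do this?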
Put [^]]+ into a capturing group () and use re.sub().
Regex: \[[\d\.]+_(?:Filialy|MN)([^]]+)\]
Replacement: <address>\1<\\address>
Python code:
re.sub(r'\[[\d\.]+_(?:Filialy|MN)([^]]+)\]', r'<address>\1<\\address>', str)
Output:
<address> 680031, Khabarovsk Territory, Khabarovsk, ul. District, 6, building b, office 3.<\address> and [04.1_OGRN660050463454]
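For reference, the substitution above can be sketched as a small self-contained script. The names ADDRESS_TAG and wrap_addresses are hypothetical and used only for illustration. The variable is called text rather than str so the built-in type is not shadowed. Note that the captured group keeps the space that follows "Filialy", which matches the desired output shown in the question.

```python
import re

# Matches "[<digits/dots>_Filialy ...]" or "[<digits/dots>_MN ...]".
# Group 1 captures everything between the prefix and the closing bracket.
ADDRESS_TAG = re.compile(r'\[[\d.]+_(?:Filialy|MN)([^\]]+)\]')


def wrap_addresses(text: str) -> str:
    """Wrap the body of each Filialy/MN bracket in <address>...<\\address>."""
    # In the replacement template, \1 is the captured group and
    # \\ produces a single literal backslash.
    return ADDRESS_TAG.sub(r'<address>\1<\\address>', text)


sample = (
    "[04.1_Filialy 680031, Khabarovsk Territory, Khabarovsk, "
    "ul. District, 6, building b, office 3.] and [04.1_OGRN660050463454]"
)
result = wrap_addresses(sample)
print(result)
```

The [04.1_OGRN660050463454] part is left untouched because its prefix is neither Filialy nor MN. If the leading space inside the tag is unwanted, placing \s* just before the capturing group should drop it.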