The input text file contains:
<html>
<header>
<title>This is a title</title>
</header>
<body>
<div>This is a div <div>This is a nested div</div></div>
</body>
</html>
And I want to write the following to another text file:
<l>
<r>
<e>This is a title</e>
</r>
<y>
<v>This is a div <v>This is a nested div</v></v>
</y>
</l>
How can I do this with regex in Python? Update: I have tried something like this:
import re

def run():
    with open('input.txt') as f:
        fout = open('output.txt', 'w')
        count = 0
        for line in f:
            if not line:
                continue
            pat = re.findall('<[a-zA-Z]+>', line)
            for l in pat:
                y = re.sub('<[a-zA-Z]+>', '<{}>'.format(l[-2]), line, count=0, flags=0)
                fout.write(y)
I hope it's not too late to offer a solution. Here is my code:
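A minimal sketch of one regex approach, not necessarily the code originally posted here. It assumes the tags carry no attributes, as in the sample input. A single pattern matches both opening and closing tags, and a function passed to `re.sub` replaces each tag name with its own last letter. The names `TAG`, `shorten_tags`, and the `src`/`dst` parameters are illustrative.

```python
import re

# Matches an opening or closing tag without attributes, e.g. <div> or </div>.
# Group 1 is the optional slash, group 2 is the tag name.
TAG = re.compile(r'<(/?)([a-zA-Z]+)>')


def shorten_tags(text):
    """Replace each tag name with its last letter, keeping any leading slash."""
    return TAG.sub(lambda m: '<{}{}>'.format(m.group(1), m.group(2)[-1]), text)


def run(src='input.txt', dst='output.txt'):
    # Process the file line by line and write the converted lines out.
    with open(src) as fin, open(dst, 'w') as fout:
        for line in fin:
            fout.write(shorten_tags(line))
```

Calling `run()` reads `input.txt` and writes `output.txt`. Because the replacement is computed per match, two different tags on the same line each get their own letter, and closing tags are handled by the optional slash in the pattern.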