I'm new to Python and working on an assignment. I'm trying to split a 10,000-line text file into multiple files based on a list of keywords.
input.txt looks like this:
Name: Apple
Type: Fruits
Description:...
Name: Orange
Type: Fruits
Description:...
Name: Yellow
Type: Colour
Description:...
Name: Apple
Type: Fruits
Description:...
Name: Orange
Type: Fruits
Description:...
Name: Yellow
Type: Colour
Description:...
Keywords:
Apple
Orange
Yellow
Expected output files:
Apple.txt
Type: Fruits
Description:
Orange.txt
Type: Fruits
Description:
Yellow.txt
Type: Colour
Description:
But my current code can only split the file when the key is "Apple". I don't know how to modify it to work with a list of keywords.
key = ['Apple']
outfile = None
fno = 0
lno = 0
with open('input.txt') as infile:
    while line := infile.readline():
        lno += 1
        if outfile is None:
            fno += 1
            outfile = open(f'{fno}.txt', 'w')
        outfile.write(line)
        if key in line:
            print(f'"{key}" found in line {lno}')
            outfile.close()
            outfile = None
if outfile:
    outfile.close()
Edit: it should output the first record for each keyword.
Here is a more idiomatic version of your code. It does not hard-code the list of keywords; it simply picks up whatever follows Name:
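seen = set()      # keywords whose first record has already been written
outfile = None    # file currently being written, or None between records
with open('input.txt') as infile:
    for line in infile:
        if line.startswith(' Name: '):
            # Everything after the prefix, minus the trailing newline
            keyword = line[len(' Name: '):-1]
            if keyword not in seen:
                outfile = open(f'{keyword}.txt', 'w')
                seen.add(keyword)
        if outfile is not None:
            if line.strip() == '':
                # A blank line ends the current record
                outfile.close()
                outfile = None
            else:
                outfile.write(line)
# Close the last file if the input did not end with a blank line
if outfile is not None:
    outfile.close()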
You never did anything useful with lno, but if you want it for some reason, the idiomatic way to get line numbers is:
for lno, line in enumerate(infile, start=1):
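As a small self-contained illustration of that loop header, the sketch below reports the line number of every Name: line. The helper name find_name_lines and the sample text are made up for this example, and io.StringIO merely stands in for the open file.

```python
import io

def find_name_lines(infile):
    """Return (line number, name) pairs for every 'Name:' line in infile."""
    hits = []
    # enumerate(..., start=1) yields 1-based line numbers without a manual counter
    for lno, line in enumerate(infile, start=1):
        stripped = line.strip()
        if stripped.startswith('Name:'):
            hits.append((lno, stripped[len('Name:'):].strip()))
    return hits

# io.StringIO stands in for open('input.txt') so the sketch runs on its own
sample = io.StringIO("Name: Apple\nType: Fruits\nDescription:...\nName: Orange\n")
print(find_name_lines(sample))  # [(1, 'Apple'), (4, 'Orange')]
```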
Your example input.txt shows a space at the start of each line. If that is a transcription error, obviously adjust the code accordingly.
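If you are unsure about the leading space, or you do want to restrict the split to an explicit keyword list, one possible variation is sketched below. It assumes every record begins with a Name: line, so a new Name: line ends the previous record even when no blank line separates them, and it leaves the Name: line out of the output to match the expected files in the question. The names first_records and split_file are hypothetical, not part of the code above.

```python
def first_records(lines, keywords):
    """Map each keyword to the lines of its first record (Name: line excluded)."""
    wanted = set(keywords)
    records = {}      # keyword -> list of lines from its first record
    current = None    # list currently being filled, or None while skipping
    for line in lines:
        stripped = line.strip()   # tolerate leading/trailing whitespace
        if stripped.startswith('Name:'):
            name = stripped[len('Name:'):].strip()
            if name in wanted and name not in records:
                current = records[name] = []
            else:
                current = None    # repeated or unwanted record: skip it
        elif current is not None and stripped:
            current.append(stripped + '\n')
    return records

def split_file(path, keywords):
    """Write '<keyword>.txt' for the first record of each keyword found in path."""
    with open(path) as infile:
        records = first_records(infile, keywords)
    for name, body in records.items():
        with open(f'{name}.txt', 'w') as outfile:
            outfile.writelines(body)
```

With the question's data this would be called as split_file('input.txt', ['Apple', 'Orange', 'Yellow']). Collecting the records in memory first keeps the parsing separate from the file writing, which should be fine for a 10,000-line input.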