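I am trying to split a large text file into smaller text files. Ideally, each smaller output file should have 5 lines, but if the 5th line does not contain the "END" keyword, the split should move on to the next line until it finds "END", and then write those lines out as one smaller output file.
Here is the data I am working with: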
CONTINGENCY 'P11:-12.47:DEI:PURDUE CHP GEN'
SET BUS 249831 GENERATION TO 20 MW
END
CONTINGENCY 'P11:-12.47:DEI:PURDUE TG1-2 GENS'
SET BUS 249831 GENERATION TO 15.5 MW
END
CONTINGENCY 'P11:-13.2:DEI:TATE-LYLE BTM GENS'
OPEN BUS 249936
END
CONTINGENCY 'P11:0.342:DEI:08CR_SOL_GEN:1'
REMOVE MACHINE 1 FROM BUS 251904
END
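In this example, line 5 does not contain the "END" keyword, so the split should extend to line 6, which does, and produce a small text file of 6 lines. The next file should start at line 7 and follow the same process.
Currently I am using the following code: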
import glob
import pandas as pd
import math
import os

if __name__ == "__main__":
    file_dir = os.path.dirname(__file__)
    if file_dir != "":
        os.getcwd()
    read_file = glob.glob("*.con")
    with open("combined.con", "wb") as outfile:
        for f in read_file:
            with open(f, "rb") as infile:
                outfile.write(infile.read())
    df0 = pd.read_csv(file_dir + '/combined.con')
    count = len(df0)
    row_range = 5
    block = count // row_range
    for line in df0:
        for i in range(block):
            if not line.startswith("END"):
                start = i * row_range
                stop = (i+1) * row_range
                while True:
                    row_range = row_range + 1
                    df2 = df0.iloc[start:stop]
                    df2.to_csv(f"Contingency_{i}.con", index=False)
                    break
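The code creates smaller text files, but they do **not** end with the "END" keyword as intended.
The expected output is two smaller text files of 6 lines each, containing the following data (the first six lines in the first file, the last six in the second):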
CONTINGENCY 'P11:-12.47:DEI:PURDUE CHP GEN'
SET BUS 249831 GENERATION TO 20 MW
END
CONTINGENCY 'P11:-12.47:DEI:PURDUE TG1-2 GENS'
SET BUS 249831 GENERATION TO 15.5 MW
END
CONTINGENCY 'P11:-13.2:DEI:TATE-LYLE BTM GENS'
OPEN BUS 249936
END
CONTINGENCY 'P11:0.342:DEI:08CR_SOL_GEN:1'
REMOVE MACHINE 1 FROM BUS 251904
END
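I saw this question earlier, but it was closed before I had finished writing my code.
I don't use pandas and I don't calculate blocks. Instead, I read the file line by line, append each line to a separate list, and then check whether the list has 5 or more lines and whether the last line starts with `END`.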
other_list = []
index = 0

with open("combined.con") as infile:
    for line in infile:
        other_list.append(line)
        # write a file once there are at least 5 lines and the current line is END
        if len(other_list) >= 5 and line.startswith('END'):
            with open(f"Contingency_{index}.con", 'w') as outfile:
                for item in other_list:
                    outfile.write(item)
            other_list = []
            index += 1

# make sure no data is left unwritten (trailing lines without a final END)
if len(other_list) > 0:
    with open(f"Contingency_{index}.con", 'w') as outfile:
        for item in other_list:
            outfile.write(item)
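
If you want to check the grouping without touching the disk, the same idea can be sketched as a generator. This is only an illustration: `split_blocks` and `write_blocks` are names I made up, and the defaults (`min_lines=5`, `keyword="END"`, and the `Contingency_{}.con` pattern) are just the values from the question.

```python
from typing import Iterable, Iterator, List


def split_blocks(lines: Iterable[str], min_lines: int = 5,
                 keyword: str = "END") -> Iterator[List[str]]:
    """Yield chunks of at least min_lines lines that end on a keyword line."""
    chunk: List[str] = []
    for line in lines:
        chunk.append(line)
        # close the chunk only when it is long enough AND this line is END
        if len(chunk) >= min_lines and line.startswith(keyword):
            yield chunk
            chunk = []
    # leftover lines that never reached a closing END
    if chunk:
        yield chunk


def write_blocks(source: str, pattern: str = "Contingency_{}.con") -> int:
    """Write every chunk of `source` to its own file; return the file count."""
    written = 0
    with open(source) as infile:
        for chunk in split_blocks(infile):
            with open(pattern.format(written), "w") as outfile:
                outfile.writelines(chunk)
            written += 1
    return written
```

With the 12 sample lines from the question, `split_blocks` should give two chunks of 6 lines each, which matches the expected output. Calling `write_blocks("combined.con")` would then write them as `Contingency_0.con` and `Contingency_1.con`.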