Splitting/slicing a text file with Python

Question (votes: 0, answers: 1)

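I'm learning Python, and I've been trying to split this txt file into multiple files, grouped by a string sliced from the start of each line.

At the moment I have two problems:

  1. The string can be 5 or 6 characters long and is terminated by a space (e.g. WSON33, JHSF3, and so on).

    Here is a sample of the file I want to split (the first line is the header):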

    H24/06/202000003TORDISTD 
    BWSON33      0803805000000000016400000003250C000002980002415324C1 0000000000000000
    BJHSF3       0804608800000000003500000000715V000020280000031810C1 0000000000000000
    
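  2. I've collected a lot of code, but I can't put it all together into something that works:

    The code here is adapted from another post. It does split the input into multiple files, but the lines need to be sorted before the files are written. I also need the header copied into every file rather than isolated in a file of its own.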

    import itertools

    with open('tordist.txt', 'r') as fin:
        # group consecutive lines of the input file by their first whitespace-separated field
        for i, (k, g) in enumerate(itertools.groupby(fin, lambda l: l.split()[0]), 1):
            # create the output file, prefixed with the group number (starting at 1)
            with open('{0} tordist.txt'.format(i), 'w') as fout:
                # write each line of the group to that file
                for line in g:
                    fout.write(line.strip() + '\n')
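
For reference, the two requirements described above (sort before grouping, and repeat the header in every file) could be sketched roughly as follows. This is only an illustration under assumptions: the first line is taken to be the header, the helper names `first_field`, `group_sorted` and `write_groups` are made up for this sketch, and the sample lines are shortened stand-ins for the real data.

```python
import itertools
import os


def first_field(line):
    # the identifier is everything before the first run of whitespace
    return line.split()[0]


def group_sorted(lines):
    """Return {identifier: [header, line, ...]} for a list of lines.

    Assumes lines[0] is the header and every other non-blank line
    starts with an identifier.
    """
    header = lines[0]
    body = [line for line in lines[1:] if line.strip()]
    groups = {}
    # groupby only merges *adjacent* equal keys, so sort by the same key first
    ordered = sorted(body, key=first_field)
    for key, group in itertools.groupby(ordered, key=first_field):
        groups[key] = [header] + list(group)
    return groups


def write_groups(groups, folder="."):
    # one output file per identifier, e.g. "BWSON33 tordist.txt"
    for key, lines in groups.items():
        path = os.path.join(folder, "{0} tordist.txt".format(key))
        with open(path, "w") as fout:
            fout.writelines(lines)


# made-up sample lines, shortened from the format shown in the question
sample = [
    "H24/06/202000003TORDISTD\n",
    "BWSON33      first\n",
    "BJHSF3       second\n",
    "BWSON33      third\n",
]
groups = group_sorted(sample)
print(sorted(groups))
```

Because `sorted` is stable, lines that share an identifier keep their original relative order. Calling `write_groups(groups)` would then create one file per identifier in the current folder.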
    
python
1 Answer
0 votes

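As I understand it, you have a text file with many lines, each beginning with a short string of 5 or 6 characters. It sounds like you want all lines that begin with the same string to go into the same file, so that after the code runs you have as many new files as there are unique starting strings. Is that right?

Like you, I'm fairly new to Python, so I'm sure there are more compact ways to do this. The code below loops through the file several times and creates the new files in the same folder as the text file and the Python script (assuming the script is run from that folder).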

# Code which separates the lines in a file by an identifier
# and makes a new file for each identifier group.

filename = input('type filename')
if len(filename) < 1:
    filename = "mk_newfiles.txt"

with open(filename) as filehandle:
    # The first line is the header; keep it whole so that it can be
    # copied into every new file.
    header = filehandle.readline()

    # This chunk loops through the rest of the file, looking at the beginning
    # of each line, and adds it to a list of identifiers if it is not on the
    # list already.
    unique = []
    for line in filehandle:
        # like Lalit said, split is a simple way to separate a longer string
        fields = line.split()
        # blank lines have no fields and are skipped
        if fields and fields[0] not in unique:
            unique.append(fields[0])

    # For each item in the list of identifiers, this code goes through
    # the file, and if a line starts with that identifier then it is
    # added to a new file.
    for item in unique:
        # .seek(0) 'rewinds' the file to its start, which is needed
        # when looping through a file more than once
        filehandle.seek(0)

        # makes the new file; 'with' closes it once the block ends
        with open(item + ".txt", "w") as newfile:
            # inserts the header line
            newfile.write(header)

            # goes through the old file and adds the relevant lines to the new file;
            # the header never matches because it is not in the identifier list
            for line in filehandle:
                fields = line.split()
                if fields and fields[0] == item:
                    newfile.write(line)

print(unique)
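
As one possible more compact variant, the file can be read once and the lines collected in a dictionary keyed by identifier, instead of rewinding the file for every identifier. This is a minimal sketch, not a drop-in replacement: it assumes the first line is the header, the names `split_by_identifier` and `write_groups` are invented here, and the sample lines are shortened placeholders rather than real data.

```python
import os
from collections import defaultdict


def split_by_identifier(lines):
    """Return (header, {identifier: [lines]}) after a single pass over `lines`.

    `lines` can be any iterable of strings, including an open file.
    """
    lines = iter(lines)
    header = next(lines, "")  # assumed: the first line is the header
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()
        if fields:  # ignore blank lines
            groups[fields[0]].append(line)
    return header, dict(groups)


def write_groups(header, groups, folder="."):
    # one file per identifier, each starting with a copy of the header
    for identifier, lines in groups.items():
        path = os.path.join(folder, identifier + ".txt")
        with open(path, "w") as newfile:
            newfile.write(header)
            newfile.writelines(lines)


# made-up sample lines, shortened from the format shown in the question
sample = [
    "H24/06/202000003TORDISTD\n",
    "BWSON33      first\n",
    "BJHSF3       second\n",
    "BWSON33      third\n",
]
header, groups = split_by_identifier(sample)
print(sorted(groups))
```

With a real file, the same function could be called as `split_by_identifier(open_file)` inside a `with open(...)` block, followed by `write_groups(header, groups)`. The trade-off is that all lines are held in memory at once, which is fine for small files but may not suit very large ones.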