Open a file in Python 3, reformat it, and write it to a new file

Question (votes: 1, answers: 2)

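I'm new to Python (a few weeks in). I'm taking the Python for Everybody course on Coursera and decided to extend some of its ideas into an application I want to write.

I want to take a txt file of quotes, remove some unnecessary characters and line breaks, and write the newly formatted strings to a new file. That file will then be used to display a random quote in the terminal (that part isn't needed here).

The entries in the txt file look like this: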

“The road to hell is paved with works-in-progress.”
—Philip Roth, WD some other stuff here
“Some other quote.”
—Another Author, Blah blah

I want to write the following to the new file:

"The road to hell is paved with works-in-progress." —Philip Roth
"Some other quote." —Another Author

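I want to remove the line break between the quote and the author and replace it with a space. I also want to remove everything from the comma onward after the author's name (so that each entry is just: quote[space]author). The file has 73 of these entries, so I want to run through the file making these changes and then write the newly formatted quotes to a new file. The final output would simply be: “blah blah blah” - Author

I've tried various approaches. At the moment I loop over the file and write the two segments into two lists, which I then want to join. But I'm stuck: now that I have the two lists I can't seem to join them, and I'm not sure whether this approach is overkill or even the right way to do it. Any help would be greatly appreciated. Any ideas?

Code so far: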

fh = open('quotes_source.txt')


quote = list()
author = list()

for line in fh:

    # Find quote segment and assign to a string variable
    if line.startswith('“'):
        phrase_end = line.find('”')+1
        phrase_start = line.find('“')
        phrase = line[phrase_start:phrase_end]
        quote.append(phrase)

    # Find author segment and assign to a string variable
    if line.startswith('—'):
        name_end = line.find(',')
        name = line[:name_end]
        author.append(name)

print(quote)
print(author)
Tags: python, string, list, file, string-concatenation

2 Answers

Answer 1 (votes: 1)

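You don't need regular expressions for a simple task like this. You're actually on the right track, but you got tangled up trying to parse everything instead of just streaming the file and deciding where to cut.

Based on your data, you want to cut on lines that begin with “—” (which denotes the author), and you want to cut such a line from its first comma onward. Presumably you also want to remove blank lines. So a simple stream modifier would look like: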

# open quotes_source.txt for reading and quotes_processed.txt for writing
with open("quotes_source.txt", "r", encoding="utf-8") as f_in,\
        open("quotes_processed.txt", "w", encoding="utf-8") as f_out:
    for line in f_in:  # read the input file line by line
        line = line.strip()  # clear out all whitespace, including the new line
        if not line:  # ignore blank lines
            continue
        if line[0] == "—":  # we found the dash!
            # write space, everything up to the first comma and a new line in the end
            f_out.write(" " + line.split(",", 1)[0] + "\n")
        else:
            f_out.write(line)  # a quote line, write it immediately

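That's all there is to it. As long as there are no other line breaks in the data, it will produce the result you want, i.e. for a quotes_source.txt file containing: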

“The road to hell is paved with works-in-progress.”
—Philip Roth, WD some other stuff here

“The only thing necessary for the triumph of evil is for good men to do nothing.”
—Edmund Burke, whatever there is

“You know nothing John Snow.”
—The wildling Ygritte, "A Dance With Dragons" - George R.R. Martin

it will generate a quotes_processed.txt file containing:

“The road to hell is paved with works-in-progress.” —Philip Roth
“The only thing necessary for the triumph of evil is for good men to do nothing.” —Edmund Burke
“You know nothing John Snow.” —The wildling Ygritte
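
The question also asks how to join the two lists it already builds. A minimal sketch of that route, assuming every quote line is immediately followed by its author line (so the lists stay aligned), pairs them with the built-in zip. The function names split_quotes and join_quotes are illustrative and do not come from the question:

```python
def split_quotes(lines):
    """Collect quote lines and author names into two parallel lists."""
    quotes, authors = [], []
    for line in lines:
        line = line.strip()  # drop the trailing newline and stray spaces
        if line.startswith("“"):
            quotes.append(line)
        elif line.startswith("—"):
            # keep only the text before the first comma
            authors.append(line.split(",", 1)[0])
    return quotes, authors


def join_quotes(quotes, authors):
    """Pair each quote with its author as 'quote author'."""
    return [f"{q} {a}" for q, a in zip(quotes, authors)]


# Sample input taken from the question; a real run would iterate over the open file.
sample = [
    "“The road to hell is paved with works-in-progress.”\n",
    "—Philip Roth, WD some other stuff here\n",
    "“Some other quote.”\n",
    "—Another Author, Blah blah\n",
]
quotes, authors = split_quotes(sample)
for entry in join_quotes(quotes, authors):
    print(entry)
```

Note that zip stops at the shorter list and cannot detect a quote whose author line is missing, which is one reason the streaming version above is the more robust choice.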

Answer 2 (votes: 1)
quote_line="“The road to hell is paved with works-in-progress.”\n—Philip Roth, WD some other stuff here\n"
quote_line=quote_line.strip().replace("\n"," ")  # join quote and author with a space, as the question asks
quote_line=quote_line.split(",")

formatted_quote=""

If you're not sure that the line contains only one comma:

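  • “Tit for tat.”\n—Someone Roth, blah blah   # only one comma
  • “Tit for tat, tit for tat.”\n—Someone Roth, blah blah   # more than one comma

# keep every comma-separated part except the last, putting back the commas that split(",") removed
# (this assumes the text after the author's name contains no further commas)
formatted_quote=",".join(quote_line[:-1])
formatted_quote+="\n"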

Or, when the line is known to contain exactly one comma:

formatted_quote=quote_line[0]+"\n"
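
Both variants above split the whole entry on commas, so they depend on how many commas the quote and the trailing text contain. A minimal sketch that avoids this, assuming each entry has exactly one line break between the quote and the author line, separates the two at that line break first and only then cuts the author line at its first comma. The name format_entry is illustrative, not part of the answer:

```python
def format_entry(raw):
    """Reformat one two-line entry into a single 'quote author' line."""
    # Separate quote and author at the first line break.
    quote, _, author_line = raw.strip().partition("\n")
    # Cut only the author line at its first comma; commas in the quote are untouched.
    author = author_line.split(",", 1)[0].strip()
    return f"{quote.strip()} {author}"


print(format_entry("“Tit for tat, tit for tat.”\n—Someone Roth, blah, blah\n"))
# prints: “Tit for tat, tit for tat.” —Someone Roth
```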