How can I read a complex txt file containing blocks of data in Python and save it as a csv file?

Question · Votes: 0 · Answers: 2

Suppose I have a file organized like this:

++++++++++++++
Country 1

**this sentence is not important.
**date 25.09.2017, also not important
*******
Address
**Office

        Address A, 100 City. Country X
**work time 09h00-16h00<br>9h00-14h00
**www.example.com
**[email protected];
**012/345 67 89
**téléfax 123/456 67 89
*******
Address
**Home Office

        Address A, 200 City. Country X
**[email protected];
**001/000 00 00
**téléfax 111/111 11 11
*******
Address
**Living address

        Address 0, 123 City
**[email protected]
**000/000 00 00
**téléfax 222/222 22 22
++++++++++++++
Country 2

**this sentence is not important.
**date 25.09.2017, also not important
*******
Address
**Office

        AAA 11, 30 City 

        BBB 22, 30 City
**work time 08h00-12h30  
**www.example.com
**[email protected]
**000/000 00 00
**téléfax 111/11 11 11
*******

ETC

I want to put the data into a csv file with these columns:

Country (Line right after ++++++++++++++), Address (Line right after *******), Office (after **), WorkTime (after **), Website (after **), Email (after **), Phone (after **), Fax (after **)

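How can I do this in Python? The problem is that some entries are missing data, so I know some rows in the csv file will end up misaligned, but I don't mind cleaning up the data by hand afterwards. Another problem is that the country names vary, so I need to use ++++++++++++++ as the delimiter.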

I tried something like this:

import csv
with open('listofdata.txt', 'r') as FILE:
   DATA = FILE.read()

LIST = DATA.split('++++++++++++++')

LIST2 = []
LIST3 = []
LIST4 = []

for ITEMS in LIST:
    LIST2 = ITEMS.split('*******')    
    for items2 in LIST2:
        LIST3 = items2.split('**')
        LIST4.append(LIST3)


with open('file.csv', 'w') as CSV:
    for ITEMS in LIST4:
        csv.write(ITEMS)

But it doesn't work.

Error:

Traceback (most recent call last):
  File "test.py", line 22, in <module>
    csv.write(ITEMS)
AttributeError: 'module' object has no attribute 'write'

python csv
2 Answers
1
vote

In the last line you call write on the csv module ("csv") instead of on the file object ("CSV"), which is why you get the error.
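To make the name mix-up concrete, here is a minimal sketch. It uses an in-memory `io.StringIO` buffer as a stand-in for the real output file (an assumption made only so the snippet runs without touching disk). It also shows that calling `write` on the file object with a list would fail too, which is why a `csv.writer` is the usual route.

```python
import csv
import io

# The csv module itself has no write() function, hence the AttributeError.
print(hasattr(csv, "write"))

# Stand-in for open('file.csv', 'w'); an assumption for this sketch.
buffer = io.StringIO()

# Even on the file object, write() expects a string, not a list.
try:
    buffer.write(["Country 1", "Office"])
except TypeError as exc:
    print("TypeError:", exc)

# csv.writer wraps the file object and turns a list into one csv row.
writer = csv.writer(buffer)
writer.writerow(["Country 1", "Office"])
print(repr(buffer.getvalue()))
```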

I added the steps for using the csv module to your code.

All you have to do now is work on your parsing method.

Code:

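import csv

# Read the whole file into memory.
with open('listofdata.txt', 'r') as FILE:
    DATA = FILE.read()

# Split into one chunk per country.
LIST = DATA.split('++++++++++++++')

LIST4 = []  # one list of '**' fields per '*******' block

for ITEMS in LIST:
    # Split each country chunk into address blocks.
    LIST2 = ITEMS.split('*******')
    for items2 in LIST2:
        # Split each block into its '**' fields.
        LIST3 = items2.split('**')
        LIST4.append(LIST3)

# On Python 3, pass newline='' to open() so the csv module controls line endings.
with open('file.csv', 'w') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    for ITEMS in LIST4:
        spamwriter.writerow(ITEMS)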

Output:

""

"
Country 1

","this sentence is not important.
","date 25.09.2017, also not important
"

"
Address
","Office

        Address A, 100 City. Country X
","work time 09h00-16h00<br>9h00-14h00
","www.example.com
","[email protected];
","012/345 67 89
","téléfax 123/456 67 89
"

"
Address
","Home Office

        Address A, 200 City. Country X
","[email protected];
","001/000 00 00
","téléfax 111/111 11 11
"

"
Address
","Living address

        Address 0, 123 City
","[email protected]
","000/000 00 00
","téléfax 222/222 22 22
"

"
Country 2

","this sentence is not important.
","date 25.09.2017, also not important
"

"
Address
","Office

        AAA 11, 30 City

        BBB 22, 30 City
","work time 08h00-12h30
","www.example.com
","[email protected]
","000/000 00 00
","téléfax 111/11 11 11
"

"
"
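The output above still has the raw fields in whatever order they appear, with embedded newlines. One possible way to refine the parsing step is sketched below. It is not the answerer's method: the `classify` heuristics (prefix `work time`, prefix `www.`, an `@` sign, prefix `téléfax`, digits and slashes for phone numbers), the `parse` helper, and the sample e-mail addresses are all assumptions chosen for illustration, and the first `**` field of each block is assumed to hold the office name followed by the indented address lines.

```python
import csv
import io
import re

COUNTRY_SEP = "++++++++++++++"
BLOCK_SEP = "*******"
COLUMNS = ["Country", "Address", "Office", "WorkTime",
           "Website", "Email", "Phone", "Fax"]

# Made-up sample in the same layout as the question's file.
SAMPLE = """\
++++++++++++++
Country 1

**this sentence is not important.
**date 25.09.2017, also not important
*******
Address
**Office

        Address A, 100 City. Country X
**work time 09h00-16h00
**www.example.com
**office@example.com;
**012/345 67 89
**téléfax 123/456 67 89
*******
Address
**Home Office

        Address A, 200 City. Country X
**home@example.com;
**001/000 00 00
**téléfax 111/111 11 11
++++++++++++++
Country 2

**this sentence is not important.
*******
Address
**Office

        AAA 11, 30 City

        BBB 22, 30 City
**work time 08h00-12h30
*******
"""


def classify(field):
    """Guess the column for one '**' field; returns None if nothing matches."""
    low = field.lower()
    if low.startswith("work time"):
        return "WorkTime"
    if low.startswith("www."):
        return "Website"
    if "@" in field:
        return "Email"
    if low.startswith("téléfax"):
        return "Fax"
    if re.fullmatch(r"[\d/ ]+", field):
        return "Phone"
    return None


def parse(text):
    """Return one dict per '*******' block that contains '**' fields."""
    rows = []
    for chunk in text.split(COUNTRY_SEP):
        blocks = chunk.split(BLOCK_SEP)
        header = blocks[0].strip().splitlines()
        if not header:
            continue  # empty text before the first country separator
        country = header[0].strip()
        for block in blocks[1:]:
            fields = [f.strip() for f in block.split("**")[1:]]
            if not fields:
                continue  # separator with no '**' fields after it
            row = dict.fromkeys(COLUMNS, "")
            row["Country"] = country
            # First field: office name, then the indented address lines.
            first = [ln.strip() for ln in fields[0].splitlines() if ln.strip()]
            row["Office"] = first[0] if first else ""
            row["Address"] = "; ".join(first[1:])
            for field in fields[1:]:
                key = classify(field)
                if key:
                    row[key] = field.rstrip(";")
            rows.append(row)
    return rows


rows = parse(SAMPLE)

# Write to an in-memory buffer here; a real file would be opened
# with open('file.csv', 'w', newline='') on Python 3.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(COLUMNS)
for row in rows:
    writer.writerow([row[col] for col in COLUMNS])
print(buffer.getvalue())
```

Blocks with missing fields simply leave those columns empty, so the rows stay aligned. Fields that match none of the heuristics are dropped, which may or may not be what you want for the real file.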

0
votes

Use csv.writer when saving to the csv file. But first you have to write a parser for the structure of the listofdata.txt file; then you can save the data to a csv file.

Alternatively, you can use csv.DictWriter, but either way you have to write the parser first.
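As a rough illustration of the csv.DictWriter route, the sketch below assumes the parser has already produced one dict per address block. The `records` list is hand-written sample data, and `write_records` is a hypothetical helper name. Keys missing from a record are filled with `restval`, which suits entries where some fields are absent.

```python
import csv
import io

FIELDNAMES = ["Country", "Address", "Office", "WorkTime",
              "Website", "Email", "Phone", "Fax"]

# Hand-written stand-ins for what a parser might return.
records = [
    {"Country": "Country 1", "Address": "Address A, 100 City. Country X",
     "Office": "Office", "WorkTime": "09h00-16h00",
     "Website": "www.example.com", "Phone": "012/345 67 89"},
    # This entry lacks most fields; DictWriter fills the gaps with restval.
    {"Country": "Country 1", "Office": "Home Office",
     "Phone": "001/000 00 00"},
]


def write_records(records, fileobj):
    """Write dict records as csv rows, leaving missing columns empty."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES, restval="")
    writer.writeheader()
    writer.writerows(records)


# In-memory buffer for the demo; use open('file.csv', 'w', newline='')
# on Python 3 to write a real file.
buffer = io.StringIO()
write_records(records, buffer)
print(buffer.getvalue())
```

Because each value is looked up by column name, a record with missing fields cannot shift the remaining values into the wrong columns.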
