How to read complex txt file with blocks of data and save it as csv file in python?

If I have a file organized like this:

++++++++++++++
Country 1

**this sentence is not important.
**date 25.09.2017, also not important
*******
Address
**Office

        Address A, 100 City. Country X
**work time 09h00-16h00<br>9h00-14h00
**www.example.com
**emal@example.com;
**012/345 67 89
**téléfax 123/456 67 89
*******
Address
**Home Office

        Address A, 200 City. Country X
**email2@example.com;
**001/000 00 00
**téléfax 111/111 11 11
*******
Address
**Living address

        Address 0, 123 City
**info@example.ch
**000/000 00 00
**téléfax 222/222 22 22
++++++++++++++
Country 2

**this sentence is not important.
**date 25.09.2017, also not important
*******
Address
**Office

        AAA 11, 30 City 

        BBB 22, 30 City
**work time 08h00-12h30  
**www.example.com
**info@example.com
**000/000 00 00
**téléfax 111/11 11 11
*******

ETC

I want to put the data in a CSV file with these columns:

Country (Line right after ++++++++++++++), Address (Line right after *******), Office (after **), WorkTime (after **), Website (after **), Email (after **), Phone (after **), Fax (after **)

How do I do it in Python? The problem is that some entries have missing data, so I know some rows in the CSV file will end up misaligned, but I don't mind doing some manual work to tidy up the data afterwards. Another problem is that country names vary, so I would need to use ++++++++++++++ as the separator.

I tried something like this:

import csv
with open('listofdata.txt', 'r') as FILE:
   DATA = FILE.read()

LIST = DATA.split('++++++++++++++')

LIST2 = []
LIST3 = []
LIST4 = []

for ITEMS in LIST:
    LIST2 = ITEMS.split('*******')    
    for items2 in LIST2:
        LIST3 = items2.split('**')
        LIST4.append(LIST3)


with open('file.csv', 'w') as CSV:
    for ITEMS in LIST4:
        csv.write(ITEMS)

But it doesn't work.

ERROR:

Traceback (most recent call last):
  File "test.py", line 22, in <module>
    csv.write(ITEMS)
AttributeError: 'module' object has no attribute 'write'

In the very last line you wrote "csv" (the module) instead of "CSV" (your file object); that is what caused the error, because the csv module has no write function.
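To make the difference concrete, here is a small check, assuming Python 3. The in-memory io.StringIO object is only a stand-in for the file opened in the question; it is not part of the original code.

```python
import csv
import io

# The csv module provides writer objects, but no write() function.
print(hasattr(csv, "write"))   # False
print(hasattr(csv, "writer"))  # True

# Stand-in for the file object that open('file.csv', 'w') returns.
CSV = io.StringIO()
print(hasattr(CSV, "write"))   # True
```

Python names are case-sensitive, so csv and CSV are two unrelated objects.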

I changed your code so that it writes the rows with the csv module.

All you have to do now is work on your parsing method.

Code:

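import csv

# Read the whole file into one string.
with open('listofdata.txt', 'r') as FILE:
    DATA = FILE.read()

# Each country section starts with a line of 14 plus signs.
LIST = DATA.split('++++++++++++++')

LIST4 = []  # one entry per block; each entry becomes one CSV row

for ITEMS in LIST:
    # Blocks inside a country section are separated by 7 asterisks.
    LIST2 = ITEMS.split('*******')
    for items2 in LIST2:
        # Fields inside a block are introduced by '**'.
        LIST3 = items2.split('**')
        LIST4.append(LIST3)

# csv.writer wraps the file object and takes care of quoting each row.
with open('file.csv', 'w') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    for ITEMS in LIST4:
        spamwriter.writerow(ITEMS)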

Output:

""

"
Country 1

","this sentence is not important.
","date 25.09.2017, also not important
"

"
Address
","Office

        Address A, 100 City. Country X
","work time 09h00-16h00<br>9h00-14h00
","www.example.com
","emal@example.com;
","012/345 67 89
","téléfax 123/456 67 89
"

"
Address
","Home Office

        Address A, 200 City. Country X
","email2@example.com;
","001/000 00 00
","téléfax 111/111 11 11
"

"
Address
","Living address

        Address 0, 123 City
","info@example.ch
","000/000 00 00
","téléfax 222/222 22 22
"

"
Country 2

","this sentence is not important.
","date 25.09.2017, also not important
"

"
Address
","Office

        AAA 11, 30 City

        BBB 22, 30 City
","work time 08h00-12h30
","www.example.com
","info@example.com
","000/000 00 00
","téléfax 111/11 11 11
"

"
"
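The line breaks inside the quoted fields above come from the raw pieces that split() returns. One possible cleanup step, assuming it is acceptable to collapse all whitespace inside a field into single spaces, is sketched below. The helper name clean_rows and the sample rows are made up for illustration.

```python
def clean_rows(rows):
    """Yield rows with whitespace collapsed, skipping rows that are entirely empty."""
    for row in rows:
        # str.split() with no argument splits on any whitespace, including newlines.
        cleaned = [" ".join(field.split()) for field in row]
        if any(cleaned):
            yield cleaned


# Illustrative rows shaped like the entries of LIST4.
raw = [
    [""],
    ["\nCountry 1\n\n", "this sentence is not important.\n"],
    ["\nAddress\n", "Office\n\n        Address A, 100 City. Country X\n", "www.example.com\n"],
    ["\n"],
]

for row in clean_rows(raw):
    print(row)
```

In the code above this would mean looping over clean_rows(LIST4) instead of LIST4 when calling writerow. Note that the office name and the address still share one field, because both sit between the same pair of ** markers.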

When you save to a CSV file, use csv.writer. But first you must write a parser for the structure of your listofdata.txt file; then you can save the data to a CSV file.

Alternatively, you can use csv.DictWriter, but you still have to write the parser first.
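A minimal sketch of such a parser with csv.DictWriter, assuming Python 3. It also assumes that the Address column should hold the indented address lines and that each ** line can be recognized by simple keywords. The names parse, classify, write_csv and FIELDS, the keyword rules, and the shortened sample text are all illustrative and not taken from the question.

```python
import csv
import io

COUNTRY_SEP = "+" * 14  # '++++++++++++++'
BLOCK_SEP = "*" * 7     # '*******'
FIELDS = ["Country", "Address", "Office", "WorkTime",
          "Website", "Email", "Phone", "Fax"]


def classify(value):
    """Guess the column for a '**' line (keyword rules are assumptions)."""
    lower = value.lower()
    if lower.startswith("work time"):
        return "WorkTime"
    if lower.startswith(("www.", "http")):
        return "Website"
    if "@" in value:
        return "Email"
    if "fax" in lower:
        return "Fax"
    if value[:1].isdigit() or value.startswith("+"):
        return "Phone"
    return "Office"


def parse(text):
    """Return one dict per address block, with missing fields left empty."""
    rows = []
    for chunk in text.split(COUNTRY_SEP):
        blocks = chunk.split(BLOCK_SEP)
        header = [ln.strip() for ln in blocks[0].splitlines() if ln.strip()]
        if not header:
            continue  # nothing before the first separator
        country = header[0]  # the line right after ++++++++++++++
        for block in blocks[1:]:
            lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
            if not lines:
                continue
            row = dict.fromkeys(FIELDS, "")
            row["Country"] = country
            address = []
            for line in lines:
                if line.startswith("**"):
                    value = line[2:].strip().rstrip(";")
                    row[classify(value)] = value
                elif line != "Address":  # skip the literal block heading
                    address.append(line)
            row["Address"] = "; ".join(address)
            rows.append(row)
    return rows


def write_csv(rows, fileobj):
    """Write the parsed rows with a header line."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


SAMPLE = """++++++++++++++
Country 1

**this sentence is not important.
*******
Address
**Office

        Address A, 100 City. Country X
**work time 09h00-16h00
**www.example.com
**info@example.com;
**012/345 67 89
**téléfax 123/456 67 89
*******
Address
**Home Office

        Address A, 200 City. Country X
**email2@example.com;
**001/000 00 00
*******
"""

rows = parse(SAMPLE)
buffer = io.StringIO()  # in-memory stand-in for a real file
write_csv(rows, buffer)
print(buffer.getvalue())
```

For a real file, the same write_csv call would take the object returned by open('file.csv', 'w', newline='', encoding='utf-8'). Because every row starts with all columns empty, a block with a missing fax or website simply leaves that cell blank instead of shifting the other values.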

