How to read a complex txt file with blocks of data and save it as a CSV file in Python?
If I have a file organized like this:
++++++++++++++
Country 1
**this sentence is not important.
**date 25.09.2017, also not important
*******
Address
**Office
Address A, 100 City. Country X
**work time 09h00-16h00<br>9h00-14h00
**www.example.com
**emal@example.com;
**012/345 67 89
**téléfax 123/456 67 89
*******
Address
**Home Office
Address A, 200 City. Country X
**email2@example.com;
**001/000 00 00
**téléfax 111/111 11 11
*******
Address
**Living address
Address 0, 123 City
**info@example.ch
**000/000 00 00
**téléfax 222/222 22 22
++++++++++++++
Country 2
**this sentence is not important.
**date 25.09.2017, also not important
*******
Address
**Office
AAA 11, 30 City
BBB 22, 30 City
**work time 08h00-12h30
**www.example.com
**info@example.com
**000/000 00 00
**téléfax 111/11 11 11
*******
ETC
And I want to put the data in a CSV file with these columns:
Country (Line right after ++++++++++++++), Address (Line right after *******), Office (after **), WorkTime (after **), Website (after **), Email (after **), Phone (after **), Fax (after **)
How do I do it in Python? The problem is that some lists have missing data, so I know some rows in the CSV file will end up messed up, but I don't mind doing some manual work to tweak the database afterwards.
Another problem is that country names vary, so I would need to use ++++++++++++++ as the separator.
I tried something like this:
import csv

with open('listofdata.txt', 'r') as FILE:
    DATA = FILE.read()

LIST = DATA.split('++++++++++++++')
LIST2 = []
LIST3 = []
LIST4 = []

for ITEMS in LIST:
    LIST2 = ITEMS.split('*******')
    for items2 in LIST2:
        LIST3 = items2.split('**')
        LIST4.append(LIST3)

with open('file.csv', 'w') as CSV:
    for ITEMS in LIST4:
        csv.write(ITEMS)
But it doesn't work.
ERROR:
Traceback (most recent call last):
  File "test.py", line 22, in <module>
    csv.write(ITEMS)
AttributeError: 'module' object has no attribute 'write'
In the very last line you wrote "csv" (the module) instead of "CSV" (your file object); that is why the error occurred.
I added the steps for using the csv module to your code.
All you have to do now is work on your parsing method.
Code:
import csv

with open('listofdata.txt', 'r') as FILE:
    DATA = FILE.read()

# Split into country chunks, then address blocks, then '**' fields.
LIST = DATA.split('++++++++++++++')
LIST2 = []
LIST3 = []
LIST4 = []

for ITEMS in LIST:
    LIST2 = ITEMS.split('*******')
    for items2 in LIST2:
        LIST3 = items2.split('**')
        LIST4.append(LIST3)

# Write one CSV row per block through a csv.writer, not the csv module itself.
with open('file.csv', 'w') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    for ITEMS in LIST4:
        spamwriter.writerow(ITEMS)
Output:
""
"
Country 1
","this sentence is not important.
","date 25.09.2017, also not important
"
"
Address
","Office
Address A, 100 City. Country X
","work time 09h00-16h00<br>9h00-14h00
","www.example.com
","emal@example.com;
","012/345 67 89
","téléfax 123/456 67 89
"
"
Address
","Home Office
Address A, 200 City. Country X
","email2@example.com;
","001/000 00 00
","téléfax 111/111 11 11
"
"
Address
","Living address
Address 0, 123 City
","info@example.ch
","000/000 00 00
","téléfax 222/222 22 22
"
"
Country 2
","this sentence is not important.
","date 25.09.2017, also not important
"
"
Address
","Office
AAA 11, 30 City
BBB 22, 30 City
","work time 08h00-12h30
","www.example.com
","info@example.com
","000/000 00 00
","téléfax 111/11 11 11
"
"
"
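The fields in the output above still carry the newlines from the source file, which is why each row spreads over several lines. A minimal sketch of one way to tidy that, assuming Python 3: strip every field, join multi-line fields with " | " (an arbitrary separator chosen here), and drop blocks that are empty. The name split_blocks and the shortened SAMPLE text are hypothetical, used only for illustration.

```python
import csv
import io

# Shortened, hypothetical sample in the same layout as listofdata.txt.
SAMPLE = """++++++++++++++
Country 1
**this sentence is not important.
*******
Address
**Office
Address A, 100 City. Country X
**www.example.com
*******
"""


def split_blocks(text):
    """Split like the code above, but return tidy single-line fields."""
    rows = []
    for country_chunk in text.split('++++++++++++++'):
        for block in country_chunk.split('*******'):
            # Strip each '**' field; join the lines of a multi-line field.
            fields = [' | '.join(line.strip() for line in f.strip().splitlines())
                      for f in block.split('**')]
            # Skip blocks that contain nothing but whitespace.
            if any(fields):
                rows.append(fields)
    return rows


rows = split_blocks(SAMPLE)

# Write to an in-memory buffer here; a real file opened with
# open('file.csv', 'w', newline='') would work the same way.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Each block now becomes exactly one CSV line, but the fields are still positional, so a block with missing data will have its values shifted into the wrong columns.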
When you save to a CSV file, use csv.writer. But first you must write a parser for the structure of your listofdata.txt file; only then can you save the data to a CSV file.
Alternatively, you can use csv.DictWriter, but you must write the parser first anyway.
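As a rough illustration of what such a parser could look like, here is a sketch combined with csv.DictWriter, assuming Python 3. The column guesses in classify are heuristics inferred from the sample data in the question ("work time" prefix, "www." prefix, "@" for email, "téléfax" prefix, anything else treated as a phone number), not rules given by the asker. The sketch also assumes the first ** field of a block holds the office label on its first line and the address on the following lines, which differs slightly from the column layout described in the question. The names classify, parse, COLUMNS, and SAMPLE are hypothetical.

```python
import csv
import io

COLUMNS = ['Country', 'Office', 'Address', 'WorkTime',
           'Website', 'Email', 'Phone', 'Fax']

# Shortened, hypothetical sample in the same layout as listofdata.txt.
SAMPLE = """++++++++++++++
Country 1
**this sentence is not important.
*******
Address
**Office
Address A, 100 City. Country X
**work time 09h00-16h00
**www.example.com
**emal@example.com;
**012/345 67 89
**téléfax 123/456 67 89
*******
Address
**Home Office
Address A, 200 City. Country X
**email2@example.com;
**001/000 00 00
*******
"""


def classify(field):
    """Guess the column for one '**' field (heuristics from the sample)."""
    lowered = field.lower()
    if lowered.startswith('work time'):
        return 'WorkTime', field[len('work time'):].strip()
    if lowered.startswith('www.'):
        return 'Website', field
    if '@' in field:
        return 'Email', field.rstrip(';')
    if lowered.startswith('téléfax'):
        return 'Fax', field[len('téléfax'):].strip()
    return 'Phone', field


def parse(text):
    """Return one dict per address block, keyed by column name."""
    rows = []
    for chunk in text.split('++++++++++++++'):
        blocks = chunk.split('*******')
        # The country name is the first line of the chunk's first block.
        header = blocks[0].strip().splitlines()
        if not header:
            continue
        country = header[0].strip()
        for block in blocks[1:]:
            # Drop the text before the first '**' (the literal "Address").
            fields = [f.strip() for f in block.split('**')[1:] if f.strip()]
            if not fields:
                continue
            # First field: office label, then one or more address lines.
            office_lines = fields[0].splitlines()
            row = {'Country': country,
                   'Office': office_lines[0].strip(),
                   'Address': '; '.join(l.strip() for l in office_lines[1:])}
            for field in fields[1:]:
                column, value = classify(field)
                row[column] = value
            rows.append(row)
    return rows


rows = parse(SAMPLE)

# restval='' leaves a column empty when a block has no value for it.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval='')
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Because each value is stored under a column name rather than a position, a block with missing data (such as the second one in the sample, which has no work time, website, or fax) simply gets empty cells instead of shifted ones. For a real run, the buffer could be replaced by a file opened with open('file.csv', 'w', newline='').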