
Converting a text file to CSV using Python


How can I convert the following text file into a CSV file with the columns User, Name, Company, Session, Description, Start and End?

Data:05-20-21[15:55, Eur] User History by user page:1

User| Name | Comapny | Session| Description | Start | End | | | | | Date | Time| Date|Time|

Acanter|ANdy Canter| 135 |Ott | ttstpdeman | 05-19-21|07:48|05-19-21|08:13 | | 135 | ttspt|Thin client| 05-19-21|07:48|05-19-21 |08:13

Date: 06-20-21[15:55, Eur] User History by user page: 2

I assume you read the file line by line using a for loop. I would try to parse and convert it in three steps:

Step 1: header (from the start to the first line consisting of only "-")

The header has its own layout, so handle it separately. You can use a counter to track the current line number and extract the relevant fields using split() and strip().
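As a rough sketch of Step 1, the header line from the question can be taken apart with nothing but split() and strip(). The function name parse_header and the field names (in particular "zone" for the "Eur" part) are assumptions made for illustration; the real file may need different delimiters.

```python
def parse_header(line):
    """Split a report header such as
    'Date: 06-20-21[15:55, Eur] User History by user page: 2'
    into its parts. Field names are hypothetical."""
    _, rest = line.split(":", 1)        # drop the leading "Date" label
    date, rest = rest.split("[", 1)     # the date ends at "["
    stamp, rest = rest.split("]", 1)    # "15:55, Eur"
    time, zone = stamp.split(",", 1)
    title, page = rest.rsplit("page:", 1)
    return {
        "date": date.strip(),
        "time": time.strip(),
        "zone": zone.strip(),           # assumption: "Eur" is a region or zone tag
        "title": title.strip(),
        "page": int(page),
    }


header = parse_header("Date: 06-20-21[15:55, Eur] User History by user page: 2")
```

Using maxsplit=1 (and rsplit for the page number) keeps the colon inside the time "15:55" from being mistaken for a field delimiter.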

Step 2: column names (from the first line consisting of only "-" to the first line consisting of only "-" and "+")

Read each line and, if it does not contain a "-", use split("|") to split it into cells and strip() to remove the surrounding spaces. Then join the cells with the separator of your choice and write them to your file.
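A minimal sketch of Step 2 might look like the following. The sample lines are invented, since the exact layout of the rule lines is not shown in the question, and is_rule and split_cells are hypothetical helper names. Instead of skipping any line that contains a "-", this version skips only lines made up entirely of "-" and "+", which is a slightly stricter test.

```python
import csv
import io


def is_rule(line):
    """True for separator lines made only of '-' and '+' characters."""
    stripped = line.strip()
    return bool(stripped) and set(stripped) <= set("-+")


def split_cells(line):
    """Split on '|' and strip the spaces around every cell."""
    cells = [cell.strip() for cell in line.split("|")]
    if cells and cells[-1] == "":
        cells.pop()  # a closing '|' leaves one empty cell at the end
    return cells


# Hypothetical excerpt; the real file may differ.
sample = [
    "------------------------------------------------------------",
    "User| Name | Company | Session| Description | Start | End |",
    "-----+------+---------+--------+-------------+-------+-----",
]

out = io.StringIO()  # stands in for the output file
writer = csv.writer(out)
for line in sample:
    if not is_rule(line):
        writer.writerow(split_cells(line))
header_csv = out.getvalue()
```

Writing through csv.writer rather than joining by hand means a cell that happens to contain the separator is quoted automatically.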

The only complication is the merged cells (Start and End, each of which spans a Date and a Time column). If the column names are always the same, you can hard-code them as in Step 1. Otherwise, go by the start location of each cell and read this part twice: once to record the start locations (every occurrence of "|"), and once to extract and convert the data. On the second pass, iterate over the start locations for each line and check whether the line has a "|" at that location. If it does not, ignore the cell; if it does, take the substring that begins just after the "|" and split it to get the cell's content ( line[start:].split("|", 1)[0] ).
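The start-location idea could be sketched roughly as below. It assumes the header is printed as two aligned rows, with Start and End each spanning a Date and a Time column on the row beneath; that layout is a guess based on the flattened sample in the question, and all function names are hypothetical.

```python
def cell_starts(line):
    """Columns where a cell can begin: column 0 and right after each '|'."""
    starts = [0] + [i + 1 for i, ch in enumerate(line) if ch == "|"]
    return [s for s in starts if s < len(line)]


def cells_at(line, starts):
    """Map each start column to its cell text, skipping start columns
    that fall inside a wider (merged) cell on this line."""
    cells = {}
    for start in starts:
        if start >= len(line):
            continue
        if start == 0 or line[start - 1] == "|":
            cells[start] = line[start:].split("|", 1)[0].strip()
    return cells


def merged_names(top, sub):
    """Combine a two-row header into one list of column names."""
    starts = cell_starts(sub)  # the finer-grained row defines the grid
    top_cells = cells_at(top, starts)
    sub_cells = cells_at(sub, starts)
    names, group = [], ""
    for start in starts:
        group = top_cells.get(start, group)  # carry "Start"/"End" forward
        parts = (group, sub_cells.get(start, ""))
        names.append(" ".join(p for p in parts if p))
    return names


# Hypothetical aligned header rows; the widths are spelled out on purpose.
top = "User|Name |" + "Start".ljust(11) + "|" + "End".ljust(11) + "|"
sub = " " * 4 + "|" + " " * 5 + "|Date |Time |Date |Time |"
names = merged_names(top, sub)
```

Here the second header row supplies the grid, and a start column that has no "|" immediately before it on the first row is treated as lying inside a merged cell.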

Step 3: data (from the first line consisting of only "-" and "+" to the end)

Read each line and replace each "|", together with any spaces around it, with your separator:

import re

# Replace each "|" and the spaces around it with the chosen separator.
separator = ","
line = re.sub(r" *\| *", separator, line.strip())

After that you can just write the line to your output file.
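For illustration, here is one way Step 3 could be applied to the two data rows from the question. It splits on the same pattern rather than substituting, and hands the cells to csv.writer so that quoting is handled if a cell ever contains the separator. The names CELL_BORDER and to_cells are hypothetical, and an in-memory buffer stands in for the output file.

```python
import csv
import io
import re

# A "|" together with any spaces on either side of it.
CELL_BORDER = re.compile(r" *\| *")


def to_cells(line):
    """Split one data line into cells with surrounding spaces removed."""
    return CELL_BORDER.split(line.strip())


rows = [
    "Acanter|ANdy Canter| 135 |Ott | ttstpdeman | 05-19-21|07:48|05-19-21|08:13",
    "| | 135 | ttspt|Thin client| 05-19-21|07:48|05-19-21 |08:13",
]

out = io.StringIO()  # replace with open("out.csv", "w", newline="") for a real file
writer = csv.writer(out)
for row in rows:
    writer.writerow(to_cells(row))
csv_text = out.getvalue()
```

The second row keeps its two empty leading cells, so its values stay aligned with the Company, Session and Description columns of the row above it.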
