While editing a line of text in Python, how do I avoid editing a specific part of the text that is in quotation marks?

I have a file (an Apache log file) that I need to convert to CSV, so the spaces have to be replaced by commas. However, one of the columns contains values with spaces in them, and those values are enclosed in quotation marks. I don't want to replace the spaces inside the quotation marks. How do I go about it?

Example of a line in the log:

127.0.0.1 - - [17/Aug/2018:12:57:39 +0530] "GET /mysoft-webappp/app/getNotifications?number=5&_=1534489899492&_hkstd=52bf9c52845cecc32af837db8f8e7385c71b229f67f4ef7c42e9ed5c3c14bMTUzNDQ5MDg1OTYzNg== HTTP/1.1" 200 46 ECC40515BD09C8C2FE6FB9ECCFFB40 127.0.0.1

You can use pandas to read the file in; it handles quoted fields like this automatically (and you can tune its import behaviour further if needed):

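import pandas as pd
# The log has no header row, so header=None stops pandas from using the first line as column names
df = pd.read_table('/wherever/file/may/roam/yourfile.txt', sep=' ', header=None)
# index=False and header=False keep the row index and the numeric column labels out of the output
df.to_csv('/wherever/file/shall/roam/yourfile.csv', index=False, header=False)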

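sep=' ' defines a single space as the separator in your source file; fields enclosed in double quotes are kept whole because " is the default quote character
df.to_csv saves the target file as csv, by default with commas as the separator and with quotation marks added only around fields that themselves contain a comma, a quote, or a line break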
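
If you would rather not depend on pandas, the same idea can be sketched with the standard library's csv module. This is only a minimal illustration: the function name log_line_to_csv and the shortened sample line are made up for the example and are not part of the original question.

```python
import csv
import io

def log_line_to_csv(line):
    """Convert one space-separated log line to a CSV line, keeping quoted fields whole."""
    # delimiter=' ' splits on single spaces; the default quotechar '"'
    # keeps the quoted request string together as one field.
    fields = next(csv.reader([line], delimiter=' '))
    out = io.StringIO()
    # The default writer joins fields with commas and quotes only when needed.
    csv.writer(out).writerow(fields)
    # Drop the line terminator the writer appends.
    return out.getvalue().rstrip('\r\n')

# Hypothetical, shortened log line used only for illustration.
sample = '127.0.0.1 - - [17/Aug/2018:12:57:39 +0530] "GET /index.html HTTP/1.1" 200 46'
print(log_line_to_csv(sample))
```

Note that the bracketed timestamp still ends up in two columns, because square brackets are not quote characters and the space inside them is treated as a separator. The pandas version above splits it the same way.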
