I have an Apache log file that I need to convert to CSV, so the spaces have to be replaced by commas. However, one of the columns contains values with spaces inside them, and that column is enclosed in quotation marks. I don't want to replace the spaces in the text between the quotation marks. How do I go about it?
Example of a line in the log:
127.0.0.1 - - [17/Aug/2018:12:57:39 +0530] "GET /mysoft-webappp/app/getNotifications?number=5&_=1534489899492&_hkstd=52bf9c52845cecc32af837db8f8e7385c71b229f67f4ef7c42e9ed5c3c14bMTUzNDQ5MDg1OTYzNg== HTTP/1.1" 200 46 ECC40515BD09C8C2FE6FB9ECCFFB40 127.0.0.1
You can use pandas to read it in; it handles quoted fields automatically (and you can tweak its import behaviour further if needed):
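import pandas as pd
# header=None: the log has no header row, so the first line is read as data
df = pd.read_table('/wherever/file/may/roam/yourfile.txt', sep=' ', header=None)
# index=False, header=False: write only the log fields, without row numbers or column labels
df.to_csv('/wherever/file/shall/roam/yourfile.csv', index=False, header=False)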
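sep=' ' defines a single space as the separator in your source file; spaces inside double-quoted fields are left alone.
df.to_csv saves the target file as CSV, by default with commas as the separator, and it quotes a field only when the field itself contains a comma, a quotation mark, or a line break.
Note that the bracketed timestamp is not quoted, so the space inside it splits it into two columns.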
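If pandas is not available, much the same result can probably be reached with the standard library's csv module. This is a different technique from the one in the answer above, shown only as a minimal sketch. The function name log_to_csv and the shortened sample line are made up for illustration, and rejoining the two halves of the bracketed timestamp assumes the field layout of the example line in the question.

```python
import csv
import io

def log_to_csv(src, dst):
    """Copy space-separated log lines from src to dst as comma-separated rows."""
    # delimiter=' ' splits on single spaces; quotechar='"' keeps quoted fields whole
    reader = csv.reader(src, delimiter=' ', quotechar='"')
    writer = csv.writer(dst)
    for row in reader:
        # Assumption: the unquoted timestamp "[17/Aug/2018:12:57:39 +0530]"
        # arrives as two fields (index 3 and 4), so glue them back together.
        if len(row) > 4 and row[3].startswith('[') and row[4].endswith(']'):
            row[3:5] = [row[3] + ' ' + row[4]]
        writer.writerow(row)

# Shortened, hypothetical version of the log line from the question
sample = ('127.0.0.1 - - [17/Aug/2018:12:57:39 +0530] '
          '"GET /app/getNotifications?number=5 HTTP/1.1" 200 46 ECC40515 127.0.0.1\n')

out = io.StringIO()
log_to_csv(io.StringIO(sample), out)
print(out.getvalue())
```

For real files, pass open file objects instead of the StringIO buffers, and open the output file with newline='' as the csv module documentation recommends.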