While editing a line of text in Python, how do I avoid editing a specific part of the text in quotation marks?

I have a file (an Apache log file) that I need to convert to CSV, so the spaces have to be replaced by commas. However, one of the columns holds values that contain spaces, and that field is enclosed in quotation marks. I don't want to replace the spaces in the text between the quotation marks. How do I go about it?

Example of a line in the log:

127.0.0.1 - - [17/Aug/2018:12:57:39 +0530] "GET /mysoft-webappp/app/getNotifications?number=5&_=1534489899492&_hkstd=52bf9c52845cecc32af837db8f8e7385c71b229f67f4ef7c42e9ed5c3c14bMTUzNDQ5MDg1OTYzNg== HTTP/1.1" 200 46 ECC40515BD09C8C2FE6FB9ECCFFB40 127.0.0.1

You can use pandas to read it in. Its parser treats a double-quoted field as a single value by default, so the spaces inside the quoted request are left alone (and you can tweak the import behaviour further if needed):

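import pandas as pd

# header=None: the log has no header row, so the first line is data.
df = pd.read_table('/wherever/file/may/roam/yourfile.txt', sep=' ', header=None)
# index=False, header=False: write only the log fields, no row numbers or column labels.
df.to_csv('/wherever/file/shall/roam/yourfile.csv', index=False, header=False)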

sep=' ' defines a single space as the separator in your source file.
df.to_csv saves the target file as CSV, by default with commas as the separator and with quotation marks added only around fields that need them (for example, a field that itself contains a comma).
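
If you would rather not depend on pandas, the same idea can be sketched with the standard library's csv module, which also treats a double-quoted field as one value when splitting on spaces. This is only a minimal sketch: the helper name log_to_csv and the shortened sample line are made up for illustration. Note that the timestamp is wrapped in square brackets, not quotes, so it still ends up split across two columns.

```python
import csv
import io

def log_to_csv(src, dst):
    """Copy space-separated log lines from src to dst as comma-separated rows.

    Hypothetical helper for illustration; src and dst are text file objects.
    """
    # delimiter=' ' splits on single spaces; quotechar='"' keeps the quoted
    # request ("GET ... HTTP/1.1") together as one field.
    reader = csv.reader(src, delimiter=' ', quotechar='"')
    writer = csv.writer(dst)  # comma-separated, quotes only where needed
    for row in reader:
        writer.writerow(row)

# Shortened, made-up sample in the same layout as the log line in the question.
sample = (
    '127.0.0.1 - - [17/Aug/2018:12:57:39 +0530] '
    '"GET /app/getNotifications?number=5 HTTP/1.1" 200 46 ECC40515 127.0.0.1\n'
)

out = io.StringIO()
log_to_csv(io.StringIO(sample), out)
print(out.getvalue())
```

With real files, open the input and output with open(path, newline='') and pass the file objects to the helper in the same way.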
