
How to use multiple delimiters when using csv.reader in python?

I have multiple lines of text in a text file that look similar to this:

2012-03-16 13:47:30.465 -0400   START  Running    Lab.script    19    on_the

I want to be able to convert this text file into csv. I've already done that using this code:

import csv

with open('LogFile.txt', newline='') as src, open('newLogFile.csv', 'w') as fout:
    fin = csv.reader(src, delimiter='\t')
    for row in fin:
        fout.write(','.join(row) + '\n')

But now, my issue is that I need to be able to add a "," after the spaces in this part of the line:

2012-03-16 13:47:30.465 -0400 

I'm not sure how to do it. I tried using split() to split the current row/position, but it didn't work. Any suggestions would be very helpful.

Thank you

Would it be helpful to just tab-delimit everything from the beginning instead? If so, you can refer to this answer; essentially:

There is a special-case shortcut for exactly this use case!

If you call str.split without an argument, it splits on runs of whitespace instead of single characters. So:

>>> ' '.join("Please \n don't \t hurt \x0b me.".split())
"Please don't hurt me."
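To see why the no-argument form matters for the timestamp in the question, here is a small sketch. The variable names are only illustrative, and the doubled space after the date is added on purpose to show the difference:

```python
# Illustrative timestamp with two spaces after the date (not from the real log).
stamp = "2012-03-16  13:47:30.465 -0400"

# With an explicit separator, every single space is a split point,
# so consecutive spaces produce an empty string in the result.
by_single_space = stamp.split(' ')

# With no argument, any run of whitespace counts as one separator.
by_whitespace_run = stamp.split()

print(by_single_space)
print(by_whitespace_run)
```

The second form is the one that gives exactly three pieces regardless of how many spaces or tabs sit between them.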

So for you it would be:

with open('LogFile.txt') as textFile, open('newLogFile.csv', 'w') as newLogFile:
    for row in textFile:
        # split() also drops the trailing newline, so add it back
        newLogFile.write('\t'.join(row.split()) + '\n')

Also, you said:

But now, my issue is that I need to be able to add a "," after the spaces in this part of the line:

2012-03-16 13:47:30.465 -0400

To me that sounds like you want:

2012-03-16 ,13:47:30.465 ,-0400

Try the following:

import csv

with open('LogFile.txt', newline='') as src, open('newLogFile.csv', 'w') as fout:
    fin = csv.reader(src, delimiter='\t')
    for row in fin:
        # turn the spaces inside the timestamp field into commas
        row[0] = ','.join(row[0].split())
        fout.write(','.join(row) + '\n')

This will take a row that looks like this after being read in by csv.reader():

['2012-03-16 13:47:30.465 -0400', 'START', 'Running', 'Lab.script', '19 ', 'on_the']

And then change the first element so that it looks like this:

['2012-03-16,13:47:30.465,-0400', 'START', 'Running', 'Lab.script', '19 ', 'on_the']

And after ','.join() on the row you get the line that will be written to your output file:

'2012-03-16,13:47:30.465,-0400,START,Running,Lab.script,19 ,on_the'
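If you want to try this without touching any files, here is a minimal in-memory sketch of the same idea. The sample line, the `convert` helper name, and the use of io.StringIO are my own assumptions for illustration, not something from your log:

```python
import csv
import io

# Made-up sample: tab-separated fields, where only the
# timestamp field contains spaces.
sample = "2012-03-16 13:47:30.465 -0400\tSTART\tRunning\tLab.script\t19\ton_the\n"

def convert(text):
    """Return comma-separated lines, splitting only the first field on whitespace."""
    out = []
    for row in csv.reader(io.StringIO(text, newline=''), delimiter='\t'):
        if not row:  # skip blank lines, which csv.reader yields as []
            continue
        row[0] = ','.join(row[0].split())
        out.append(','.join(row))
    return out

result = convert(sample)
print(result[0])
```

Only the first field is split here, so spaces inside any other field are left exactly as they are.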

If other elements may also contain spaces and you want to treat all of those spaces as delimiters in your output csv, you can do the following:

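import csv

with open('LogFile.txt', newline='') as src, open('newLogFile.csv', 'w') as fout:
    fin = csv.reader(src, delimiter='\t')
    for row in fin:
        # split every field on whitespace, then join all the pieces with commas
        fout.write(','.join(','.join(item.split()) for item in row) + '\n')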
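One caveat with joining by hand: if a field ever contains a comma or a quote, the output becomes ambiguous. An alternative, which is not what the snippets above do, is to let csv.writer produce the output. A rough sketch, where `flatten_row` and `convert` are hypothetical helper names and the sample line is made up:

```python
import csv
import io

def flatten_row(row):
    """Split every field on runs of whitespace and return one flat list."""
    return [piece for item in row for piece in item.split()]

def convert(src, dst):
    """Read tab-delimited rows from src and write comma-delimited rows to dst."""
    writer = csv.writer(dst, lineterminator='\n')
    for row in csv.reader(src, delimiter='\t'):
        writer.writerow(flatten_row(row))

# In-memory demonstration; real code would pass files opened
# with newline='' instead of StringIO objects.
src = io.StringIO(
    "2012-03-16 13:47:30.465 -0400\tSTART\tRunning\tLab.script\t19 \ton_the\n",
    newline='',
)
dst = io.StringIO(newline='')
convert(src, dst)
print(dst.getvalue())
```

Note that this sketch drops fields that are empty or contain only whitespace, whereas the join-based version keeps them as empty columns, so pick whichever matches what you need downstream.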
