
CSV delimiter doesn't work properly [Python]

import csv

base='eest1@mail.ru,username1\
test2@gmail.com,username2\
test3@gmail.com,username3\
test4@rambler.ru,username4\
test5@ya.ru,username5'

parsed=csv.reader(base, delimiter=',')
for p in parsed:
    print p

Returns:

['e']
['e']
['s']
['t']
['1']
['@']
['m']
['a']
['i']
['l']
['.']
['r']
['u']
['', ''] 

etc...

How can I get the data separated by commas? ('test1@gmail.com', 'username1'), ('test2@gmail.com', 'username2'), ...

I think csv only works with file-like objects. You can use StringIO in this case.

import csv
import StringIO

base='''eest1@mail.ru,username
test2@gmail.com,username2
test3@gmail.com,username3
test4@rambler.ru,username4
test5@ya.ru,username5'''

parsed=csv.reader(StringIO.StringIO(base), delimiter=',')
for p in parsed:
    print p

OUTPUT

['eest1@mail.ru', 'username']
['test2@gmail.com', 'username2']
['test3@gmail.com', 'username3']
['test4@rambler.ru', 'username4']
['test5@ya.ru', 'username5']
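
On Python 3 the standalone StringIO module no longer exists; the class lives in io instead. A minimal sketch of the same approach, assuming Python 3 (the name rows is only for illustration):

```python
import csv
import io

base = '''eest1@mail.ru,username
test2@gmail.com,username2
test3@gmail.com,username3
test4@rambler.ru,username4
test5@ya.ru,username5'''

# io.StringIO wraps the string in a file-like object that yields one line at a time.
rows = list(csv.reader(io.StringIO(base), delimiter=','))
for row in rows:
    print(row)
```

Collecting the rows into a list is optional; iterating over the reader directly works as well.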

Also, your example string does not have newlines, so you would get

['eest1@mail.ru', 'usernametest2@gmail.com', 'username2test3@gmail.com', 'username3test4@rambler.ru', 'username4test5@ya.ru', 'username5']

You can use ''' like I did, or change your base to

base='eest1@mail.ru,username\n\
test2@gmail.com,username2\n\
test3@gmail.com,username3\n\
test4@rambler.ru,username4\n\
test5@ya.ru,username5'

EDIT
According to the docs, the argument can be either a file-like object OR a list. So this works too

parsed=csv.reader(base.splitlines(), delimiter=',')
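
To get the tuples asked for in the question, each parsed row can be converted with tuple. A small sketch, assuming base really contains newline characters (the shortened data and the name pairs are only for illustration):

```python
import csv

base = 'eest1@mail.ru,username\ntest2@gmail.com,username2\ntest3@gmail.com,username3'

# splitlines() returns a list of lines, which csv.reader accepts directly.
pairs = [tuple(row) for row in csv.reader(base.splitlines(), delimiter=',')]
print(pairs)
```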

Quoting the official docs on the csv module (emphasis mine):

csv.reader(csvfile, dialect='excel', **fmtparams)

Return a reader object which will iterate over lines in the given csvfile. csvfile can be any object which supports the iterator protocol and returns a string each time its __next__() method is called; file objects and list objects are both suitable.

Strings support iteration, but iterating over a string yields its characters one by one, not the lines of a multi-line string.

>>> s = "abcdef"
>>> i = iter(s)
>>> next(i)
'a'
>>> next(i)
'b'
>>> next(i)
'c'
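
This is what happens in the question: csv.reader treats every character it receives as a separate "line". A short sketch contrasting the two inputs (the names text, from_string and from_lines are hypothetical):

```python
import csv

text = "a,b"

# Passing the string itself: each character is parsed as its own line.
from_string = list(csv.reader(text))

# Passing a list of lines: one row is parsed per list item.
from_lines = list(csv.reader([text]))

print(from_string)  # [['a'], ['', ''], ['b']]
print(from_lines)   # [['a', 'b']]
```

The lone comma becomes a row of two empty fields, which matches the ['', ''] line in the question's output.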

So the task is to create an iterator that yields lines, not characters, on each iteration. Unfortunately, your string literal is not a multiline string.

base='eest1@mail.ru,username1\
test2@gmail.com,username2\
test3@gmail.com,username3\
test4@rambler.ru,username4\
test5@ya.ru,username5'

is equivalent to:

base = 'eest1@mail.ru,username1test2@gmail.com,username2test3@gmail.com,username3test4@rambler.ru,username4test5@ya.ru,username5'
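
The equivalence can be checked directly: a backslash at the end of a line inside a string literal joins the lines without inserting a newline. A small sketch with made-up addresses (joined and with_newline are illustrative names):

```python
# Backslash continuation only: the two source lines are glued together.
joined = 'a@x.com,user1\
b@y.com,user2'

# An explicit \n before the continuation keeps the records separate.
with_newline = 'a@x.com,user1\n\
b@y.com,user2'

print(repr(joined))
print(joined.splitlines())
print(with_newline.splitlines())
```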

Essentially, the string no longer contains the information needed to tell where one record ends and the next begins. Try using a multiline string literal instead:

base='''eest1@mail.ru,username1
test2@gmail.com,username2
test3@gmail.com,username3
test4@rambler.ru,username4
test5@ya.ru,username5'''

After this change you can split your string on newline characters and everything should work fine:

parsed=csv.reader(base.splitlines(), delimiter=',')
for p in parsed:
    print(p)
