
CSV delimiter doesn't work properly [Python]

import csv

base='eest1@mail.ru,username1\
test2@gmail.com,username2\
test3@gmail.com,username3\
test4@rambler.ru,username4\
test5@ya.ru,username5'

parsed=csv.reader(base, delimiter=',')
for p in parsed:
    print p

Returns:

['e']
['e']
['s']
['t']
['1']
['@']
['m']
['a']
['i']
['l']
['.']
['r']
['u']
['', ''] 

etc...

How can I get the data split on commas, e.g. ('test1@gmail.com', 'username1'), ('test2@gmail.com', 'username2'), ...?

I think csv.reader expects a file-like object rather than a plain string. You can wrap the string in StringIO in this case.

import csv
import StringIO

base='''eest1@mail.ru,username
test2@gmail.com,username2
test3@gmail.com,username3
test4@rambler.ru,username4
test5@ya.ru,username5'''

parsed=csv.reader(StringIO.StringIO(base), delimiter=',')
for p in parsed:
    print p

OUTPUT

['eest1@mail.ru', 'username']
['test2@gmail.com', 'username2']
['test3@gmail.com', 'username3']
['test4@rambler.ru', 'username4']
['test5@ya.ru', 'username5']
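On Python 3 the StringIO module is gone and the class lives in io instead. A minimal sketch of the same approach, assuming Python 3 (the name rows is only for illustration):

```python
import csv
import io

base = '''eest1@mail.ru,username
test2@gmail.com,username2
test3@gmail.com,username3
test4@rambler.ru,username4
test5@ya.ru,username5'''

# io.StringIO wraps the string in a file-like object,
# so iterating over it yields one line at a time.
rows = list(csv.reader(io.StringIO(base), delimiter=','))
for row in rows:
    print(row)
```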

Also, your example string contains no newlines, because a trailing backslash only continues the literal onto the next line. With it you would get a single row:

['eest1@mail.ru', 'usernametest2@gmail.com', 'username2test3@gmail.com', 'username3test4@rambler.ru', 'username4test5@ya.ru', 'username5']

You can use a triple-quoted string as I did, or add explicit newlines to your base:

base='eest1@mail.ru,username\n\
test2@gmail.com,username2\n\
test3@gmail.com,username3\n\
test4@rambler.ru,username4\n\
test5@ya.ru,username5'

EDIT
According to the docs, the argument can be either a file-like object or a list of strings, so this works too:

parsed=csv.reader(base.splitlines(), delimiter=',')

Quoting the official docs for the csv module (emphasis mine):

csv.reader(csvfile, dialect='excel', **fmtparams)

Return a reader object which will iterate over lines in the given csvfile . csvfile can be any object which supports the iterator protocol and returns a string each time its __next__() method is called — file objects and list objects are both suitable.

A string supports the iterator protocol too, but iterating over it yields its characters one by one, not the lines of a multi-line string.

>>> s = "abcdef"
>>> i = iter(s)
>>> next(i)
'a'
>>> next(i)
'b'
>>> next(i)
'c'
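This also explains the output in the question: csv.reader parses each character as if it were a complete line, and a lone comma becomes a row of two empty fields. A small sketch with a shortened, made-up input (Python 3 syntax assumed):

```python
import csv

# Passing a bare string: csv.reader iterates over it character by character
# and treats every character as a whole line of CSV.
rows = list(csv.reader('ab,c'))
print(rows)  # [['a'], ['b'], ['', ''], ['c']]
```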

So the task is to create an iterator that yields lines, not characters, on each iteration. Unfortunately, your string literal is not a multi-line string:

base='eest1@mail.ru,username1\
test2@gmail.com,username2\
test3@gmail.com,username3\
test4@rambler.ru,username4\
test5@ya.ru,username5'

is equivalent to:

base = 'eest1@mail.ru,username1test2@gmail.com,username2test3@gmail.com,username3test4@rambler.ru,username4test5@ya.ru,username5'
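A quick way to confirm this, using a shortened, hypothetical two-row value (the addresses and the name joined are made up for illustration):

```python
# A trailing backslash inside a quoted string continues the literal on the
# next line and inserts nothing in between, not even a newline.
joined = 'a@example.com,user1\
b@example.com,user2'

print(joined)          # a@example.com,user1b@example.com,user2
print('\n' in joined)  # False
```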

Essentially, the string does not contain the information needed to parse it correctly. Try using a multi-line string literal instead:

base='''eest1@mail.ru,username1
test2@gmail.com,username2
test3@gmail.com,username3
test4@rambler.ru,username4
test5@ya.ru,username5'''

After this change you can split the string on newline characters, and everything should work:

parsed=csv.reader(base.splitlines(), delimiter=',')
for p in parsed:
    print(p)
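If you want tuples, as in the question, each parsed row can be converted. A minimal sketch, assuming Python 3 (the name pairs is only for illustration):

```python
import csv

base = '''eest1@mail.ru,username1
test2@gmail.com,username2
test3@gmail.com,username3
test4@rambler.ru,username4
test5@ya.ru,username5'''

# splitlines() returns a list of lines; csv.reader parses each line into a
# list of fields, which is then converted to a tuple.
pairs = [tuple(row) for row in csv.reader(base.splitlines(), delimiter=',')]
print(pairs[:2])  # [('eest1@mail.ru', 'username1'), ('test2@gmail.com', 'username2')]
```

Since ',' is already the default delimiter, the delimiter argument could be dropped here; it is kept only to match the code above.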

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 