Python unicode print formatting

Question

I have a simple Python (2.7) script that reads a database table and spits out rows. Initially there was no need to use unicode, and the script was just this:

users = config.Session.query(User).order_by(User.id).all()
for _f in users:
    print "{0:6d}   {1:20}  {2:30}    {3:}".format(_f.id, _f.foo, _f.name, _f.url)

This worked fine and produced neatly formatted output like this:

   739   42352                 Foo Bar                           https://...
   740   23555                 Another User                      https://...
   741   774577                Third User                        https://...

Then we started having accented names in the database. At first the script raised an exception about the ascii codec being unable to handle them.

I tried to fix the script and partly succeeded. The exception is gone, but now every accented character in the name seems to count double, so the URL field ends up N characters off, where N is the number of accented characters in the name.

for _f in users:
    uname = _f.name.encode('utf-8')
    print "{0:6d}   {1:20}  {2:30}    {3:}".format(_f.id, _f.foo, uname, _f.url)

And the output is now this:

   739   42352                 Foo Bar                           https://...
   740   23555                 Änöther User                    https://...
   741   774577                Third User                        https://...

What do I need to add to my format string to make it count the length of a unicode string with accented characters correctly?

Answer 1

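The issue is that you are formatting byte strings. In UTF-8 each of these accented characters takes two bytes, and the field width of a byte string counts bytes, not characters. Don't encode the name; use Unicode strings throughout, e.g. print u"{0:6d}..." .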
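To see where the two missing columns go, it can help to compare the lengths directly. This is only a small sketch: the name is the sample from the question, written with escapes so the source stays ASCII, and the variable names are made up for illustration.

```python
# "Änöther User" from the question, spelled with escapes (\xc4 = Ä, \xf6 = ö).
name = u"\xc4n\xf6ther User"
encoded = name.encode("utf-8")

char_count = len(name)     # code points in the Unicode string
byte_count = len(encoded)  # bytes after encoding; each accented letter takes two

# Padding the Unicode string counts characters, so the field is 20 wide
# plus the trailing "|" marker.
padded = u"{0:20}|".format(name)

print("%d characters, %d bytes, padded length %d"
      % (char_count, byte_count, len(padded)))
```

The difference between the two counts is exactly the number of accented characters, which matches the N-character shift described in the question.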

Example:

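# -*- coding: utf-8 -*-
print "1234567890"*3                                         # ruler
print "{0:20}  xxx".format(u"Another User")                  # ASCII only: pads correctly
print "{0:20}  xxx".format(u"Änöther User".encode('utf8'))   # byte string: 14 bytes, so only 6 pad spaces
print u"{0:20}  xxx".format(u"Änöther User")                 # Unicode string: 12 characters, 8 pad spaces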

Output:

123456789012345678901234567890
Another User          xxx
Änöther User        xxx
Änöther User          xxx
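Applied to the loop from the question, the same idea might look roughly like the sketch below. The rows are hypothetical stand-ins for the query results (the URLs are invented and foo is assumed to be a string), and format_row is a helper name made up for this example. The only real change is the u prefix on the format string and dropping the per-field encode.

```python
# Hypothetical stand-in rows (id, foo, name, url); in the question these
# come from the database query. The URLs here are made up.
rows = [
    (739, u"42352", u"Foo Bar", u"https://example.com/a"),
    (740, u"23555", u"\xc4n\xf6ther User", u"https://example.com/b"),
]

# Same layout as in the question, but as a Unicode format string.
ROW_FORMAT = u"{0:6d}   {1:20}  {2:30}    {3:}"


def format_row(row):
    """Pad the fields as Unicode text, so widths count characters."""
    return ROW_FORMAT.format(*row)


lines = [format_row(row) for row in rows]

# The URL starts at the same column in every line, accents or not.
url_columns = [line.index(u"https") for line in lines]
print(url_columns)
```

Printing each element of lines then gives aligned output. If Python 2 has no terminal encoding to fall back on (for example when stdout is piped to a file) and printing Unicode raises again, encode the complete formatted line as the last step rather than encoding individual fields before formatting.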

solution1
2 ACCEPTED 2017-09-19 16:09:31
