How to format a Python string based on byte length?

I have a problem formatting my Python output when it contains non-ASCII characters. Take the following example:

>>> persons = [['Anton',12], ['Jürgen',16], ['Bernd', 18]]
>>> for i in persons:
...     print '{0:10} {1:3}'.format(i[0], i[1])
...
Anton       12
Jürgen     16
Bernd       18

Naturally, I want the second column to be perfectly aligned, i.e.,

Anton       12
Jürgen      16
Bernd       18

How can I achieve my desired output using the .format() method?

I suspect that my problem has to do with how the length of a string is computed, i.e., character length vs. byte length,

>>> len('Jürgen'.decode('utf-8'))
6
>>> len('Jürgen')
7 

but I could not find out how to specify the correct string format in this case.

As I type the question here on Stack Overflow, I can even visually see that the string 'Anton' has a different color than 'Jürgen', meaning that the latter may not be recognized as a 'normal' string, but what should I do?

Try defining your list with a Unicode literal for the non-ASCII name, for example:

persons = [['Anton',12], [u'Jürgen',16], ['Bernd', 18]]

Decode the strings using UTF-8 and format as Unicode:

>>> persons = [['Anton',12], ['Jürgen',16], ['Bernd', 18]]
>>> for i in persons:
...     print u'{0:10} {1:3}'.format(i[0].decode('utf-8'), i[1])
... 
Anton       12
Jürgen      16
Bernd       18
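
For comparison, here is a minimal sketch of the same idea in Python 3, where str is already Unicode text, so the padding counts characters rather than bytes. The helper name format_row is hypothetical and not part of the original answer, and UTF-8 is assumed for any byte-string input.

```python
# -*- coding: utf-8 -*-
# Sketch for Python 3: str holds Unicode code points, so field widths
# in .format() are measured in characters, not in encoded bytes.
persons = [['Anton', 12], ['Jürgen', 16], ['Bernd', 18]]


def format_row(name, age):
    """Hypothetical helper: pad the name to 10 characters, the age to 3."""
    # Assumption: byte strings are UTF-8 encoded; decode them first so
    # the padding is computed on characters.
    if isinstance(name, bytes):
        name = name.decode('utf-8')
    return '{0:10} {1:3}'.format(name, age)


rows = [format_row(name, age) for name, age in persons]
for row in rows:
    print(row)

# Character count vs. UTF-8 byte count for the same name.
print(len('Jürgen'), len('Jürgen'.encode('utf-8')))
```

Note that this pads by the number of code points. Names containing combining marks or double-width characters (for example, East Asian text) may still look misaligned in a terminal, because their display width differs from their character count.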
