
Why do I get the u"xyz" format when I print a list of unicode strings in Python?

Please observe the following behavior:

a = u"foo"
b = u"b\xe1r"   # \xe1 is an 'a' with an accent
s = [a, b]

print a, b
print s
for x in s: print x,

The result is:

foo bár
[u'foo', u'b\xe1r']
foo bár

When I just print the two values sitting in variables a and b, I get what I expect; when I put the string values in a list and print it, I get the unwanted u"xyz" form; finally, when I print values from the list with a loop, I get the first form again. Can someone please explain this seemingly odd behavior? I know there's probably a good reason.

When you print a list, you get the repr() of each element. Lists aren't really meant to be printed, so Python tries to print something representative of the list's structure.

If you want it formatted in a particular way, either be explicit about the formatting you want or override its __repr__ method.
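As a minimal sketch of the "be explicit" option, assuming the same two strings as in the question: joining the elements yourself renders each one as text instead of as its repr(). The names joined and bracketed are only illustrative.

```python
from __future__ import print_function  # same print() call on Python 2 and 3

s = [u"foo", u"b\xe1r"]

# Join the elements ourselves, so each one is rendered as text
# rather than through repr().
joined = u", ".join(s)

# Add the brackets back by hand if a list-like look is wanted.
bracketed = u"[" + joined + u"]"

print(joined)
print(bracketed)
```

This keeps the formatting decision in your own code rather than relying on how the list chooses to display itself.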

Objects in Python have two ways to be turned into strings: roughly speaking, str() produces human-readable output, and repr() produces computer-readable output. When you print something, print uses str().

But the str() of a list uses the repr() of its elements.
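A small sketch may make the str()/repr() distinction concrete. The Word class here is hypothetical, invented only so that the two conversions give visibly different results:

```python
from __future__ import print_function  # same print() call on Python 2 and 3


class Word(object):
    """Hypothetical class whose str() and repr() are easy to tell apart."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return "str:" + self.text

    def __repr__(self):
        return "repr:" + self.text


w = Word("foo")

shown_alone = str(w)      # uses Word.__str__
shown_in_list = str([w])  # the list calls repr() on each element

print(shown_alone)
print(shown_in_list)
```

Printing the object directly goes through __str__, while printing the list that contains it goes through __repr__, which is the same effect seen with u'foo' in the question.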

You get this because lists can contain any number of elements, of mixed types. In the second case, instead of printing unicode strings, you're printing the list itself, which is very different from printing the list's contents.

Since the list can contain anything, you get the u'foo' syntax. If you were using non-unicode strings, you'd see 'foo' instead of just foo, as well.
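One way to see why the quoted form is useful, sketched with a made-up mixed list: without the quotes, the string '1' and the integer 1 would look identical in the output.

```python
from __future__ import print_function  # same print() call on Python 2 and 3

mixed = ['1', 1, None]

# str() of the list uses repr() per element, so the types stay distinguishable.
as_list = str(mixed)

# Converting each element with str() loses that distinction.
as_plain = ", ".join(str(x) for x in mixed)

print(as_list)   # ['1', 1, None]
print(as_plain)  # 1, 1, None
```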
