
String formatting options: pros and cons

These are two very popular ways of formatting a string in Python. One uses a dict:

>>> 'I will be %(years)i on %(month)s %(day)i' % {'years': 21, 'month': 'January', 'day': 23}
'I will be 21 on January 23'

The other uses a simple tuple:

>>> 'I will be %i on %s %i' % (21, 'January', 23)
'I will be 21 on January 23'

The first one is much more readable, but the second one is faster to write. I actually use them interchangeably.

What are the pros and cons of each one regarding performance, readability, code optimization (is one of them transformed into the other?), and anything else you think is useful to share?
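The performance part of the question can be checked empirically. The following is a minimal sketch using the standard timeit module; the variable names and the repeat count are illustrative assumptions, and the timings depend on the interpreter version and the machine, so no result is shown here.

```python
import timeit

# Shared setup: the values live in variables so the interpreter cannot
# pre-compute the formatted string at compile time.
setup = (
    "as_dict = {'years': 21, 'month': 'January', 'day': 23}\n"
    "as_tuple = (21, 'January', 23)"
)

# Time the dictionary-based form.
dict_time = timeit.timeit(
    "'I will be %(years)i on %(month)s %(day)i' % as_dict",
    setup=setup,
    number=100000,
)

# Time the tuple-based form.
tuple_time = timeit.timeit(
    "'I will be %i on %s %i' % as_tuple",
    setup=setup,
    number=100000,
)

print('dict-based :', dict_time)
print('tuple-based:', tuple_time)
```

Any difference measured this way is usually small compared with the rest of a program, so readability tends to be the more practical criterion.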

Why format() is more flexible than % string operations

I think you should really stick to the format() method of str, because it is the preferred way to format strings and will probably replace the % string formatting operation in the future.

Furthermore, it has some really good features, and it can also combine position-based formatting with keyword-based formatting:

>>> string = 'I will be {} years and {} months on {month} {day}'
>>> some_date = {'month': 'January', 'day': '1st'}
>>> diff = [3, 11] # years, months
>>> string.format(*diff, **some_date)
'I will be 3 years and 11 months on January 1st'

Even the following will work:

>>> string = 'On {month} {day} it will be {1} months, {0} years'
>>> string.format(*diff, **some_date)
'On January 1st it will be 11 months, 3 years'

There is one more reason in favor of format(). Because it is a method, it can be passed as a callback, as in the following example:

>>> data = [(1, 2), ('a', 'b'), (5, 'ABC')]
>>> formatter = 'First is "{0[0]}", then comes "{0[1]}"'.format
>>> for item in map(formatter, data):
...     print(item)
...
First is "1", then comes "2"
First is "a", then comes "b"
First is "5", then comes "ABC"

Isn't that a lot more flexible than the % string formatting operation?

See the documentation page for more examples comparing % operations with the .format() method.
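For a quick side-by-side, here is the sentence from the question written in four ways: the two % forms and the positional and keyword forms of format(). This is only an illustrative sketch; the dictionary name `results` is an assumption made for the example.

```python
years, month, day = 21, 'January', 23

results = {
    # % operator with a tuple: values are consumed in order.
    'percent_tuple': 'I will be %i on %s %i' % (years, month, day),
    # % operator with a dict: values are looked up by key.
    'percent_dict': 'I will be %(years)i on %(month)s %(day)i'
                    % {'years': years, 'month': month, 'day': day},
    # format() with positional arguments.
    'format_positional': 'I will be {} on {} {}'.format(years, month, day),
    # format() with keyword arguments.
    'format_keyword': 'I will be {years} on {month} {day}'.format(
        years=years, month=month, day=day),
}

for label, text in results.items():
    print(label, '->', text)
```

All four entries produce 'I will be 21 on January 23'; they differ only in how the values are matched to the placeholders.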

Comparing tuple-based % string formatting with dictionary-based

Generally there are three ways of invoking % string operations (yes, three, not two), all of this form:

base_string % values

They differ in the type of values (which follows from the content of base_string):

  • it can be a tuple, in which case the values are substituted one by one, in the order they appear in the tuple,

     >>> 'Three first values are: %f, %f and %f' % (3.14, 2.71, 1)
     'Three first values are: 3.140000, 2.710000 and 1.000000'

  • it can be a dict (dictionary), in which case the values are substituted by keyword,

     >>> 'My name is %(name)s, I am %(age)s years old' % {'name':'John','age':98}
     'My name is John, I am 98 years old'

  • it can be a single value, if base_string contains a single place where a value should be inserted:

     >>> 'This is a string: %s' % 'abc'
     'This is a string: abc'

There are obvious differences between them, and these ways cannot be combined (in contrast to the format() method, which is able to combine some features, as mentioned above).

But there is something that is specific to dictionary-based string formatting and is not readily available with the other two % forms: the ability to replace specifiers with actual variable names in a simple manner:

>>> name = 'John'
>>> surname = 'Smith'
>>> age = 87
# some code goes here
>>> 'My name is %(surname)s, %(name)s %(surname)s. I am %(age)i.' % locals()
'My name is Smith, John Smith. I am 87.'

Just for the record: of course the above could easily be replaced with format() by unpacking the dictionary like this:

>>> 'My name is {surname}, {name} {surname}. I am {age}.'.format(**locals())
'My name is Smith, John Smith. I am 87.'
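As a side note, Python 3.2 and later also provide str.format_map(), which takes the mapping directly instead of unpacking it into keyword arguments. A minimal sketch, using an explicit dictionary (the name `person` is an assumption for the example) rather than locals():

```python
person = {'name': 'John', 'surname': 'Smith', 'age': 87}
template = 'My name is {surname}, {name} {surname}. I am {age}.'

# Unpacking the dictionary into keyword arguments.
via_unpacking = template.format(**person)

# Passing the mapping itself (Python 3.2+).
via_format_map = template.format_map(person)

print(via_format_map)  # My name is Smith, John Smith. I am 87.
```

Passing an explicit dictionary also makes it clearer which names the template depends on than locals() does.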

Does anyone else have an idea of a feature that is specific to one type of string formatting operation but not the other? It would be quite interesting to hear about it.

I'm not exactly answering your question, but I thought it would be nice to throw format into the mix.

I personally prefer the syntax of format to both:

'I will be {years} on {month} {day}'.format(years=19, month='January', day=23)

If I want something compact, I just write:

'I will be {} on {} {}'.format(19, 'January', 23)

And format plays nicely with objects:

class Birthday:
  def __init__(self, age, month, day):
    self.age = age
    self.month = month
    self.day = day

print('I will be {b.age} on {b.month} {b.day}'.format(b=Birthday(19, 'January', 23)))

I am not answering the question but just explaining the idea I came up with in my TIScript.

I've introduced so-called "stringizer" functions: any function whose name starts with '$' is a stringizer. The compiler treats '$name(' and ')' as the quotes of a string literal combined with a function call.

For example, this:

$print(I will be {b.age} on {b.month} {b.day});

is actually compiled into

$print("I will be ", b.age, " on ",b.month," ",b.day);

where even arguments (counting from zero) are always literal strings and odd ones are expressions. This way it is possible to define custom stringizers that use different formatting or argument processing.

For example, Element.$html(Hello <b>{who}</b>); will apply HTML escaping to the expressions. And Element.$(option[value={12}]); will do a jQuery-style select.

Pretty convenient and flexible.

I am not sure whether it is possible to do something like this in Python without changing its compiler. Consider it just as an idea.
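A rough approximation is possible at run time with the standard string.Formatter class, which can split a template into literal text and replacement fields. The sketch below is not TIScript's mechanism and needs no compiler change: `html_stringizer` is a hypothetical helper name, values are passed explicitly as keyword arguments, and conversion flags such as `!r` are ignored for brevity.

```python
import html
import string


def html_stringizer(template, **values):
    """Hypothetical helper: fill in `template`, HTML-escaping each value."""
    formatter = string.Formatter()
    parts = []
    # parse() yields (literal_text, field_name, format_spec, conversion),
    # giving the same literal/expression alternation described above.
    for literal, field, spec, _conversion in formatter.parse(template):
        parts.append(literal)  # literal text is kept unchanged
        if field is not None:
            # Resolve the field, including attribute or index lookups.
            value, _ = formatter.get_field(field, (), values)
            # Apply the format spec first, then escape the result.
            parts.append(html.escape(format(value, spec or '')))
    return ''.join(parts)


greeting = html_stringizer('Hello <b>{who}</b>', who='<Tom & Jerry>')
print(greeting)  # Hello <b>&lt;Tom &amp; Jerry&gt;</b>
```

The markup in the template survives untouched while the substituted value is escaped, which is the behavior described for Element.$html, although here it happens when the function is called rather than at compile time.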
