How can I get the list of names used in a formatting string?
Given a formatting string:
x = "hello %(foo)s there %(bar)s"
Is there a way to get the names of the formatting variables (without directly parsing them myself)?
Using a regex wouldn't be too tough, but I was wondering if there was a more direct way to get these.
Use a dict subclass with an overridden __missing__ method, and from there you can collect all the missing format variables:
class StringFormatVarsCollector(dict):
    def __init__(self, *args, **kwargs):
        self.format_vars = []

    def __missing__(self, k):
        self.format_vars.append(k)

def get_format_vars(s):
    d = StringFormatVarsCollector()
    s % d
    return d.format_vars

>>> get_format_vars("hello %(foo)s there %(bar)s")
['foo', 'bar']
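One caveat with the collector above: __missing__ returns None, so a numeric conversion such as %(age)d raises a TypeError, and a name used twice is reported twice. Below is a minimal Python 3 sketch of a variant that returns a placeholder and skips duplicates. The class name FormatVarsCollector and the choice of 0 as the placeholder are my own assumptions, not part of the answer above.

```python
class FormatVarsCollector(dict):
    """Records every key that the % operator looks up (illustrative name)."""

    def __init__(self):
        super().__init__()
        self.format_vars = []

    def __missing__(self, key):
        # Remember each name once, in order of first appearance.
        if key not in self.format_vars:
            self.format_vars.append(key)
        # Assumption: 0 is a placeholder that %s, %d, %f, %x and %c all accept.
        return 0


def get_format_vars(template):
    collector = FormatVarsCollector()
    template % collector  # result is discarded; only the lookups matter
    return collector.format_vars


print(get_format_vars("%(name)s is %(age)d years old, %(name)s!"))
# ['name', 'age']
```

The key is never stored in the dict, so every lookup goes through __missing__; that is why the duplicate check lives there.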
If you don't want to parse the string, you can use this little function:
def find_format_vars(string):
    found = {}
    while True:
        try:
            string % found
            break
        except KeyError as e:
            # e.args[0] is the missing key (e.message exists in Python 2 only)
            found[e.args[0]] = ''
    return list(found)
>>> print(find_format_vars("hello %(foo)s there %(bar)s"))
['foo', 'bar']
The format fields are only significant to the % operator, not the string itself. So, there is no attribute like str.__format_fields__ that you can access in order to get the field names.
I'd say that using a regex is actually the correct approach in this case. You can easily use re.findall to extract the names:
>>> import re
>>> x = "hello %(foo)s there %(bar)s"
>>> re.findall(r'(?<!%)%\(([^)]+)\)[diouxXeEfFgGcrs]', x)
['foo', 'bar']
Below is an explanation of the pattern:
(?<!%)             # Negated look-behind to make sure that we do not match %%
%                  # Matches %
\(                 # Matches (
(                  # Starts a capture group
[^)]+              # Matches one or more characters that are not )
)                  # Closes the capture group
\)                 # Matches )
[diouxXeEfFgGcrs]  # Matches one of the characters in the square brackets
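The pattern above expects the conversion character right after the closing parenthesis, so a spec like %(bar)05.2f is missed, and the look-behind also rejects a real field that follows a literal %% (as in %%%(baz)d). Here is a hedged sketch of one way to widen it. The function name named_percent_fields and the extended pattern are my own assumptions, not the answer's code. It consumes %% as a literal instead of using a look-behind, and allows optional flags, width, precision and a length modifier.

```python
import re

# Assumed extension of the pattern above, not an official grammar.
_SPEC = re.compile(
    r"%(?:%|\(([^)]+)\)"      # a literal %% or a %(name) mapping key
    r"[#0\- +]*"              # optional conversion flags
    r"\d*(?:\.\d*)?"          # optional width and precision
    r"[hlL]?"                 # optional length modifier
    r"[diouxXeEfFgGcrsa])"    # conversion type (%a is Python 3 only)
)


def named_percent_fields(template):
    """Return the %(name) keys in template, in order of appearance."""
    return [
        match.group(1)
        for match in _SPEC.finditer(template)
        if match.group(1) is not None  # None means the match was a literal %%
    ]


print(named_percent_fields("hello %(foo)s there %(bar)s"))
# ['foo', 'bar']
```

Because finditer scans left to right without overlapping, each %% is consumed as a pair before the next specifier is considered.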
New-style string formatting (str.format) has this ability, through string.Formatter:
from string import Formatter
f = Formatter()
x = "hello {foo}s there {bar}s"
parsed = f.parse(x)
The result of f.parse(x) is an iterable of tuples with this format:
(literal_text, field_name, format_spec, conversion)
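So it's simple enough to pull out the field_name element of each tuple. Trailing literal text (the final "s" here) yields a tuple whose field_name is None, so filter those out:
field_names = [tup[1] for tup in parsed if tup[1] is not None]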
Here's the documentation if you would like more in-depth information: https://docs.python.org/2/library/string.html#string.Formatter
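For completeness, here is a small sketch that wraps Formatter.parse in a helper. The function name get_field_names is hypothetical. It skips the None entries produced by trailing literal text and also recurses into the format spec, since nested fields such as {value:{width}} are returned unexpanded inside format_spec.

```python
from string import Formatter


def get_field_names(template):
    """Collect replacement-field names from a str.format template."""
    names = []
    for _literal, field_name, format_spec, _conversion in Formatter().parse(template):
        if field_name is None:
            continue  # a chunk of literal text with no replacement field
        names.append(field_name)
        if format_spec:
            # Nested fields like {value:{width}} sit inside the spec text.
            names.extend(get_field_names(format_spec))
    return names


print(get_field_names("hello {foo}s there {bar}s"))
# ['foo', 'bar']
```

Note that field_name is returned as written, so a field like {user.name} comes back as "user.name" and an auto-numbered {} comes back as an empty string.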
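Single list-comprehension version (this relies on the private str._formatter_parser method, which exists in Python 2 only):
[tup[1] for tup in "hello {foo}s there {bar}s"._formatter_parser() if tup[1] is not None]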