How to get the list of names used in a formatting string?

Question

Given a formatting string:

x = "hello %(foo)s  there %(bar)s"

Is there a way to get the names of the formatting variables without parsing the string myself?

Using a regex wouldn't be too tough, but I was wondering whether there is a more direct way to get these.

Answer 1

Use a dict subclass with an overridden __missing__ method. The % operator looks up every name in the mapping, so the subclass can record each name that is not found:

class StringFormatVarsCollector(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.format_vars = []

    def __missing__(self, k):
        # Called for every key the % operator looks up and does not find.
        self.format_vars.append(k)

def get_format_vars(s):
    d = StringFormatVarsCollector()
    s % d  # the formatted result is discarded; only the lookups matter
    return d.format_vars

>>> get_format_vars("hello %(foo)s  there %(bar)s")
['foo', 'bar']
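
One limitation worth noting: __missing__ above returns None, which is fine for %s but raises TypeError for numeric conversions such as %d. A possible variant, sketched here under the assumption that returning 0 is acceptable as a placeholder, also skips duplicate names. The names ZeroFillingCollector and get_format_vars_any_type are made up for this illustration.

```python
class ZeroFillingCollector(dict):
    """Record each missing key and return 0 so numeric conversions succeed too."""

    def __init__(self):
        super().__init__()
        self.format_vars = []

    def __missing__(self, key):
        # Keep the first occurrence only, so repeated names are listed once.
        if key not in self.format_vars:
            self.format_vars.append(key)
        # Assumption: 0 is a safe placeholder for %s, %r, %d, %x, %f, %e, %g and %c.
        return 0


def get_format_vars_any_type(template):
    collector = ZeroFillingCollector()
    template % collector  # the formatted result is discarded
    return collector.format_vars


print(get_format_vars_any_type("%(n)d items at %(price).2f for %(who)s"))
```

The string is still formatted once in full, so this remains a trick that relies on the % operator doing the parsing rather than a parser of its own.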

Answer 2

If you don't want to parse the string, you can use this little function:

def find_format_vars(string):
    found = {}
    while True:
        try:
            string % found
            break
        except KeyError as e:
            # e.args[0] is the missing key (e.message exists only in Python 2)
            found[e.args[0]] = ''
    return list(found)

>>> print(find_format_vars("hello %(foo)s there %(bar)s"))
['foo', 'bar']

Answer 3

The format fields are only significant to the % operator, not to the string itself. So there is no attribute like str.__format_fields__ that you can access to get the field names.

I'd say that using a regex is actually the correct approach in this case. You can easily use re.findall to extract the names:

>>> import re
>>> x = "hello %(foo)s  there %(bar)s"
>>> re.findall(r'(?<!%)%\(([^)]+)\)[diouxXeEfFgGcrs]', x)
['foo', 'bar']

Below is an explanation of the pattern:

(?<!%)             # Negative look-behind to make sure that we do not match %%
%                  # Matches %
\(                 # Matches (
(                  # Starts a capture group
[^)]+              # Matches one or more characters that are not )
)                  # Closes the capture group
\)                 # Matches )
[diouxXeEfFgGcrs]  # Matches one of the characters in the square brackets
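
The pattern above expects the conversion character right after the closing parenthesis, so a field like %(bar)05.2f is skipped, and the look-behind also rejects a real field that follows an escaped %%. As a rough sketch (not a full reimplementation of Python's printf-style grammar), the regex can be widened to consume flags, width and precision, and to match %% as its own token. The names _SPEC and named_fields are invented for this example.

```python
import re

# One %-style conversion specifier: optional (name), flags, width,
# precision, length modifier, then the conversion character.
_SPEC = re.compile(
    r"%(?:\((?P<name>[^)]*)\))?"   # optional mapping key in parentheses
    r"[#0\- +]*"                   # optional flags
    r"(?:\*|\d+)?"                 # optional width
    r"(?:\.(?:\*|\d+)?)?"          # optional precision
    r"[hlL]?"                      # optional length modifier (ignored by Python)
    r"[diouxXeEfFgGcrsa%]"         # conversion type; an escaped %% ends here
)


def named_fields(template):
    """Return mapping keys in order of first appearance, without duplicates."""
    names = []
    for match in _SPEC.finditer(template):
        name = match.group("name")
        if name is not None and name not in names:
            names.append(name)
    return names


print(named_fields("100%% of %(foo)s, %(bar)05.2f, %%%(baz)d"))
```

Because finditer scans left to right and consumes %% as a complete match, no look-behind is needed to tell an escaped percent sign from the start of a field.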

Answer 4

New-style string formatting has this ability.

from string import Formatter

f = Formatter()
x = "hello {foo}s  there {bar}s"
parsed = f.parse(x)

The result of parse is an iterable of tuples with this format:
(literal_text, field_name, format_spec, conversion)

So it's simple enough to pull out the field_name element of each tuple. Trailing literal text produces a tuple whose field_name is None, so skip those:

field_names = [tup[1] for tup in parsed if tup[1] is not None]

Here's the documentation if you would like more in-depth information: https://docs.python.org/2/library/string.html#string.Formatter

Single list-comprehension version:

[tup[1] for tup in Formatter().parse("hello {foo}s  there {bar}s") if tup[1] is not None]
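
If the template uses attribute or index access ({user.name}, {items[0]}) or nests a field inside a format spec ({count:>{width}}), field_name alone does not give the argument names directly. Here is a small sketch of one way to handle that with Formatter.parse; the function name format_field_names is made up for the example, and splitting on "." and "[" is a simplification of the real field-name grammar.

```python
from string import Formatter


def format_field_names(template):
    """Collect top-level argument names from a str.format template."""
    names = []
    for _, field_name, format_spec, _ in Formatter().parse(template):
        if field_name is None:
            continue  # trailing literal text has no field
        # "user.name" and "items[0]" both refer to one argument: the part
        # before the first "." or "[".
        base = field_name.split(".")[0].split("[")[0]
        if base not in names:
            names.append(base)
        # A format spec may itself contain replacement fields, e.g. {n:>{width}}.
        if format_spec:
            for nested in format_field_names(format_spec):
                if nested not in names:
                    names.append(nested)
    return names


print(format_field_names("{user.name} has {items[0]} and {count:>{width}}"))
```

Positional and auto-numbered fields come back as digit strings or the empty string, so filter those out if only keyword names are wanted.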
