
How to properly add quotes to a string using Python?

I want to add a set of (double) quotes to a Python string if they are missing, but the string itself can also contain quotes.

The purpose of this is to quote all commands that are not already quoted, because the Windows API requires you to quote the entire command line when you execute a process using _popen().

Here are some strings that should be quoted:

<empty string>
type
"type" /?
"type" "/?"
type "a a" b
type "" b

Here are some that should not be quoted:

"type"
""type" /?"

Please take the time to test all examples; it is not easy to detect whether the string needs the quotes or not.

Your problem is inconsistent.

Consider the two cases:

""a" b"

"a" "b"

The former is interpreted as a pre-quoted string with 'nested quotes', but the latter is interpreted as separately quoted strings. Here are some examples that highlight the issue:

" "a" "b" "

" "a" b"

"a ""b"

How should they be treated?

I think this is a difficult question to specify precisely, but perhaps this strategy will approximate your goal.

The basic idea is to create a copy of the original string with the internally quoted items removed. An internally quoted item is defined here as a quoted span that contains at least one non-whitespace character.

After the internally quoted items have been removed, you then check whether what remains begins and ends with a quote. If it does not, the entire string needs surrounding quotes.

import re

tests = [
    # Test data in original question.
    ( '',                '""'                ),
    ( 'a',               '"a"'               ),
    ( '"a"',             '"a"'               ), # No change.
    ( '""a" b"',         '""a" b"'           ), # No change.
    ( '"a" b',           '""a" b"'           ),
    ( '"a" "b"',         '""a" "b""'         ),
    ( 'a "b" c',         '"a "b" c"'         ),

    # Test data in latest edits.
    ( 'type',            '"type"'         ),    # Quote these.
    ( '"type" /?',       '""type" /?"'    ),
    ( '"type" "/?"',     '""type" "/?""'  ),
    ( 'type "a a" b',    '"type "a a" b"' ),
    ( 'type "" b',       '"type "" b"'    ),
    ( '"type"',          '"type"'         ),    # Don't quote.
    ( '""type" /?"',     '""type" /?"'    ),

    # Some more tests.
    ( '"a b" "c d"',     '""a b" "c d""'     ),
    ( '" a " foo " b "', '"" a " foo " b ""' ),
]

Q = '"'
re_quoted_items = re.compile(r'" \s* [^"\s] [^"]* \"', re.VERBOSE)

for orig, expected in tests:
    # The orig string w/o the internally quoted items.
    woqi = re_quoted_items.sub('', orig)

    if len(orig) == 0:
        orig_quoted = Q + orig + Q
    elif len(woqi) > 0 and not (woqi[0] == Q and woqi[-1] == Q):
        orig_quoted = Q + orig + Q    
    else:
        orig_quoted = orig

    print(orig_quoted == expected)
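
For use outside the test loop, the same check could be wrapped in a function. This is only a minimal sketch: the names ensure_quoted and QUOTED_ITEM are hypothetical and not part of the answer, and the logic is unchanged from the loop above.

```python
import re

# Same pattern as in the loop above, written without re.VERBOSE.
QUOTED_ITEM = re.compile(r'"\s*[^"\s][^"]*"')

def ensure_quoted(s):
    """Hypothetical wrapper: return s wrapped in double quotes if needed."""
    if len(s) == 0:
        return '""'
    # Drop internally quoted items that contain a non-whitespace character.
    remainder = QUOTED_ITEM.sub('', s)
    # If anything is left, it must itself start and end with a quote;
    # otherwise the string as a whole is not yet quoted.
    if len(remainder) > 0 and not (remainder[0] == '"' and remainder[-1] == '"'):
        return '"' + s + '"'
    return s

print(ensure_quoted('"type" /?'))    # remainder is ' /?', so quotes are added
print(ensure_quoted('""type" /?"'))  # remainder is '" /?"', so it is left alone
```

Looking at the remainder for a few inputs is a quick way to see why a given string was or was not wrapped.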

I wrote a simple state machine to track whether we are in a word or not. If a non-quote character is ever seen while the quote depth is zero, then we need quotes:

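def quotify(s):
    """Return s wrapped in double quotes unless it already looks quoted."""
    if s == "":
        return '""'

    depth = 0            # current quote nesting depth
    in_word = False      # True if the last non-quote character was not whitespace
    needs_quotes = False
    for c in s:
        if c == '"':
            # A quote closes a level if the last non-quote character was part
            # of a word; otherwise it opens a new level.
            if in_word:
                depth -= 1
            else:
                depth += 1
        else:
            # A character outside all quotes means the string is not fully quoted.
            if depth == 0:
                needs_quotes = True
                break
            in_word = not c.isspace()

    if needs_quotes:
        return '"' + s + '"'
    else:
        return s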

assert quotify('') == '""'
assert quotify('''type''') == '''"type"'''
assert quotify('''"type" /?''') == '''""type" /?"'''
assert quotify('''"type" "/?"''') == '''""type" "/?""'''
assert quotify('''type "a a" b''') == '''"type "a a" b"'''
assert quotify('''type "" b''') == '''"type "" b"'''
assert quotify('''"type"''') == '''"type"'''
assert quotify('''""type" /?"''') == '''""type" /?"'''

You have three cases:

  1. String is less than two characters long: add quotes
  2. String has quotes at s[0] and at s[1]: don't add quotes
  3. Otherwise: add quotes

And by "add quotes" I mean simply construct '"'+string+'"' and return it.

Translate to if-statements, and you're done.
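
As a rough sketch, the three cases could be written as follows. The function name add_quotes_simple is hypothetical, and the second case is taken literally as stated (quotes at s[0] and s[1]). Under that reading, ""type" /?" is left untouched but "type" would still be wrapped, so this only approximates the examples in the question.

```python
def add_quotes_simple(s):
    """Hypothetical helper following the three cases exactly as listed."""
    # Case 1: fewer than two characters, so it cannot already be quoted.
    if len(s) < 2:
        return '"' + s + '"'
    # Case 2: quotes at s[0] and s[1] (read literally): leave unchanged.
    if s[0] == '"' and s[1] == '"':
        return s
    # Case 3: everything else gets wrapped.
    return '"' + s + '"'

print(add_quotes_simple('type'))
print(add_quotes_simple('""type" /?"'))
```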
