Python Regex to find a string in double quotes within a string

Question

I'm looking for Python code that uses a regex to do something like this:

Input: Regex should return "String 1" or "String 2" or "String3"

Output: String 1,String 2,String3

I tried r'"*"'

Answer 1

Here's all you need to do:

import re

def doit(text):
    # Lazily capture the text between each pair of double quotes.
    matches = re.findall(r'"(.+?)"', text)
    # matches is now ['String 1', 'String 2', 'String3']
    return ",".join(matches)

doit('Regex should return "String 1" or "String 2" or "String3" ')

Result:

'String 1,String 2,String3'

As pointed out by Li-aung Yip:

To elaborate, .+? is the "non-greedy" version of .+. It makes the regular expression match the smallest number of characters it can instead of the most characters it can. The greedy version, .+, will give String 1" or "String 2" or "String3; the non-greedy version, .+?, gives String 1, String 2, String3.

In addition, if you want to accept empty strings, change .+ to .*. Star * means zero or more, while plus + means at least one.
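The difference between the three quantifiers can be checked directly. The following is a minimal sketch, not part of the original answer; the sample text (with an empty pair of quotes appended) and the variable names are assumptions chosen for illustration.

```python
import re

# Sample input from the question, plus an empty quoted string at the end (assumed).
text = 'Regex should return "String 1" or "String 2" or "String3" and ""'

# Greedy: runs from the first quote to the last quote in the text.
greedy = re.findall(r'"(.+)"', text)

# Non-greedy, at least one character: skips the empty pair of quotes.
lazy_plus = re.findall(r'"(.+?)"', text)

# Non-greedy, zero or more characters: also returns the empty string.
lazy_star = re.findall(r'"(.*?)"', text)

print(greedy)
print(lazy_plus)
print(lazy_star)
```

Here the greedy pattern returns a single long match, while the two non-greedy patterns differ only in whether the empty quoted string is reported.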

Answer 2

The highly up-voted answer doesn't account for the possibility that the double-quoted string might contain one or more double-quote characters (properly escaped, of course). To handle this situation, the regex needs to accumulate characters one by one, with a negative lookahead assertion stating that the current character is not a double-quote character that is not preceded by a backslash (which in turn requires a negative lookbehind assertion):

"(?:(?:(?!(?<!\\)").)*)"

See Regex Demo

import re
import ast


def doit(text):
    matches=re.findall(r'"(?:(?:(?!(?<!\\)").)*)"',text)
    for match in matches:
        print(match, '=>', ast.literal_eval(match))


doit('Regex should return "String 1" or "String 2" or "String3" and "\\"double quoted string\\"" ')

Prints:

"String 1" => String 1
"String 2" => String 2
"String3" => String3
"\"double quoted string\"" => "double quoted string"

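To see why the escaped-quote case matters, the simple non-greedy pattern and the lookaround pattern can be run side by side. This is only an illustrative sketch; the sample text and the names simple and escaped_aware are assumptions, not part of the original answer.

```python
import re

# Raw string, so the backslashes below are literal characters in the input.
text = r'say "plain" and "\"double quoted string\""'

# Simple pattern: stops at the first double quote it sees, escaped or not.
simple = re.findall(r'"(.+?)"', text)

# Lookaround pattern from the answer: an escaped quote does not end the match.
# It has no capturing group, so findall returns the full match, quotes included.
escaped_aware = re.findall(r'"(?:(?:(?!(?<!\\)").)*)"', text)

print(simple)
print(escaped_aware)
```

With this input the simple pattern returns a lone backslash as its second result, whereas the lookaround pattern keeps the whole escaped string together.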
Answer 3

Just try to fetch double quoted strings from the multiline string:

import re

s = """ 
"my name is daniel"  "mobile 8531111453733"[[[[[[--"i like pandas"
"location chennai"! -asfas"aadhaar du2mmy8969769##69869" 
@4343453 "pincode 642002""@mango,@apple,@berry" 
"""
print(re.findall(r'"(.*?)"', s))
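One caveat worth checking with multiline input: by default . does not match a newline, so a quoted string that itself spans two lines is not captured by .*? and the quotes can pair up incorrectly. The sketch below illustrates this under an assumed sample string; re.DOTALL and the negated character class [^"] are two standard ways to let the match cross a line break.

```python
import re

# Assumed sample: the first quoted string contains a line break.
s = 'first "one\nline" then "two"'

# Default: '.' stops at the newline, so the quotes end up mis-paired.
default = re.findall(r'"(.*?)"', s)

# DOTALL lets '.' match newlines as well.
dotall = re.findall(r'"(.*?)"', s, flags=re.DOTALL)

# A negated character class matches any character except a double quote,
# newlines included, without needing a flag.
negated = re.findall(r'"([^"]*)"', s)

print(default)
print(dotall)
print(negated)
```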

Answer 4

From https://stackoverflow.com/a/69891301/1531728

My solution is:

import re
my_strings = ['SetVariables "a" "b" "c" ', 'd2efw   f "first" +&%#$%"second",vwrfhir, d2e   u"third" dwedew', '"uno"?>P>MNUIHUH~!@#$%^&*()_+=0trewq"due"        "tre"fef    fre f', '       "uno""dos"      "tres"', '"unu""doua""trei"', '      "um"                    "dois"           "tres"                  ']
my_substrings = []
for current_test_string in my_strings:
    for values in re.findall(r'\"(.+?)\"', current_test_string):
        my_substrings.append(values)
        #print("values are:",values,"=")
    print(" my_substrings are:",my_substrings,"=")
    my_substrings = []

Alternate regular expressions to use are:

re.findall('"(.+?)"', current_test_string) [Avinash2021] [user17405772021]
re.findall('"(.*?)"', current_test_string) [Shelvington2020]
re.findall(r'"(.*?)"', current_test_string) [Lundberg2012] [Avinash2021]
re.findall(r'"(.+?)"', current_test_string) [Lundberg2012] [Avinash2021]
re.findall(r'"["]', current_test_string) [Muthupandi2019]
re.findall(r'"([^"]*)"', current_test_string) [Pieters2014]
re.findall(r'"(?:(?:(?!(?<!\\)").)*)"', current_test_string) # Leaves the double quotes in the matched strings; they can be removed by other means. [Booboo2020]
re.findall(r'"(.*?)(?<!\\)"', current_test_string) [Hassan2014]
re.findall('"[^"]*"', current_test_string) # Leaves the double quotes in the matched strings; they can be removed by other means. [Martelli2013]
re.findall('"([^"]*)"', current_test_string) [jspcal2014]
re.findall("'(.*?)'", current_test_string) [akhilmd2016]

The current_test_string.split("\"") approach works if the strings follow a pattern in which substrings are enclosed in quotation marks. This is because it uses the double quotation mark as a delimiter to tokenize the string, so it also returns the substrings that are not enclosed in double quotation marks as extracted substrings.
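A minimal sketch of the split approach, assuming the quotes in the input are balanced and never escaped (the variable names parts and quoted are illustrative): after splitting on the double quote, the pieces at odd indices are the ones that were inside quotation marks, and the pieces at even indices are the text between them.

```python
# First test string from the solution above.
current_test_string = 'SetVariables "a" "b" "c" '

# Split on the double quote; quoted and unquoted pieces alternate.
parts = current_test_string.split("\"")

# Odd-indexed pieces were enclosed in double quotes (assumes balanced quotes).
quoted = parts[1::2]

print(parts)
print(quoted)
```

If the input may contain an unmatched or escaped quote, this indexing no longer lines up, and one of the regex-based patterns is likely the safer choice.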

References:

[Avinash2021] Arvind Kumar Avinash, Answer to "Extract text between quotation using regex python", Stack Exchange Inc., New York, NY, October 12, 2021. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/69543129/1531728; last accessed November 8, 2021.
[user17405772021] user1740577, Answer to "Extract text between quotation using regex python", Stack Exchange Inc., New York, NY, October 12, 2021. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/69543030/1531728; last accessed November 8, 2021.
[Shelvington2020] Iain Shelvington, Answer to "Extracting only words out of a mixed string in Python [duplicate]", Stack Exchange Inc., New York, NY, January 5, 2020. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/59598630/1531728; last accessed November 6, 2021.
[Lundberg2012] Johan Lundberg, Answer to "Python Regex to find a string in double quotes within a string", Stack Exchange Inc., New York, NY, March 1, 2012. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/9519934/1531728; last accessed November 6, 2021.
[Muthupandi2019] Daniel Muthupandi and trotta, Answer to "Python Regex to find a string in double quotes within a string", Stack Exchange Inc., New York, NY, August 3, 2019. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/57337020/1531728; last accessed November 6, 2021.
[Booboo2020] Booboo, Answer to "Python Regex to find a string in double quotes within a string", Stack Exchange Inc., New York, NY, September 2, 2020. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/63707053/1531728; last accessed November 6, 2021.
[Pieters2014] Martijn Pieters, Answer to "Extract a string between double quotes", Stack Exchange Inc., New York, NY, March 29, 2014. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/22735466/1531728; last accessed November 6, 2021.
[Hassan2014] Sabuj Hassan, Answer to "Extract a string between double quotes", Stack Exchange Inc., New York, NY, March 29, 2014. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/22735480/1531728; last accessed November 6, 2021.
[Martelli2013] Alex Martelli and Sumit Singh, Answer to "Extract string from between quotations", Stack Exchange Inc., New York, NY, March 14, 2014. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/2076357/1531728; last accessed November 6, 2021.
[jspcal2014] jspcal, Answer to "Extract string from between quotations", Stack Exchange Inc., New York, NY, March 14, 2014. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/2076356/1531728; last accessed November 6, 2021.
[akhilmd2016] akhilmd, Answer to "Stripping string in python between quotes", Stack Exchange Inc., New York, NY, July 2, 2016. Available online from Stack Exchange Inc.: Stack Overflow: Questions at: https://stackoverflow.com/a/38161072/1531728; last accessed November 5, 2021.

Answer 5

For me, the only regex that ever worked correctly for all cases of quoted strings, possibly with escaped quotes inside them, was:

regex=r"""(['"])(?:\\\\|\\\1|[^\1])*?\1"""

This will not fail even if the quoted string ends with an escaped backslash.
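A short usage sketch for this pattern, with an assumed sample text: because the pattern contains a capturing group for the opening quote, re.findall would return only that quote character, so the full matches are collected with re.finditer and group(0) instead. Note also that in Python's re module a \1 inside a character class is read as an octal character escape rather than a backreference, so [^\1] accepts nearly any character; it is the non-greedy *? followed by \1 that ends the match at the closing quote.

```python
import re

# Pattern from the answer above.
regex = r"""(['"])(?:\\\\|\\\1|[^\1])*?\1"""

# Assumed sample: an escaped double quote, an escaped single quote,
# and a string ending with an escaped backslash.
text = r'''say "a\"b" or 'it\'s' or "path\\" end'''

# group(0) is the whole quoted string, including its surrounding quotes.
matches = [m.group(0) for m in re.finditer(regex, text)]

for match in matches:
    print(match)
```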

Answer 6

import re
r=r"'(\\'|[^'])*(?!<\\)'|\"(\\\"|[^\"])*(?!<\\)\""

texts=[r'"aerrrt"',
r'"a\"e'+"'"+'rrt"',
r'"a""""arrtt"""""',
r'"aerrrt',
r'"a\"errt'+"'",
r"'aerrrt'",
r"'a\'e"+'"'+"rrt'",
r"'a''''arrtt'''''",
r"'aerrrt",
r"'a\'errt"+'"',
      "''",'""',""]

for text in texts:
    print(text, "-->", re.fullmatch(r, text))

Results:

"aerrrt" --> <_sre.SRE_Match object; span=(0, 8), match='"aerrrt"'>
"a\"e'rrt" --> <_sre.SRE_Match object; span=(0, 10), match='"a\\"e\'rrt"'>
"a""""arrtt""""" --> None
"aerrrt --> None
"a\"errt' --> None
'aerrrt' --> <_sre.SRE_Match object; span=(0, 8), match="'aerrrt'">
'a\'e"rrt' --> <_sre.SRE_Match object; span=(0, 10), match='\'a\\\'e"rrt\''>
'a''''arrtt''''' --> None
'aerrrt --> None
'a\'errt" --> None
'' --> <_sre.SRE_Match object; span=(0, 2), match="''">
"" --> <_sre.SRE_Match object; span=(0, 2), match='""'>
 --> None

6 answers

Answer 1: score 68, accepted, 2012-03-01 16:23:07
Answer 2: score 6, 2020-09-02 13:52:22
Answer 3: score 4, 2019-08-03 09:19:59
Answer 4: score 1, 2021-11-09 00:18:23
Answer 5: score 0, 2022-07-13 08:47:53
Answer 6: score -1, 2020-02-06 09:47:34
