
Regex expression to get everything between double quotes

I'm trying to work out a regex to process a block of multi-line text. It needs to work in Python.

Sample text:

description : "4.10 TCP Wrappers - not installed"
info        : "If some of the services running in /etc/inetd.conf are 

required, then it is recommended that TCP Wrappers are installed and configured to limit access to any active TCP and UDP services.

TCP Wrappers allow the administrator to control who has access to various inetd network services via source IP address controls. TCP Wrappers also provide logging information via syslog about both successful and unsuccessful connections.

TCP Wrappers are generally triggered via /etc/inetd.conf, but other options exist for \"wrappering\" non-inetd based software.

The configuration of TCP Wrappers to suit a particular environment is outside the scope of this benchmark; however the following links will provide the necessary documentation to plan an appropriate implementation:

ftp://ftp.porcupine.org/pub/security/index.html

The website contains source code for both IPv4 and IPv6 versions."

expect      : "^[\\s]*[A-Za-z0-9]+:[\\s]+[^A][^L][^L]"
required        : YES

I came up with this:

[(a-zA-Z_ \t#)]*[:][ ]*\"[^\"]*.*\"

The problem is that it stops at the second \" and leaves the rest of the line unselected.

My goal is to get the whole string, starting at info and running to the closing double quote that belongs to the info line.

The same regex should also work for the 'expect' line, starting at expect and ending at the double quote that closes the expect string.

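Once I have the whole string, I'll split it at the first ":" because I want to store these strings in a DB, with "description", "info", and "expect" as columns and the strings as the values in those columns.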
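A minimal sketch of that last step, assuming the matched entries are already available as strings. The list name `matched`, the table name `checks`, and the use of an in-memory SQLite database are illustrative assumptions, not part of the original setup.

```python
import sqlite3

# Hypothetical matches, shaped like the entries in the sample text.
matched = [
    r'description : "4.10 TCP Wrappers - not installed"',
    r'info        : "TCP Wrappers allow the administrator to control access."',
    r'expect      : "^[\\s]*[A-Za-z0-9]+:[\\s]+[^A][^L][^L]"',
]

record = {}
for entry in matched:
    # partition() splits at the first ":" only, so colons inside the
    # quoted value (as in the expect pattern) are left untouched.
    key, _, value = entry.partition(":")
    value = value.strip()
    # Drop exactly one surrounding pair of double quotes, if present.
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    record[key.strip()] = value

# Store the values under columns named after the keys.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checks (description TEXT, info TEXT, expect TEXT)")
conn.execute(
    "INSERT INTO checks (description, info, expect) "
    "VALUES (:description, :info, :expect)",
    record,
)
row = conn.execute("SELECT description, expect FROM checks").fetchone()
```

Using `partition` rather than `split(":")` matters here because the expect value itself contains a colon.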

Thanks for the help!

One alternative is to use the lexer provided by the shlex module:

>>> import shlex
>>> s = """tester : "this is a long string
that
is multiline, contains \\" double quotes \\" and .
this line is finished\""""
>>> shlex.split(s[s.find('"'):])[0]
'this is a long string\nthat\nis multiline, contains " double quotes " and .\nthis line is finished'

It also removes the backslashes in front of the double quotes inside the string.

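The code finds the first double quote in the string and only looks at the string from that point on. It then uses shlex.split() to tokenize the rest of the string and takes the first token from the returned list.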
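The same steps can be wrapped in a small helper. This is only a sketch: the function name `first_quoted_value` is hypothetical, and the guard for entries without any double quote is an addition, since `str.find` returns -1 in that case and slicing from -1 would silently keep just the last character.

```python
import shlex

def first_quoted_value(entry):
    """Return the first double-quoted value in entry, unescaped, or None."""
    start = entry.find('"')
    if start == -1:
        # No double quote at all, e.g. "required : YES".
        return None
    # shlex.split() works in POSIX mode by default, so \" inside the
    # quoted value is turned into a plain double quote.
    tokens = shlex.split(entry[start:])
    return tokens[0] if tokens else None

entry = 'info : "first line\nsecond line with \\"escaped\\" quotes"'
value = first_quoted_value(entry)
no_value = first_quoted_value("required : YES")
```

If the value is missing its closing quote, `shlex.split()` raises `ValueError`, so callers working on untrusted files may want to catch that.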

Update 1: I got this to work:

[(a-zA-Z_ \t#)]*[:][ ]*\"([^\"]|(?<=\\\\)[\"])*\"
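For reference, here is a sketch of how the Update 1 expression might be used from Python. It assumes the expression above is shown in escaped string-literal form, so in a raw string the lookbehind is written `(?<=\\)` and checks for a single backslash; the shortened sample text is made up for illustration.

```python
import re

# Update 1 pattern rewritten as a raw string (an interpretation of the
# escaped form shown above, not a verbatim copy).
PATTERN = re.compile(r'[(a-zA-Z_ \t#)]*[:][ ]*"([^"]|(?<=\\)["])*"')

sample = r'''description : "4.10 TCP Wrappers - not installed"
info        : "TCP Wrappers are generally triggered via /etc/inetd.conf,
but other options exist for \"wrappering\" non-inetd based software."
required        : YES
'''

# [^"] also matches newlines, so a quoted value may span several lines;
# a quote preceded by a backslash is treated as part of the value.
matches = [m.group(0) for m in PATTERN.finditer(sample)]
```

Here `matches` holds the description and info entries; the `required` line is skipped because it has no quoted value. A value whose last character before the closing quote is a backslash would likely confuse the lookbehind, so this is best treated as a fit for the sample format rather than a general parser.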

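Update 2: If you can't modify the file to add escaped quotes where the expression above needs them, then as long as lines such as

group : "@GROUP@" || "test"

only ever occupy a single line, I think this will grab those along with the longer quoted values: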

[(a-zA-Z_ \t#)]*[:][ ]*(?:\"([^\"]|(?<=\\\\)[\"])*\"|.*)(?=(?:\r\n|$))
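A sketch of trying the Update 2 expression from Python. Two things here are assumptions rather than part of the answer: the pattern is rewritten as a raw string, and `re.MULTILINE` is passed so that `$` also matches at the end of each line when the text uses plain `\n` line endings (without it, the lookahead only succeeds before `\r\n` or at the very end of the text).

```python
import re

# Update 2 pattern as a raw string; re.MULTILINE is an added assumption.
PATTERN = re.compile(
    r'[(a-zA-Z_ \t#)]*[:][ ]*(?:"([^"]|(?<=\\)["])*"|.*)(?=(?:\r\n|$))',
    re.MULTILINE,
)

sample = r'''group : "@GROUP@" || "test"
info  : "line one
line \"two\""
required : YES
'''

# The quoted alternative is tried first; when the closing quote is not at
# the end of a line (as in the group entry), the pattern falls back to .*
# and takes the rest of that single line instead.
matches = [m.group(0) for m in PATTERN.finditer(sample)]
```

With this sample the fallback also picks up unquoted entries such as `required : YES`, which may or may not be wanted.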

Give it a try; if it works, I'll update again to explain it.

