Regex to extract text between triple double quotes and line breaks

Question

For example, I want to parse a Python file, take the text between triple double quotes, and build an HTML table from that text.

A text block looks like this, for example:

"""
Replaces greater than operator ('>') with 'NOT BETWEEN 0 AND #'
Replaces equals operator ('=') with 'BETWEEN # AND #'

Tested against:
    * Microsoft SQL Server 2005
    * MySQL 4, 5.0 and 5.5
    * Oracle 10g
    * PostgreSQL 8.3, 8.4, 9.0

Requirement:
    * Microsoft Access

Notes:
    * Useful to bypass weak and bespoke web application firewalls that
      filter the greater than character
    * The BETWEEN clause is SQL standard. Hence, this tamper script
      should work against all (?) databases

>>> tamper('1 AND A > B--')
'1 AND A NOT BETWEEN 0 AND B--'
>>> tamper('1 AND A = B--')
'1 AND A BETWEEN B AND B--'
"""

The HTML table must be a simple table with 5 columns:

Column 1: everything between """ and the next empty line (\n followed by an empty line)
Column 2: everything between Tested against: and the next empty line, or between Requirement: and the next empty line
Column 3: everything between Notes: and the next empty line
Column 4: everything between >>> and \n
Column 5: everything between the end of column 4 and \n

So the result must be:

Column 1:
Replaces greater than operator ('>') with 'NOT BETWEEN 0 AND #'
Replaces equals operator ('=') with 'BETWEEN # AND #'

Column 2:
- Microsoft SQL Server 2005
- MySQL 4, 5.0 and 5.5
- Oracle 10g
- PostgreSQL 8.3, 8.4, 9.0
or
- Microsoft Access

Column 3:
- Useful to bypass weak and bespoke web application firewalls that filter the greater than character
- The BETWEEN clause is SQL standard. Hence, this tamper script should work against all (?) databases

Column 4:
tamper('1 AND A > B--')
tamper('1 AND A = B--')

Column 5:
'1 AND A NOT BETWEEN 0 AND B--'
'1 AND A BETWEEN B AND B--'

What kind of syntax can I use to extract that? I will use VBScript.RegExp.

Set fso = CreateObject("Scripting.FileSystemObject")
txt = fso.OpenTextFile("C:\path\to\your.py").ReadAll

Set re = New RegExp
re.Pattern = """([^""]*)"""
re.Global = True

For Each m In re.Execute(txt)
  WScript.Echo m.SubMatches(0)
Next

Answer 1

Your question is quite broad, so I'll just outline a way to deal with this. Otherwise I'd have to write the whole script for you, which isn't going to happen.

Extract everything between the docstring quotes. Use a regular expression like this:
```
Set re1 = New RegExp
re1.Pattern = """""""([\s\S]*?)"""""""
For Each m In re1.Execute(txt)
  docstr = m.SubMatches(0)
Next
```
Note that you need to set re1.Global to True if the file contains more than one docstring and you want all of them processed. Otherwise you'll get only the first match.
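
As a rough sketch of the Global setting in practice, the following collects every docstring body into an array. The function name ExtractDocstrings and the sample text are made up for illustration and are not part of the answer.

```vbscript
Option Explicit

' Hypothetical helper (not from the answer): returns an array holding the
' body of every """...""" block found in txt.
Function ExtractDocstrings(txt)
  Dim re, matches, result(), i
  Set re = New RegExp
  re.Pattern = """""""([\s\S]*?)"""""""   ' three quotes, lazy body, three quotes
  re.Global = True                        ' without this only the first match is returned
  Set matches = re.Execute(txt)
  If matches.Count = 0 Then
    ExtractDocstrings = Array()
    Exit Function
  End If
  ReDim result(matches.Count - 1)
  For i = 0 To matches.Count - 1
    result(i) = matches(i).SubMatches(0)
  Next
  ExtractDocstrings = result
End Function

' Made-up sample with two docstrings.
Dim q, sample, d
q = String(3, Chr(34))   ' three double quotes
sample = q & "first" & q & vbCrLf & "x = 1" & vbCrLf & q & "second" & q
For Each d In ExtractDocstrings(sample)
  WScript.Echo d
Next
```

Run with cscript, this should echo both docstring bodies. With Global left at its default of False, only the first one would come back.
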
Remove leading and trailing whitespace with a second regular expression:
```
Set re2 = New RegExp
re2.Pattern = "^\s*|\s*$"
re2.Global = True  'find all matches
docstr = re2.Replace(docstr, "")
```
You can't use Trim for this, because that function removes only spaces, not other whitespace such as tabs or line breaks.
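
To make the difference concrete, here is a small comparison of Trim with the regular expression above. The wrapper name TrimWhitespace and the sample string are assumptions for illustration only.

```vbscript
Option Explicit

' Hypothetical helper: strips all leading and trailing whitespace,
' including tabs and line breaks, using the pattern from the answer.
Function TrimWhitespace(s)
  Dim re
  Set re = New RegExp
  re.Pattern = "^\s*|\s*$"
  re.Global = True   ' needed so both ends of the string are handled
  TrimWhitespace = re.Replace(s, "")
End Function

Dim raw
raw = vbCrLf & vbTab & "Tested against:" & vbCrLf
WScript.Echo Len(raw)                  ' 20
WScript.Echo Len(Trim(raw))            ' 20: Trim removes nothing, no leading or trailing spaces
WScript.Echo Len(TrimWhitespace(raw))  ' 15
```
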
Either split the string at 2+ consecutive line breaks to get the doc sections, or use another regular expression to extract them:
```
Set re3 = New RegExp
re3.Pattern = "([\s\S]*?)\r\n\r\n" + _
              "Tested against:\r\n([\s\S]*?)\r\n\r\n" + _
              ...
For Each m In re3.Execute(docstr)  'match against the trimmed docstring
  descr  = m.SubMatches(0)
  tested = m.SubMatches(1)
  ...
Next
```
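
The first option, splitting at two or more consecutive line breaks, could look roughly like this. SplitSections is a hypothetical name. The sketch assumes the blank lines are truly empty (no spaces or tabs on them) and that the character Chr(1) never occurs in the docstring.

```vbscript
Option Explicit

' Hypothetical helper: splits a trimmed docstring into sections at
' runs of two or more line breaks (CRLF or bare LF).
Function SplitSections(docstr)
  Dim re, marker
  marker = Chr(1)   ' assumption: this character never occurs in the text
  Set re = New RegExp
  re.Pattern = "(?:\r?\n){2,}"
  re.Global = True
  ' Replace each blank-line run with the marker, then split on the marker.
  SplitSections = Split(re.Replace(docstr, marker), marker)
End Function

' Shortened sample built from lines of the docstring in the question.
Dim doc, parts, i
doc = "Replaces equals operator ('=') with 'BETWEEN # AND #'" & vbCrLf & vbCrLf & _
      "Tested against:" & vbCrLf & "    * Oracle 10g" & vbCrLf & vbCrLf & _
      "Notes:" & vbCrLf & "    * The BETWEEN clause is SQL standard."
parts = SplitSections(doc)
For i = 0 To UBound(parts)
  WScript.Echo "[" & i & "] " & parts(i)
Next
```

Each element of parts then holds one section (description, Tested against, Notes), which can be broken down further by splitting on single line breaks.
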

Continue breaking down the sections until you have the elements you want to display. Then build the HTML from these elements.
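
A minimal sketch of the final step, assuming the five column values are already available as strings. HtmlEncode and HtmlRow are hypothetical helpers, not built-in functions. Escaping matters here because the docstring itself contains the > character.

```vbscript
Option Explicit

' Hypothetical helper: escapes characters that are special in HTML and
' turns line breaks into <br> so multi-line cells keep their layout.
Function HtmlEncode(ByVal s)
  s = Replace(s, "&", "&amp;")   ' must run first, before other entities are added
  s = Replace(s, "<", "&lt;")
  s = Replace(s, ">", "&gt;")
  HtmlEncode = Replace(s, vbCrLf, "<br>")
End Function

' Hypothetical helper: builds one table row from an array of cell texts.
Function HtmlRow(cells)
  Dim c, html
  html = "<tr>"
  For Each c In cells
    html = html & "<td>" & HtmlEncode(c) & "</td>"
  Next
  HtmlRow = html & "</tr>"
End Function

' Cell values taken from the docstring in the question.
Dim row
row = HtmlRow(Array("Replaces greater than operator ('>') with 'NOT BETWEEN 0 AND #'", _
                    "Microsoft SQL Server 2005", _
                    "The BETWEEN clause is SQL standard.", _
                    "tamper('1 AND A > B--')", _
                    "'1 AND A NOT BETWEEN 0 AND B--'"))
WScript.Echo "<table>" & row & "</table>"
```
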

Answer 1: score 2, accepted, 2017-04-16 17:41:47