
Extract comma separated units from source code using RegEx

I want to use regular expressions to extract information from my source code. Can you help me build a RegEx that retrieves the units used in the source code?

Source code sample:

unit ComandesVendes;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, Manteniment;

type
  TFComandesVendes = class(TFManteniment,ActualitzacioFinestra)
    QRCapsaleraNumero: TIntegerField;
    QRCapsaleraData: TDateTimeField;
    QRCapsaleraDataEntrega: TDateTimeField;
...
...     

I need to get the comma-separated file names from the uses clause up to the next ;. In that sample the output must be:

Windows
Messages
SysUtils
Variants
Classes
Graphics
Controls
Forms
Dialogs
Manteniment

I'm trying something like

^ *uses(\n* *(\w*),)* *\n* *(\w*) *;

It matches the uses clause, but it doesn't return each file name separately.

Thank you.

This page says that Delphi uses the PCRE regex flavor.

In that case, one option is to use a capturing group in combination with the \G anchor.

(?:^ *uses\r?\n *|\G(?!^))(\w+)(?:,\s*|;$)

Explanation

  • (?: Non-capturing group
    • ^ *uses\r?\n * Match optional spaces from the start of the string (or of a line when multiline mode is enabled, which the sample requires because uses is not on the first line), then match uses and a newline followed by optional spaces again
    • | Or
    • \G(?!^) Assert the position at the end of the previous match, not at the start (the \G anchor matches at 2 positions, either at the start of the string or at the end of the previous match)
  • ) Close non-capturing group
  • (\w+) Capture group 1, match 1+ word characters
  • (?:,\s*|;$) Non-capturing group, match either a comma and 0+ whitespace chars, or match ; at the end of the string (or of the line in multiline mode).

Regex demo
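To see the pattern return each unit as its own match, here is a minimal sketch in Java rather than Delphi. Java's java.util.regex is not PCRE, but it also supports the \G anchor, so the same pattern can be tried there. The class name UsesClauseExtractor and the method extractUnits are hypothetical names chosen for illustration, and the MULTILINE flag is an assumption needed so that ^ and $ match at line boundaries in the sample.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class UsesClauseExtractor {
    // The pattern from the answer. MULTILINE (an assumption) makes ^ and $
    // match at the start and end of each line instead of the whole input.
    private static final Pattern UNIT = Pattern.compile(
        "(?:^ *uses\\r?\\n *|\\G(?!^))(\\w+)(?:,\\s*|;$)",
        Pattern.MULTILINE);

    // Collects capture group 1 of every consecutive match.
    static List<String> extractUnits(String source) {
        List<String> units = new ArrayList<>();
        Matcher m = UNIT.matcher(source);
        while (m.find()) {
            units.add(m.group(1));
        }
        return units;
    }

    public static void main(String[] args) {
        String source = String.join("\n",
            "unit ComandesVendes;",
            "",
            "interface",
            "",
            "uses",
            "  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,",
            "  Dialogs, Manteniment;",
            "",
            "type",
            "  TFComandesVendes = class(TFManteniment,ActualitzacioFinestra)");
        // Prints one unit name per line.
        extractUnits(source).forEach(System.out::println);
    }
}
```

Because every match after the first must start exactly where the previous one ended, the chain stops at the ; that closes the uses clause, and commas later in the file (such as the one inside class(...)) are not picked up. In multiline mode the (?!^) lookahead also rejects a unit name that begins in column 0 of a continuation line, so this sketch relies on the continuation lines being indented as in the sample.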
