
I need help building a regex

This is my first time working with regex and I am a little lost. To give you some background, I am writing a program that reads a text file line by line and stores each line in a string called "line". If the line starts with either a tab or a space, followed by a number or numbers separated by dots (such as 1 or 1.2.1), followed by another tab or space, the program copies the line to another file.

So far I have built this regex, but it does not work:

            string pattern = @"(\t| ) *[0-9.] (\t| )";

            if (line.StartsWith(pattern))
            {

                //copy line

            }

Also, is line.StartsWith correct? Or should I use something like rgx.Matches(pattern)?

Your pattern contains a character class without a quantifier, [0-9.], which matches exactly one character: either a single digit or a single dot.

To avoid matching, for example, only dots, you could first match digits and then an optional, repeatable part that matches a dot followed by more digits: [0-9]+(?:\.[0-9]+)*
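The difference can be checked with a minimal sketch. It assumes that "a character class with a quantifier" would mean [0-9.]+, and it anchors both patterns with ^ and $ so that each one has to describe the whole sample; the sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Both patterns are anchored so they must describe the entire sample string.
var loose = new Regex(@"^[0-9.]+$");              // any mix of digits and dots
var strict = new Regex(@"^[0-9]+(?:\.[0-9]+)*$"); // digits, then zero or more ".digits" groups

// Hypothetical samples: two valid numbers and three dot-heavy strings.
foreach (var sample in new[] { "1", "1.2.1", "...", "1.", ".5" })
{
    Console.WriteLine($"{sample,-6} loose={loose.IsMatch(sample)} strict={strict.IsMatch(sample)}");
}
```

Under these assumptions the loose class accepts all five samples, while the stricter pattern accepts only 1 and 1.2.1.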

Note that in the part  (\t| ) at the end of your pattern, the literal space in front of the group is significant, so two characters are expected to match there: a space, followed by a tab or a space.

You could simplify the pattern by using a character class to match either a tab or a space instead of an alternation, and if you don't need the capturing group you can omit it.

Instead of StartsWith, which compares literal text and does not interpret regex syntax, you could use, for example, IsMatch:

^[ \t][0-9]+(?:\.[0-9]+)*[ \t]
  • ^ Start of string
  • [ \t] Match a single tab or space
  • [0-9]+ Match 1+ digits 0-9
  • (?:\.[0-9]+)* Repeat 0+ times a dot and 1+ digits
  • [ \t] Match a single tab or space
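To see how the parts above work together, here is a small sketch that runs the pattern against a few made-up lines and, for contrast, passes the same pattern text to StartsWith. The sample lines are assumptions chosen for illustration, not data from the question.

```csharp
using System;
using System.Text.RegularExpressions;

const string pattern = @"^[ \t][0-9]+(?:\.[0-9]+)*[ \t]";
var regex = new Regex(pattern);

// Hypothetical sample lines; "\t" in a regular C# string literal is a real tab.
string[] samples =
{
    "\t1.2.1 Heading",  // tab, dotted number, space
    " 1\tHeading",      // space, single number, tab
    "1.2.1 Heading",    // no leading tab or space
    " 1.2.Heading",     // trailing dot instead of a tab or space
    " .. Heading",      // dots only, no digits
};

foreach (var line in samples)
{
    // StartsWith looks for the literal characters of the pattern text,
    // so it never treats "^", "[", or "+" as regex syntax.
    bool literal = line.StartsWith(pattern, StringComparison.Ordinal);
    Console.WriteLine($"IsMatch={regex.IsMatch(line)} StartsWith={literal}");
}
```

Only the first two samples should satisfy IsMatch, and StartsWith should report false for every sample because none of them begins with the pattern text itself.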


For example:

using System.Text.RegularExpressions;

string s = "\t1.2.1 ";  // tab, dotted number, trailing space
Regex regex = new Regex(@"^[ \t][0-9]+(?:\.[0-9]+)*[ \t]");

if (regex.IsMatch(s)) {
    // copy line
}
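Putting this back into the context of the question, the following is a rough sketch of the line-by-line copy. The helper name CopyNumberedLines and its two path parameters are hypothetical and not part of the original post; File.ReadLines and StreamWriter are one possible way to read and write the files, not necessarily what the asker used.

```csharp
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper: copies every line that starts with a tab or space,
// then a dotted number, then a tab or space, from inputPath to outputPath.
// Returns the number of lines copied.
static int CopyNumberedLines(string inputPath, string outputPath)
{
    var regex = new Regex(@"^[ \t][0-9]+(?:\.[0-9]+)*[ \t]");
    int copied = 0;

    using var writer = new StreamWriter(outputPath);

    // File.ReadLines enumerates the file one line at a time.
    foreach (string line in File.ReadLines(inputPath))
    {
        if (regex.IsMatch(line))
        {
            writer.WriteLine(line);  // copy the line
            copied++;
        }
    }

    return copied;
}
```

It could be called as CopyNumberedLines("input.txt", "output.txt"), where both file names are placeholders. Creating the Regex once, outside the loop, avoids constructing it again for every line.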
