
Regex to match string and all content after it to another occurrence of the same string

I am trying to match each new error log entry with a Regex in C#. I want a new match for every occurrence of a date in this format:

[yyyy-MM-dd HH:mm:ss,fff]

Here is the sample data and my current (non-working) solution:

Regex

(\[[0-9]{4}\-[0-9]{2}\-[0-9]{2} [0-9]{2}\:[0-9]{2}\:[0-9]{2}\,[0-9]{3}\])(.*)

String to match

[2018-06-28 00:58:14,596] - INFO  - [54] - ProcessItemController - Processing url: http://somehttp.com/something.xml/
[2018-06-28 00:58:14,612] - ERROR - [54] - ProcessItemController - Processing Failed
System.UnauthorizedAccessException: Access to the path 'D:\SomePath\something.xlsx' is denied.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.File.InternalDelete(String path, Boolean checkHost)
   at Something.Processors.PathAttachmentExtractorProcessor.XmlParser(String path, String outputPath, ProcessingItem processingItem)
   at Something.Processors.EurekaInfoPathAttachmentExtractorProcessor.ProcessItem(ProcessingItem processingItem)
   at Something.ProcessItemController.Process(Item item)
[2018-06-28 00:58:14,627] - INFO  - [69] - ProcessItemController - Processing url: http://someurl.com/cables.xml/
[2018-06-28 00:58:14,627] - ERROR - [69] - ProcessItemController - Processing Failed
System.UnauthorizedAccessException: Access to the path 'D:\SomePath\anotherSomething.xlsx' is denied.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.File.InternalDelete(String path, Boolean checkHost)
   at Something.Processors.PathAttachmentExtractorProcessor.XmlParser(String path, String outputPath, ProcessingItem processingItem)
   at Something.Processors.PathAttachmentExtractorProcessor.ProcessItem(ProcessingItem processingItem)
   at Something.ProcessItemController.Process(Item item)

https://regex101.com/r/6BJpKF/1/

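The problem is that for error log entries, the pattern does not capture the exception description on the lines that follow.

Is there a way to get all the data between each occurrence of the date (including the date itself) as a separate match?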
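
To make the failure concrete, here is a minimal sketch that runs the pattern from the question with and without RegexOptions.Singleline. The class and method names (DotNewlineDemo, CaptureRest) are hypothetical and only serve the illustration; the pattern itself is copied from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotNewlineDemo
{
    // Pattern from the question: a timestamp followed by a greedy (.*).
    const string Pattern =
        @"(\[[0-9]{4}\-[0-9]{2}\-[0-9]{2} [0-9]{2}\:[0-9]{2}\:[0-9]{2}\,[0-9]{3}\])(.*)";

    // Hypothetical helper: returns what (.*) captured in the first match.
    public static string CaptureRest(string log, RegexOptions options)
    {
        return Regex.Match(log, Pattern, options).Groups[2].Value;
    }
}
```

With RegexOptions.None, `.` does not match `\n`, so the capture ends at the end of the first line and the stack trace is lost. With RegexOptions.Singleline, `.` also matches `\n`, but the greedy `.*` then runs to the end of the input and swallows every later entry. Neither option alone yields one match per entry, which is why the pattern needs an explicit stop condition at the next timestamp.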

Try the following solution:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.txt";
        static void Main(string[] args)
        {

            string input = File.ReadAllText(FILENAME);

            // Named groups: date, type, errNum (the bracketed number) and message.
            // The message runs up to the next '[', so it includes any stack-trace lines.
            string pattern = @"^(?'date'\[[^\]]+)\]\s+-\s+(?'type'[^\s]+)\s+-\s+\[(?'errNum'\d+)\]\s+-\s+(?'message'[^\[]*)";

            MatchCollection matches = Regex.Matches(input, pattern, RegexOptions.Multiline);

            foreach (Match match in matches)
            {
                Console.WriteLine("Date : '{0}', Type : '{1}', Error Number = '{2}', Message = '{3}'",
                   match.Groups["date"], match.Groups["type"], match.Groups["errNum"], match.Groups["message"]);
            }
            Console.ReadLine();
        }

    }
}

Using only a regex, this should work:

string datetimeRegex = @"\[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\]";

var rx = new Regex(@"(?:^|(?<=\n))" + datetimeRegex + @"(?:(?!(?<=\n)" + datetimeRegex + @").)*", RegexOptions.Singleline);

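// str holds the full log text, e.g. the result of File.ReadAllText.
Match m;
int ix = 0;

while ((m = rx.Match(str, ix)).Success)
{
    // Your log
    string log = m.Value;
    // Resume right after this match; correct even if the match did not start at ix.
    ix = m.Index + m.Length;
}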

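But I'm not very happy with it, although I think it is workable. Note that each log keeps its trailing \r?\n. (?:^|(?<=\n)) means "the beginning of the string or a position right after a newline". (?!(?<=\n)" + datetimeRegex + @") means that a date-time preceded by \n stops the .* match.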
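
As a rough, self-contained sketch of the same idea, the pattern can also be applied with Regex.Matches instead of the manual loop. This is a variation rather than the answer's exact code: the names LogSplitter and SplitEntries are hypothetical, and the input is assumed to use \n (or \r\n) line endings.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LogSplitter
{
    // Timestamp such as [2018-06-28 00:58:14,596].
    const string DateTimeRegex =
        @"\[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\]";

    // A timestamp at the start of the input or right after '\n', then any
    // characters until the next position that follows '\n' and starts a timestamp.
    static readonly Regex EntryRegex = new Regex(
        @"(?:^|(?<=\n))" + DateTimeRegex + @"(?:(?!(?<=\n)" + DateTimeRegex + @").)*",
        RegexOptions.Singleline);

    // Hypothetical helper: one string per log entry, stack-trace lines included.
    public static List<string> SplitEntries(string log)
    {
        var entries = new List<string>();
        foreach (Match m in EntryRegex.Matches(log))
        {
            entries.Add(m.Value);
        }
        return entries;
    }
}
```

Calling LogSplitter.SplitEntries(File.ReadAllText(path)) would return one string per entry, each still ending with its line break. Because the lookahead is evaluated at every character, this may be slow on very large logs.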
