
Issue matching a Regex pattern and replacing each match with a quoted value

The requirement is that every date and time in the string variables below (table data in dd-MMM-yyyy HH:mm:ss format) is replaced by the same value enclosed in double quotes and prefixed with an equals sign: ="dd-MMM-yyyy HH:mm:ss"

E.g. 25-Feb-2020 15:27:58 needs to be replaced with ="25-Feb-2020 15:27:58"


Here is the complete code snippet:

using System;
using System.Text.RegularExpressions;


public class Program
{
    public static void Main()
    {
        string text = "<table>\n  <thead><tr><th style=\"\"><div class=\"th-inner \">Login Name</div><div class=\"fht-cell\"></div></th><th style=\"\"><div class=\"th-inner sortable\">Registered</div><div class=\"fht-cell\"></div></th><th style=\"\"><div class=\"th-inner \">Registered Date <br>Time</div><div class=\"fht-cell\"></div></th><th style=\"\"><div class=\"th-inner sortable\">User Response Count</div><div class=\"fht-cell\"></div></th><th style=\"\"><div class=\"th-inner \">Test Start Date Time</div><div class=\"fht-cell\"></div></th><th style=\"\"><div class=\"th-inner \">Test End Date Time</div><div class=\"fht-cell\"></div></th><th style=\"\"><div class=\"th-inner \">Time Remaining</div><div class=\"fht-cell\"></div></th><th style=\"\"><div class=\"th-inner \">User Status</div><div class=\"fht-cell\"></div></th></tr></thead><tbody><tr data-index=\"9\"><td style=\"\">njuser14</td><td style=\"\">Yes</td><td style=\"\">-</td><td style=\"\">0</td><td style=\"\">29-Feb-2020 15:27:58</td><td style=\"\">29-Feb-2020 15:28:03</td><td style=\"\">179</td><td style=\"\">Paused</td></tr><tr data-index=\"10\"><td style=\"\">njuser15</td><td style=\"\">Yes</td><td style=\"\">-</td><td style=\"\">0</td><td style=\"\">29-Feb-2020 15:27:32</td><td style=\"\">29-Feb-2020 15:27:42</td><td style=\"\">179</td><td style=\"\">Paused</td></tr></tbody></table>";
        string text2 = " dasd arew 2017-03-11 12:25:56 2017-03-11 12:25:56 das tfgwe 2017-03-11 12:25:56 ";
        string pattern = @"\d{4}\-\d{2}\-\d{2}\s\d{2}\:\d{2}\:\d{2}";
        Regex r = new Regex(pattern);
        var res = r.Replace(text, new MatchEvaluator(ConvertDateFormat));
        var res2 = r.Replace(text2, new MatchEvaluator(ConvertDateFormat));
        Console.WriteLine(res);
        Console.WriteLine("-------------------------------------------------------");
        Console.WriteLine(res2);
    }

    static string ConvertDateFormat(Match m)
    {
        var mydate = DateTime.Parse(m.Value);
        return mydate.ToString("=yyyy-MM-dd hh:mm:ss");
    }
}

// 29-Feb-2020 15:27:58 needs to be replaced with ="29-Feb-2020 15:27:58"

Results:

<table>
  <thead><tr><th style=""><div class="th-inner ">Login Name</div><div class="fht-cell"></div></th><th style=""><div class="th-inner sortable">Registered</div><div class="fht-cell"></div></th><th style=""><div class="th-inner ">Registered Date <br>Time</div><div class="fht-cell"></div></th><th style=""><div class="th-inner sortable">User Response Count</div><div class="fht-cell"></div></th><th style=""><div class="th-inner ">Test Start Date Time</div><div class="fht-cell"></div></th><th style=""><div class="th-inner ">Test End Date Time</div><div class="fht-cell"></div></th><th style=""><div class="th-inner ">Time Remaining</div><div class="fht-cell"></div></th><th style=""><div class="th-inner ">User Status</div><div class="fht-cell"></div></th></tr></thead><tbody><tr data-index="9"><td style="">njuser14</td><td style="">Yes</td><td style="">-</td><td style="">0</td><td style="">29-Feb-2020 15:27:58</td><td style="">29-Feb-2020 15:28:03</td><td style="">179</td><td style="">Paused</td></tr><tr data-index="10"><td style="">njuser15</td><td style="">Yes</td><td style="">-</td><td style="">0</td><td style="">29-Feb-2020 15:27:32</td><td style="">29-Feb-2020 15:27:42</td><td style="">179</td><td style="">Paused</td></tr></tbody></table>
-------------------------------------------------------
 dasd arew =2017-03-11 12:25:56 =2017-03-11 12:25:56 das tfgwe =2017-03-11 12:25:56

But here:

  1. The dates in text2 are replaced with the date prefixed by = but without the double quotes (=2017-03-11 12:25:56), not with the expected ="..." form.
  2. text remains unchanged; none of its dates are replaced.

As per the comments, the first issue seems to be the expectation that

return mydate.ToString("=yyyy-MM-dd hh:mm:ss");

will include the quotes when it converts the DateTime to a string. But these quotes are the delimiters of the format string literal itself; they are not part of the format string.

The solution to that, as suggested by Justin, is:

string.Format("=\"{0}\"", mydate.ToString("yyyy-MM-dd hh:mm:ss"))

Although my preferred form would use string interpolation:

$"=\"{mydate.ToString("yyyy-MM-dd hh:mm:ss")}\""
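A minimal sketch of the quoting fix, assuming the goal is the ="..." form from the question. The names QuoteDemo and Wrap are hypothetical. The sketch also uses HH (24-hour clock) and the invariant culture instead of the hh in the snippets above, because hh would render 15:27:58 as 03:27:58.

```csharp
using System;
using System.Globalization;

public static class QuoteDemo
{
    // Formats the value, then wraps it as ="..." so the quotes
    // are part of the returned string rather than C# delimiters.
    public static string Wrap(DateTime value)
    {
        // "HH" is the 24-hour clock; "hh" would be the 12-hour clock.
        string formatted = value.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);
        return $"=\"{formatted}\"";
    }

    public static void Main()
    {
        var sample = new DateTime(2017, 3, 11, 12, 25, 56);
        Console.WriteLine(Wrap(sample)); // ="2017-03-11 12:25:56"
    }
}
```

Inside ConvertDateFormat, the equivalent change would be to return Wrap(mydate) instead of calling ToString with the = inside the format string.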

The second issue is that text and text2 use different date-time formats, and the supplied regex only matches the format in text2:

text:  29-Feb-2020 15:27:58 
text2: 2017-03-11 12:25:56 
regex: @"\d{4}\-\d{2}\-\d{2}\s\d{2}\:\d{2}\:\d{2}"
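The mismatch can be checked directly. This is a small sketch using the pattern from the question; the class and method names (PatternCheck, Matches) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternCheck
{
    // The pattern from the question: four digits, two digits, two digits, then a time.
    public const string OriginalPattern = @"\d{4}\-\d{2}\-\d{2}\s\d{2}\:\d{2}\:\d{2}";

    public static bool Matches(string sample)
    {
        return Regex.IsMatch(sample, OriginalPattern);
    }

    public static void Main()
    {
        Console.WriteLine(Matches("29-Feb-2020 15:27:58")); // False: format used in text
        Console.WriteLine(Matches("2017-03-11 12:25:56"));  // True: format used in text2
    }
}
```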

Regexes match character patterns and have no awareness of what the data means. So a naive regex for text would be something like (untested):

@"\d{2}\-[a-zA-Z]{3}\-\d{4}\s\d{2}\:\d{2}\:\d{2}"

This assumes that the months are always 3 chars long, and that there is nothing that looks like a date that is not a date.

Your example explicitly does 2 different matches, so if that is how you do things then you could create a new regex for each of text and text2 and do multiple replacements. Or you could try combining the regexes like (untested):

@"\d{4}\-\d{2}\-\d{2}\s\d{2}\:\d{2}\:\d{2}|\d{2}\-[a-zA-Z]{3}\-\d{4}\s\d{2}\:\d{2}\:\d{2}"
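Putting both fixes together might look like the sketch below. It is written under several assumptions: both input formats should be rewritten as ="dd-MMM-yyyy HH:mm:ss" (the format named in the question), month names are English three-letter abbreviations, and a match that fails to parse is left untouched. DateQuoter and QuoteDates are hypothetical names. DateTime.TryParseExact with explicit formats is used in place of DateTime.Parse so the result does not depend on the current culture.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DateQuoter
{
    // Either yyyy-MM-dd HH:mm:ss (text2) or dd-MMM-yyyy HH:mm:ss (text).
    private static readonly Regex DatePattern = new Regex(
        @"\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}|\d{2}-[a-zA-Z]{3}-\d{4}\s\d{2}:\d{2}:\d{2}");

    private static readonly string[] InputFormats =
    {
        "yyyy-MM-dd HH:mm:ss",
        "dd-MMM-yyyy HH:mm:ss"
    };

    // Replaces every recognised date-time with ="dd-MMM-yyyy HH:mm:ss".
    public static string QuoteDates(string input)
    {
        return DatePattern.Replace(input, m =>
        {
            DateTime parsed;
            bool ok = DateTime.TryParseExact(
                m.Value, InputFormats, CultureInfo.InvariantCulture,
                DateTimeStyles.None, out parsed);

            // Looks like a date but is not one (e.g. 99-Abc-2020): leave it unchanged.
            if (!ok) return m.Value;

            string formatted = parsed.ToString("dd-MMM-yyyy HH:mm:ss", CultureInfo.InvariantCulture);
            return "=\"" + formatted + "\"";
        });
    }

    public static void Main()
    {
        Console.WriteLine(QuoteDates("<td>29-Feb-2020 15:27:58</td>"));
        Console.WriteLine(QuoteDates(" dasd arew 2017-03-11 12:25:56 das "));
    }
}
```

If each input should keep its own date format instead, the evaluator could return "=\"" + m.Value + "\"" and skip parsing altogether.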


 