Using regex to find date string in file
I need to find a specific date string in a text file. There are currently two date strings in the file: "Due Date: 01/26/2016" and "Date: 01/252016". I need to find the second one, but my current code only finds the first. I am guessing regex would be a better approach, but I am not sure how to code it.
Current code:
searchString = "Date:";
if (fileContents.IndexOf(searchString) > 0)
{
    string tmp = fileContents.Substring(fileContents.IndexOf(searchString) + searchString.Length).Trim();
    string loan_date = tmp.Substring(0, tmp.IndexOf('\r')).Trim();
    if (loan_date.Count(x => x == '/') == 1)
    {
        StringBuilder sb = new StringBuilder(loan_date);
        sb[sb.Length - 4] = '/';
        loan_date = sb.ToString();
    }
    DateTime dt = DateTime.ParseExact(loan_date, "M/d/yyyy", System.Globalization.CultureInfo.InvariantCulture);
    return dt;
}
In C#, you can find matches to a regex by doing something like the following.
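using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        // Optional leading 0 or 1, a digit, "/", two digits, "/", four digits.
        string pattern = "[0-1]?[0-9]/[0-9]{2}/[0-9]{4}";
        string input = "Due Date: 01/26/2016 Date: 01/25/2016";
        // The loop variable is typed as Match: on older .NET Framework versions
        // MatchCollection enumerates as object, so "var" would not expose Value or Index.
        foreach (Match m in Regex.Matches(input, pattern))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}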
That regex means: an optional 0 or 1, followed by a digit, then a slash, two digits, another slash, and four digits.
I'm also assuming your second date, 01/252016, contains a typo.
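If the goal is only the plain "Date:" entry and not "Due Date:", one option is a negative lookbehind in front of the label. The following is a minimal sketch rather than a definitive implementation: the names LoanDateFinder and FindLoanDate are hypothetical, and it assumes the second slash may really be missing in the file, as in the question's sample, so that slash is treated as optional.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class LoanDateFinder
{
    // (?<!Due\s) rejects "Date:" when it is immediately preceded by "Due ".
    // The second slash is optional so that "01/252016" is still captured (assumption).
    private static readonly Regex LoanDatePattern = new Regex(
        @"(?<!Due\s)Date:\s*([0-9]{1,2})/([0-9]{1,2})/?([0-9]{4})");

    // Returns the parsed date, or null when no usable "Date:" entry is found.
    public static DateTime? FindLoanDate(string fileContents)
    {
        Match m = LoanDatePattern.Match(fileContents);
        if (!m.Success) return null;

        // Rebuild the date with both slashes before parsing.
        string normalized = m.Groups[1].Value + "/" + m.Groups[2].Value + "/" + m.Groups[3].Value;
        DateTime parsed;
        bool ok = DateTime.TryParseExact(normalized, "M/d/yyyy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
        return ok ? parsed : (DateTime?)null;
    }

    public static void Main()
    {
        string text = "Due Date: 01/26/2016\r\nDate: 01/252016\r\n";
        Console.WriteLine(FindLoanDate(text));
    }
}
```

Returning a nullable DateTime lets the caller tell "no date found" apart from a parsed value without catching exceptions.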
Try this regex:
(Due\s)?(Date:)\s(0[1-9]|1[0-2])\/([0-3][0-9])\/([0-2][0-9]{3})
Since both strings include "Date", we can use that to filter out other strings (you might not actually want every date in the file). Since "Due" is optional, we can mark it as such. It's a little tough to filter out poorly formatted dates with a regex alone, but you can limit a few things (as I have above). You will have to validate the date separately just to be sure.
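As a rough illustration of validating the date separately, the sketch below hands the matched text to DateTime.TryParseExact, which rejects values that have the right shape but are not real calendar dates. The names DateValidator and IsRealDate are hypothetical, and the "MM/dd/yyyy" format is an assumption based on the sample dates in the question.

```csharp
using System;
using System.Globalization;

public static class DateValidator
{
    // True only when the text is a real calendar date in MM/dd/yyyy form.
    public static bool IsRealDate(string candidate)
    {
        DateTime ignored;
        return DateTime.TryParseExact(candidate, "MM/dd/yyyy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out ignored);
    }

    public static void Main()
    {
        Console.WriteLine(IsRealDate("01/25/2016")); // True
        Console.WriteLine(IsRealDate("02/30/2016")); // False: shaped like a date, but February has no 30th
    }
}
```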
Here is a regex that skips those range checks and only requires that the date is formatted correctly:
(Due\s)?(Date:)\s([0-9]{2})\/([0-9]{2})\/([0-9]{4})
Or just the dates:
([0-9]{2})\/([0-9]{2})\/([0-9]{4})
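Because the labelled pattern puts "Due " in its own optional group, the capture groups can also be used to tell the two entries apart. This is a sketch under assumptions, not the answer's own code: GroupReader and PlainDates are hypothetical names, and the input is the corrected sample string with both slashes present.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupReader
{
    // Returns the MM/dd/yyyy text of every "Date:" entry that is not prefixed by "Due ".
    public static List<string> PlainDates(string input)
    {
        const string pattern = @"(Due\s)?(Date:)\s([0-9]{2})\/([0-9]{2})\/([0-9]{4})";
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            // Group 1 is the optional "Due " prefix; skip matches that have it.
            if (m.Groups[1].Success) continue;
            // Groups 3, 4 and 5 hold the month, day and year.
            results.Add(m.Groups[3].Value + "/" + m.Groups[4].Value + "/" + m.Groups[5].Value);
        }
        return results;
    }

    public static void Main()
    {
        string input = "Due Date: 01/26/2016 Date: 01/25/2016";
        foreach (string d in PlainDates(input))
        {
            Console.WriteLine(d); // prints 01/25/2016
        }
    }
}
```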