How to parse this string in the best way
I have several strings that look like this one:
\r\n\t\StaticWord1:\r\n\t\t2014-05-20 11:03\r\n\t\StaticWord2\r\n\t\t\r\n\t\t\r\n\t\t\t\r\n\t\t\t\r\n\t\t\t\t\t\t\t\t\tWordC WordD\r\n\t\t\t\t\t\t\t\t\r\n\t\t\r\n\t\t\r\n\t\t\r\n\t
I would like to get the date (2014-05-20 11:03 in my example, but it will vary), WordC and WordD. (Both C and D can be any sequence of letters.)
How would I parse this as efficiently as possible? I was thinking about using the String.Replace method, but I think a regex would be better? (C#)
Use this capture string:
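// Requires: using System; using System.Globalization; using System.Text.RegularExpressions;
// The pattern has no ^ or $ anchors, so RegexOptions.Multiline is not needed.
Match match = Regex.Match(input, @"(\d\d\d\d-\d\d-\d\d \d\d:\d\d)");
if (match.Success)
{
    string key = match.Groups[1].Value;
    DateTime date = DateTime.ParseExact(key, "yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture); // Your result is here
}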
I don't know if it's the best way, but you can use Split as in this MSDN example: http://msdn.microsoft.com/en-us/library/ms228388.aspx
Following that example, you can create an array of delimiter characters, split your string on \t, \n and \r, and then loop over the result to get all your words:
class TestStringSplit
{
    static void Main()
    {
        char[] delimiterChars = { '\r', '\n', '\t' };
        // "\S" is not a valid C# escape sequence and does not compile, so the stray
        // backslash before each StaticWord is dropped here (an assumption about the data).
        string text = "\r\n\tStaticWord1:\r\n\t\t2014-05-20 11:03\r\n\tStaticWord2\r\n\t\t\r\n\t\t\r\n\t\t\t\r\n\t\t\t\r\n\t\t\t\t\t\t\t\t\tWordC WordD\r\n\t\t\t\t\t\t\t\t\r\n\t\t\r\n\t\t\r\n\t\t\r\n\t";
        System.Console.WriteLine("Original text: '{0}'", text);
        // RemoveEmptyEntries skips the empty strings produced between consecutive delimiters.
        string[] words = text.Split(delimiterChars, System.StringSplitOptions.RemoveEmptyEntries);
        System.Console.WriteLine("{0} words in text:", words.Length);
        foreach (string s in words)
        {
            System.Console.WriteLine(s);
        }
        // Keep the console window open in debug mode.
        System.Console.WriteLine("Press any key to exit.");
        System.Console.ReadKey();
    }
}
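As a rough sketch of how the split result could be turned into the three values: this assumes the string contains real CR/LF/tab characters, that the date is the first token matching yyyy-MM-dd HH:mm, and that the two words sit in the last non-empty token separated by a space. The names SplitParser and TryParse are made up for illustration.

```csharp
using System;
using System.Globalization;

static class SplitParser
{
    // Hypothetical helper: returns true and fills the out values when the text
    // has the assumed shape (a date token, then a final "WordC WordD" token).
    public static bool TryParse(string text, out DateTime date, out string wordC, out string wordD)
    {
        date = default(DateTime);
        wordC = string.Empty;
        wordD = string.Empty;

        // Split on the control characters and drop the empty entries between them.
        string[] tokens = text.Split(new[] { '\r', '\n', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length == 0) return false;

        // The date is taken to be the first token that parses with the exact format.
        bool haveDate = false;
        foreach (string token in tokens)
        {
            if (DateTime.TryParseExact(token.Trim(), "yyyy-MM-dd HH:mm",
                    CultureInfo.InvariantCulture, DateTimeStyles.None, out date))
            {
                haveDate = true;
                break;
            }
        }
        if (!haveDate) return false;

        // Assumption: the two words are in the last non-empty token, separated by a space.
        string[] words = tokens[tokens.Length - 1].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length != 2) return false;

        wordC = words[0];
        wordD = words[1];
        return true;
    }
}
```

A call would look like SplitParser.TryParse(text, out date, out wordC, out wordD). Returning false instead of throwing keeps malformed strings cheap to skip when many of them are processed in a loop.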
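// Note: input is a verbatim string (@"..."), so \r, \n and \t in it are literal
// backslash-letter pairs, not control characters. The [\\r\\n\\t]* class in the
// pattern matches those literal characters (backslash, r, n, t).
string input = @"\r\n\t\StaticWord1:\r\n\t\t2014-05-20 11:03\r\n\t\StaticWord2\r\n\t\t\r\n\t\t\r\n\t\t\t\r\n\t\t\t\r\n\t\t\t\t\t\t\t\t\tWordC WordD\r\n\t\t\t\t\t\t\t\t\r\n\t\t\r\n\t\t\r\n\t\t\r\n\t";
string pattern = @"(\d{4}\-\d{2}\-\d{2}\s\d{2}:\d{2})(?:[\\r\\n\\t]*StaticWord2[\\r\\n\\t]*)(\w+)\s(\w+)";
Match match = Regex.Match(input, pattern);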
Then to get the values:
string dateTime = match.Groups[1].Value; // date-time
string wordC = match.Groups[2].Value; // WordC
string wordD = match.Groups[3].Value; // WordD
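If the strings contain real carriage returns, line feeds and tabs rather than literal backslash sequences, a variant along these lines may fit better. It is only a sketch: it assumes StaticWord2 is always present between the date and the two words, and that the words consist of Unicode letters. RegexParser and TryParse are illustrative names, not part of any library.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class RegexParser
{
    // \s* skips any run of real whitespace (CR, LF, tab, space) around StaticWord2.
    // \p{L}+ matches one or more letters, per the "any sequence of letters" requirement.
    private static readonly Regex Pattern = new Regex(
        @"(\d{4}-\d{2}-\d{2} \d{2}:\d{2})\s*StaticWord2\s*(\p{L}+) (\p{L}+)",
        RegexOptions.CultureInvariant);

    // Hypothetical helper: returns false instead of throwing when the input does not match.
    public static bool TryParse(string input, out DateTime date, out string wordC, out string wordD)
    {
        date = default(DateTime);
        wordC = string.Empty;
        wordD = string.Empty;

        Match match = Pattern.Match(input);
        if (!match.Success) return false;

        // Validate the captured text as an actual date, not just digits in the right places.
        if (!DateTime.TryParseExact(match.Groups[1].Value, "yyyy-MM-dd HH:mm",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out date))
        {
            return false;
        }

        wordC = match.Groups[2].Value;
        wordD = match.Groups[3].Value;
        return true;
    }
}
```

The Regex instance is stored in a static field so the pattern is constructed once and reused across calls, which matters when many strings are parsed.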