Extracting data from a plain text string
I am trying to process a report from a system, which gives me the following code:
000=[GEN] OK {Q=1 M=1 B=002 I=3e5e65656-e5dd-45678-b785-a05656569e}
I need to extract the values between the curly brackets {} and save them into variables. I assume I will need to do this using regex or similar? I really have no idea where to start! I'm using C# with ASP.NET 4.
I need the following variables:
param1 = 000
param2 = GEN
param3 = OK
param4 = 1 //Q
param5 = 1 //M
param6 = 002 //B
param7 = 3e5e65656-e5dd-45678-b785-a05656569e //I
I will name the params based on what they actually mean. Can anyone please help me here? I have tried splitting on spaces, but I get the surrounding garbage along with the values!
Thanks for any pointers/help!
If the format is pretty constant, you can use the .NET string processing methods to pull out the values, something along the lines of:
string line =
    "000=[GEN] OK {Q=1 M=1 B=002 I=3e5e65656-e5dd-45678-b785-a05656569e}";
int start = line.IndexOf('{');
int end = line.IndexOf('}');
// Take only the text strictly between the braces, excluding '}' itself.
string variablePart = line.Substring(start + 1, end - start - 1);
string[] variables = variablePart.Split(' ');
foreach (string variable in variables)
{
    string[] parts = variable.Split('=');
    // parts[0] holds the variable name, parts[1] holds the value
}
I wrote this off the top of my head, so there may be an off-by-one error somewhere. It would also be advisable to add error checking, e.g. to make sure the input string contains both a { and a }.
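The error checking suggested above could look roughly like the following. This is a minimal sketch rather than the answerer's code: the names ReportParser and ExtractBraceValues, the use of FormatException, and the choice of a Dictionary as the return type are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ReportParser
{
    // Hypothetical helper: returns the name/value pairs found between '{' and '}'.
    public static Dictionary<string, string> ExtractBraceValues(string line)
    {
        if (line == null) throw new ArgumentNullException("line");

        int start = line.IndexOf('{');
        if (start < 0) throw new FormatException("Missing '{'.");

        // Search for the closing brace only after the opening one.
        int end = line.IndexOf('}', start);
        if (end < 0) throw new FormatException("Missing '}'.");

        // Text strictly between the braces.
        string variablePart = line.Substring(start + 1, end - start - 1);

        var values = new Dictionary<string, string>();
        string[] variables = variablePart.Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string variable in variables)
        {
            // Split on the first '=' only, so a value may itself contain '='.
            string[] parts = variable.Split(new[] { '=' }, 2);
            if (parts.Length != 2)
                throw new FormatException("Expected name=value but found: " + variable);
            values[parts[0]] = parts[1];
        }
        return values;
    }
}
```

With the sample line from the question, the returned dictionary would hold the keys Q, M, B and I, so the values can be read by name instead of by position.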
Use a regular expression.
Quick and dirty attempt:
(?<ID1>[0-9]*)=\[(?<GEN>[a-zA-Z]*)\] OK {Q=(?<Q>[0-9]*) M=(?<M>[0-9]*) B=(?<B>[0-9]*) I=(?<I>[a-zA-Z0-9\-]*)}
This will generate named groups called ID1, GEN, Q, M, B and I.
Check out the MSDN docs for details on using regular expressions in C#.
You can use Regex Hero for quick C# regex testing.
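To show how the named groups might be read back in C#, here is a small sketch. The class name NamedGroupExample and the method ReadGroup are hypothetical, and the literal braces are escaped here for clarity; the rest of the pattern is the one given above. Note that the pattern hard-codes the status text OK, so a line with any other status will not match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupExample
{
    // Pattern from the answer above, with '{' and '}' escaped.
    private static readonly Regex ReportPattern = new Regex(
        @"(?<ID1>[0-9]*)=\[(?<GEN>[a-zA-Z]*)\] OK \{Q=(?<Q>[0-9]*) M=(?<M>[0-9]*) B=(?<B>[0-9]*) I=(?<I>[a-zA-Z0-9\-]*)\}");

    // Hypothetical helper: returns the text captured by the named group,
    // or null when the line does not match the pattern at all.
    public static string ReadGroup(string line, string groupName)
    {
        Match match = ReportPattern.Match(line);
        return match.Success ? match.Groups[groupName].Value : null;
    }
}
```

For the sample line, ReadGroup(line, "B") would give "002" and ReadGroup(line, "GEN") would give "GEN". In real code it would be cheaper to call Match once and read all the groups from the same Match object.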
I would suggest a regular expression for this type of work.
var objRegex = new System.Text.RegularExpressions.Regex(@"^(\d+)=\[([A-Z]+)\] ([A-Z]+) \{Q=(\d+) M=(\d+) B=(\d+) I=([a-z0-9\-]+)\}$");
var objMatch = objRegex.Match("000=[GEN] OK {Q=1 M=1 B=002 I=3e5e65656-e5dd-45678-b785-a05656569e}");
if (objMatch.Success)
{
    Console.WriteLine(objMatch.Groups[1].ToString());
    Console.WriteLine(objMatch.Groups[2].ToString());
    Console.WriteLine(objMatch.Groups[3].ToString());
    Console.WriteLine(objMatch.Groups[4].ToString());
    Console.WriteLine(objMatch.Groups[5].ToString());
    Console.WriteLine(objMatch.Groups[6].ToString());
    Console.WriteLine(objMatch.Groups[7].ToString());
}
I've just tested this out and it works well for me.
You can use String.Split:
string[] parts = s.Split(new string[] {"=[", "] ", " {Q=", " M=", " B=", " I=", "}"},
StringSplitOptions.None);
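As a sketch of how the result of that split might be consumed: because the input ends with the "}" separator and StringSplitOptions.None is used, the array for a well-formed line has eight elements, the last of which is empty. The wrapper below (SplitExample and SplitReport are hypothetical names, and the length check is an added assumption about what counts as well-formed) validates that shape and returns only the seven values.

```csharp
using System;

public static class SplitExample
{
    // Same separators as in the answer above.
    private static readonly string[] Separators =
        { "=[", "] ", " {Q=", " M=", " B=", " I=", "}" };

    // Hypothetical wrapper: returns the seven values in the order
    // id, group, status, Q, M, B, I.
    public static string[] SplitReport(string s)
    {
        string[] parts = s.Split(Separators, StringSplitOptions.None);

        // A well-formed line yields 7 values plus one trailing empty
        // string produced by the final "}" separator.
        if (parts.Length != 8 || parts[7].Length != 0)
            throw new FormatException("Unexpected report format: " + s);

        string[] values = new string[7];
        Array.Copy(parts, values, 7);
        return values;
    }
}
```

The values are positional, so values[0] corresponds to param1 in the question and values[6] to param7. This approach relies on the fields always appearing in the same order.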
This solution breaks your report code up into segments and stores the desired values in an array.
The regular expression matches one report code segment at a time and stores the appropriate values in the parsed report code array.
As your example implies, the first two code segments are treated differently from the ones after them: their names are kept as well as their values. I have assumed that it is always the first two segments that are processed differently.
// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
private static string[] ParseReportCode(string reportCode) {
    const int FIRST_VALUE_ONLY_SEGMENT = 3;
    const int GRP_SEGMENT_NAME = 1;
    const int GRP_SEGMENT_VALUE = 2;
    Regex reportCodeSegmentPattern = new Regex(@"\s*([^\}\{=\s]+)(?:=\[?([^\s\]\}]+)\]?)?");
    Match matchReportCodeSegment = reportCodeSegmentPattern.Match(reportCode);
    List<string> parsedCodeSegmentElements = new List<string>();
    int segmentCount = 0;
    while (matchReportCodeSegment.Success) {
        // For the first two segments, keep the name as well as the value.
        if (++segmentCount < FIRST_VALUE_ONLY_SEGMENT) {
            string segmentName = matchReportCodeSegment.Groups[GRP_SEGMENT_NAME].Value;
            parsedCodeSegmentElements.Add(segmentName);
        }
        string segmentValue = matchReportCodeSegment.Groups[GRP_SEGMENT_VALUE].Value;
        if (segmentValue.Length > 0) parsedCodeSegmentElements.Add(segmentValue);
        matchReportCodeSegment = matchReportCodeSegment.NextMatch();
    }
    return parsedCodeSegmentElements.ToArray();
}