Parse text into key/value pairs or JSON
I have text in the following format, and I was wondering what the best approach might be to create a user object from it, with the fields as its properties.
I don't know regular expressions that well. I was looking at the string methods in C#, particularly IndexOf and LastIndexOf, but I think that would be too messy, as there are approximately 15 fields.
I am trying to do this in C#.
Some characteristics:
Title: Mr
Company: abc capital
Address1: 42 mystery lane
Zip: 112312
Country: Ireland
Interest: Biking, Swimming, Hiking,
Topic of Interest: Europe, Asia, Capital
This will split the data up into key/value pairs and store them in a dictionary. You may have to modify it further to meet more requirements.
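// Requires: using System; using System.Linq;
var dictionary = data
    // Accept both Windows ("\r\n") and Unix ("\n") line endings.
    .Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries)
    // Split on the first colon only, so a value may itself contain ':'.
    .Select(x => x.Split(new[] { ':' }, 2))
    // Skip lines that have no "key: value" shape.
    .Where(parts => parts.Length == 2)
    .ToDictionary(
        k => k[0].Trim(),
        v => v[1].Trim());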
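To get from that dictionary to the user object the question asks for, one option is a small mapping step. The following is only a sketch under assumptions: the User class, its property names, and UserMapper.FromText are hypothetical names chosen for illustration, and only four of the roughly 15 fields are shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical target type; property names are assumptions based on the sample data.
public class User
{
    public string Title { get; set; } = "";
    public string Company { get; set; } = "";
    public string Country { get; set; } = "";
    public List<string> Interests { get; set; } = new List<string>();
}

public static class UserMapper
{
    public static User FromText(string data)
    {
        // One "Key: value" pair per line, split on the first colon only.
        // Note: ToDictionary throws if the same key appears twice.
        var fields = data
            .Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Split(new[] { ':' }, 2))
            .Where(parts => parts.Length == 2)
            .ToDictionary(
                parts => parts[0].Trim(),
                parts => parts[1].Trim(),
                StringComparer.OrdinalIgnoreCase);

        // Missing keys become empty strings instead of throwing.
        string Get(string key) => fields.TryGetValue(key, out var value) ? value : "";

        return new User
        {
            Title = Get("Title"),
            Company = Get("Company"),
            Country = Get("Country"),
            // Comma-separated field; the trailing comma in the sample yields no empty item.
            Interests = Get("Interest")
                .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(x => x.Trim())
                .Where(x => x.Length > 0)
                .ToList()
        };
    }
}
```

Calling UserMapper.FromText(data) with the sample text would then give an object whose Title, Company and Country are plain strings and whose Interests is a list. The remaining fields could be added the same way.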
I'd probably go with something like this:
private Dictionary<string, IEnumerable<string>> ParseValues(string providedValues)
{
    var parsedValues = new Dictionary<string, IEnumerable<string>>();

    // Splitting on both '\r' and '\n' covers "\r\n", "\n" and "\r" line endings.
    string[] lines = providedValues.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

    foreach (string line in lines)
    {
        // Split on the first colon only, so a value may itself contain ':'.
        string[] lineSplit = line.Split(new[] { ':' }, 2);
        if (lineSplit.Length < 2) continue; // skip lines without a "key: value" shape

        string key = lineSplit[0].Trim();

        // Removing empty entries avoids an empty item for the "Interest" line,
        // where 'Hiking' is followed by a comma and then nothing else.
        IEnumerable<string> values = lineSplit[1]
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(x => x.Trim());

        parsedValues.Add(key, values);
    }

    return parsedValues;
}
Or, if you subscribe to the notion that readability and maintainability are not as cool as a great big chain of calls:
private static Dictionary<string, IEnumerable<string>> ParseValues(string providedValues)
{
    return providedValues
        .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
        .Select(line => line.Split(new[] { ':' }, 2))
        .ToDictionary(parts => parts[0].Trim(),
            parts => parts[1].Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries).Select(x => x.Trim()));
}
I strongly recommend getting more familiar with regular expressions for cases like this. Parsing semi-structured text with regular expressions is easy and logical.
For example, this pattern (it and the ones that follow are just variants; there are many ways to do it, depending on what you need):
title:\s*(.*)\s+comp.*?:\s*(.*)\s+addr.*?:\s*(.*)\s+zip:\s*(.*)\s+country:\s*(.*)\s+inter.*?:\s*(.*)\s+topic.*?:\s*(.*)
gives this result (with case-insensitive matching enabled, since the pattern is written in lowercase):
1. Mr
2. abc capital
3. 42 mystery lane
4. 112312
5. Ireland
6. Biking, Swimming, Hiking,
7. Europe, Asia, Capital
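In C#, the fixed-order pattern above might be applied roughly as follows. This is a sketch, not a tested recipe from the answer: the FixedOrderRegex class and its Parse method are hypothetical names, RegexOptions.IgnoreCase is assumed because the pattern is lowercase while the data is capitalized, and each captured group is trimmed because '.' in .NET also matches a trailing '\r'.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FixedOrderRegex
{
    // The pattern from the answer, unchanged; it expects the fields in exactly this order.
    private const string Pattern =
        @"title:\s*(.*)\s+comp.*?:\s*(.*)\s+addr.*?:\s*(.*)\s+zip:\s*(.*)\s+country:\s*(.*)\s+inter.*?:\s*(.*)\s+topic.*?:\s*(.*)";

    // Returns the seven captured values in order, or an empty array if the text does not match.
    public static string[] Parse(string text)
    {
        Match m = Regex.Match(text, Pattern, RegexOptions.IgnoreCase);
        if (!m.Success) return new string[0];

        // Group 0 is the whole match, so the captures start at index 1.
        var values = new string[m.Groups.Count - 1];
        for (int i = 1; i < m.Groups.Count; i++)
        {
            values[i - 1] = m.Groups[i].Value.Trim();
        }
        return values;
    }
}
```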
Or, more open to anything:
\s(.*?):\s(.*)
parses your input into nice groups like this:
Match 1
1. Title
2. Mr
Match 2
1. Company
2. abc capital
Match 3
1. Address1
2. 42 mystery lane
Match 4
1. Zip
2. 112312
Match 5
1. Country
2. Ireland
Match 6
1. Interest
2. Biking, Swimming, Hiking,
Match 7
1. Topic of Interest
2. Europe, Asia, Capital
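A C# take on the generic key/value pattern could look like the sketch below. It deliberately uses a slightly different pattern from the one above: the original begins with \s, so it needs whitespace before each key and would skip a first line that starts at the very beginning of the input. The variant here anchors on line starts with RegexOptions.Multiline instead. The KeyValueRegex class and its Parse method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeyValueRegex
{
    // Group 1: the key (everything before the first colon on a line).
    // Group 2: the rest of the line, excluding '\r' and '\n'.
    private static readonly Regex Pair = new Regex(
        @"^[ \t]*([^:\r\n]+?)[ \t]*:[ \t]*([^\r\n]*)",
        RegexOptions.Multiline);

    public static Dictionary<string, string> Parse(string text)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Pair.Matches(text))
        {
            // Indexer assignment: a repeated key overwrites the earlier value.
            result[m.Groups[1].Value] = m.Groups[2].Value.Trim();
        }
        return result;
    }
}
```

Multi-valued fields such as Interest still come back as one comma-separated string here; splitting them on ',' afterwards works the same way as in the dictionary-based answers.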
I am not familiar with C# (and its regex dialect); I just wanted to spark your interest ...