
Need some help on a Regex match / replace pattern

My ultimate goal here is to turn the following string into JSON, but I would settle for something that gets me one step closer by combining the field name with each of its values.

Sample data:

Field1:abc;def;Field2:asd;fgh;

Using Regex.Replace(), I need it to at least look like this:

Field1:abc,Field1:def,Field2:asd,Field2:fgh

Ultimately, this result would be awesome if it can be done via Regex in a single call:

{"Field1":"abc","Field2":"asd"},{"Field1":"def","Field2":"fgh"}

I've tried many different variations of this pattern, but can't seem to get it right:

(?:(\w+):)*?(?:([^:;]+);)
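To see where this pattern falls short, it helps to print what each match captures. The sketch below assumes C# 9 or later (top-level statements); the variable names are illustrative only. Each match consumes a single value, and group 1 only participates when a `Name:` prefix sits immediately before that value, so every value after the first in a run loses its field name.

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern from the question, applied to the sample data.
string input = "Field1:abc;def;Field2:asd;fgh;";
MatchCollection attempts = Regex.Matches(input, @"(?:(\w+):)*?(?:([^:;]+);)");

foreach (Match m in attempts)
{
    // Group 1 is empty whenever the value is not directly preceded by "Name:".
    Console.WriteLine($"name='{m.Groups[1].Value}' value='{m.Groups[2].Value}'");
}
// Prints:
// name='Field1' value='abc'
// name='' value='def'
// name='Field2' value='asd'
// name='' value='fgh'
```

Matching one field name followed by a repeated value group, and then reading the group's `Captures` collection, keeps the name available for every value in the run.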

I could find only one other example that does something similar, but there are just enough differences that I can't quite put my finger on it:

Regex to repeat a capture across a CDL?

EDIT:

Here's my solution. I'm not going to post it as a "Solution" because I want to give credit to one that was posted by others. In the end, I took a piece from each of the posted solutions and came up with this one. Thanks to everyone who posted. I gave credit to the solution that compiled, executed fastest and had the most accurate results.

    string hbi = "Field1:aaa;bbb;ccc;ddd;Field2:111;222;333;444;";

    Regex re = new Regex(@"(\w+):(?:([^:;]+);)+");
    MatchCollection matches = re.Matches(hbi);

    SortedDictionary<string, string> dict = new SortedDictionary<string, string>();

    for (int x = 0; x < matches.Count; x++)
    {
        Match match = matches[x];
        string property = match.Groups[1].Value;

        for (int i = 0; i < match.Groups[2].Captures.Count; i++)
        {
            string key = i.ToString() + x.ToString();
            dict.Add(key, string.Format("\"{0}\":\"{1}\"", property, match.Groups[2].Captures[i].Value));
        }
    }
    Console.WriteLine(string.Join(",", dict.Values));
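The same pattern can also produce the intermediate `Field1:abc,Field1:def,...` form from the question in one `Regex.Replace` call, if a `MatchEvaluator` is allowed instead of a plain replacement string. This is a minimal sketch under that assumption, written with C# 9 top-level statements; the names `sample` and `expanded` are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string sample = "Field1:abc;def;Field2:asd;fgh;";

// One Regex.Replace call; the evaluator runs once per "Name:v1;v2;...;" run.
string expanded = Regex.Replace(
    sample,
    @"(\w+):(?:([^:;]+);)+",
    m => string.Concat(
        m.Groups[2].Captures
            .Cast<Capture>()
            .Select(c => m.Groups[1].Value + ":" + c.Value + ",")));

// Every pair was written with a trailing comma, so drop the final one.
expanded = expanded.TrimEnd(',');

Console.WriteLine(expanded); // Field1:abc,Field1:def,Field2:asd,Field2:fgh
```

A plain replacement string cannot do this because substitutions such as `$2` only expose the last capture of a repeated group, whereas the evaluator can walk the full `Captures` collection.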

Now you have two problems

I don't think regular expressions are the best way to handle this. You should probably start by splitting on semicolons, then loop through the results looking for a value that starts with "Field1:" or "Field2:" and collect the results into a Dictionary.

Treat this as pseudo code because I have not compiled or tested it:

// Requires: using System; using System.Collections.Generic;
string[] data = input.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
var map = new Dictionary<string, List<string>>();

string currentKey = null;
foreach (string item in data)
{
    string value = item; // the foreach variable itself cannot be reassigned
    // This part should change depending on how the fields are defined.
    // If it's a fixed set you could have an array of fields to search,
    // or you might need to use a regular expression.
    if (value.StartsWith("Field1:") || value.StartsWith("Field2:"))
    {
        currentKey = value.Substring(0, value.IndexOf(':'));
        value = value.Substring(currentKey.Length + 1);
        map[currentKey] = new List<string>();
    }
    // Append rather than assign, so earlier values of the same field are kept.
    map[currentKey].Add(value);
}
// convert map to json

I would go with RegEx as the simplest and most straightforward way to parse the strings, but I'm sorry, pal, I couldn't come up with a clever-enough replacement string to do this in one shot.

I hacked it out for fun though, and the monstrosity below accomplishes what you need, albeit hideously. :-/

        Regex r = new Regex(@"(?<FieldName>\w+):(?:(?<Value>[^:;]+);)+"); // delimiters kept outside the groups so names and values are captured without ':' or ';'

        var matches = r.Matches("Field1:abc;def;Field2:asd;fgh;moo;"); // Modified to test "uneven" data as well.

        var tuples = new[] { new { FieldName = "", Value = "", Index = 0 } }.ToList(); tuples.Clear();

        foreach (Match match in matches)
        {
            var matchGroups = match.Groups;
            var fieldName = matchGroups[1].Captures[0].Value;
            int index = 0;
            foreach (Capture cap in matchGroups[2].Captures)
            {
                var tuple = new { FieldName = fieldName, Value = cap.Value, Index = index };
                tuples.Add(tuple);
                index++;
            }

        }

        var maxIndex = tuples.Max(tup => tup.Index);

        var jsonItemList = new List<string>();

        for (int a = 0; a < maxIndex+1; a++)
        {
            var jsonBuilder = new StringBuilder();
            jsonBuilder.Append("{");

            foreach (var tuple in tuples.Where(tup => tup.Index == a))
            {
                jsonBuilder.Append(string.Format("\"{0}\":\"{1}\",", tuple.FieldName, tuple.Value));
            }
            jsonBuilder.Remove(jsonBuilder.Length - 1, 1); // trim last comma.
            jsonBuilder.Append("}");
            jsonItemList.Add(jsonBuilder.ToString());
        }

        foreach (var item in jsonItemList)
        {
            // Write your items to your document stream.
        }

I had an idea that it should be possible to do this in a shorter and clearer way. It ended up not being all that much shorter, and you can question whether it's any clearer. At least it's another way to solve the problem.

var str = "Field1:abc;def;Field2:asd;fgh";
var rows = new List<Dictionary<string, string>>();
int index = 0;
string value;
string fieldname = "";

foreach (var s in str.Split(';'))
{
    if (s.Contains(":"))
    {
        index = 0;
        var tmp = s.Split(':');
        fieldname = tmp[0];
        value = tmp[1];
    }
    else
    {
        value = s;
        index++;
    }

    if (rows.Count < (index + 1))
        rows.Insert(index, new Dictionary<string, string>());

    rows[index][fieldname] = value;
}

var arr = rows.Select(dict => 
                   String.Join("," , dict.Select(kv => 
                       String.Format("\"{0}\":\"{1}\"", kv.Key, kv.Value))))
                   .Select(r => "{" + r + "}");
var json = String.Join(",", arr );
Debug.WriteLine(json);

Outputs:

{"Field1":"abc","Field2":"asd"},{"Field1":"def","Field2":"fgh"}
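The answers above assemble the JSON text with `String.Format`, which does not escape quotes or backslashes inside values. As a hedged alternative, the sketch below groups the values by position and leaves the quoting to a serializer. It assumes `System.Text.Json` is available (.NET Core 3.0 or later) and uses C# 9 top-level statements; the names `source`, `records` and `serialized` are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.RegularExpressions;

string source = "Field1:abc;def;Field2:asd;fgh;";
var records = new List<Dictionary<string, string>>();

foreach (Match m in Regex.Matches(source, @"(\w+):(?:([^:;]+);)+"))
{
    CaptureCollection values = m.Groups[2].Captures;
    for (int i = 0; i < values.Count; i++)
    {
        // The i-th value of every field goes into the i-th record.
        if (records.Count <= i)
            records.Add(new Dictionary<string, string>());
        records[i][m.Groups[1].Value] = values[i].Value;
    }
}

// The serializer takes care of quoting and escaping keys and values.
string serialized = JsonSerializer.Serialize(records);
Console.WriteLine(serialized);
// [{"Field1":"abc","Field2":"asd"},{"Field1":"def","Field2":"fgh"}]
```

Note that serializing the list yields a JSON array, so the objects come wrapped in `[` and `]` rather than as the bare comma-separated sequence shown in the question.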
