
String pattern search and replace in C#

I have a text file that contains many strings like the one shown below. I need to search for this pattern and replace the source and column codes with their values. How can I do this string pattern search and replace in C#? Thanks.

Actual text: "anytext[Source1].[anytext:Column1:anytext]anytext"

Updated text: "anytext[ABC].[anytext:Col1:anytext]anytext"

The code and value combinations look like this:

SourceCode  ColumnCode  SourceValue  ColumnValue
==========  ==========  ===========  ===========
Source1     Column1     ABC          Col1
Source2     Column2     DEF          Col2
Source3     Column3     GHI          Col3

I used two separate dictionaries, one for the source codes and one for the column codes, because I assumed each source code maps only to a source value and each column code only to a column value. The sample code is written for a button that replaces the text of a label when it is clicked, but it can be adapted to any similar situation. So far this is what I came up with:

using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;
using System.Windows.Forms;

namespace RegexTest
{

public partial class Form1 : Form
{
    Dictionary<string, string> values = new Dictionary<string, string>();
    Dictionary<string, string> columns = new Dictionary<string, string>();
    public Form1()
    {
        InitializeComponent();
        InitValues();
    }

    private void InitValues()
    {
        values.Add("Source1", "ABC");
        values.Add("Source2", "DEF");
        values.Add("Source3", "GHI");

        columns.Add("Column1", "Col1");
        columns.Add("Column2", "Col2");
        columns.Add("Column3", "Col3");
    }

    private void button1_Click(object sender, EventArgs e)
    {

        // Create the pattern: anytext[SourceN].[anytext:ColumnN:anytext]anytext
        string pattern = "[A-Za-z0-9]+\\[Source[0-9]+\\]\\.\\[[A-Za-z0-9]+:Column[0-9]+:[A-Za-z0-9]+\\][A-Za-z0-9]+";
        // Create a Regex
        Regex rg = new Regex(pattern);
        // Get all matches
        MatchCollection matchedValues = rg.Matches(label1.Text);

        StringBuilder sb = new StringBuilder();
        // Rebuild each match with its codes replaced
        for (int count = 0; count < matchedValues.Count; count++)
        {
            string match = matchedValues[count].Value;
            int open = match.IndexOf('[');
            int close = match.IndexOf(']');
            // copy the anytext part up to and including the opening bracket
            sb.Append(match.Substring(0, open + 1));
            // replace the Source part (the text strictly between '[' and ']')
            sb.Append(values[match.Substring(open + 1, close - open - 1)]);
            // now copy in the same way the anytext after source
            // split in the same way around the : and use the columns dictionary
        }
        // finally, replace the original string with the value from string builder
        label1.Text = sb.ToString();
    }
}
}

The other parts are done in a similar way (I only implemented the first part, the source; the column part works the same way). If you need further help, please ask and I'll answer as soon as possible. I also assumed that the anytext parts contain only alphanumeric text; if other characters can appear there, I'll edit the regex pattern.
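As a rough sketch of how the source and column parts could be handled together, the version below uses capture groups and Regex.Replace instead of manual Substring arithmetic. The class name CodeReplacer and the method ReplaceCodes are hypothetical names chosen for illustration, and the pattern keeps the same assumption that the anytext parts are alphanumeric.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class CodeReplacer
{
    // Group 1: source code, group 2: text before the column code,
    // group 3: column code, group 4: text after the column code.
    private static readonly Regex CodePattern = new Regex(
        @"\[(Source[0-9]+)\]\.\[([A-Za-z0-9]*):(Column[0-9]+):([A-Za-z0-9]*)\]");

    public static string ReplaceCodes(
        string text,
        IDictionary<string, string> values,
        IDictionary<string, string> columns)
    {
        return CodePattern.Replace(text, m =>
        {
            string source, column;
            // Leave the match untouched when either code has no registered value.
            if (!values.TryGetValue(m.Groups[1].Value, out source) ||
                !columns.TryGetValue(m.Groups[3].Value, out column))
            {
                return m.Value;
            }
            return "[" + source + "].[" + m.Groups[2].Value + ":" + column + ":" + m.Groups[4].Value + "]";
        });
    }
}
```

In the button handler this could be called as label1.Text = CodeReplacer.ReplaceCodes(label1.Text, values, columns); text outside the matches is preserved because Regex.Replace only rewrites the matched spans.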

I won't provide complete working code for you to copy and paste without learning. Instead, I'll explain what you need to do step by step so you'll be able to write the code yourself. Remember, Stack Overflow isn't a code-writing service.

The solution provided here is based on your comment:此处提供的解决方案基于您的评论:

the column code (e.g. Column1) can appear for more than one source code.

  1. Create a dictionary whose key is a tuple holding SourceCode and ColumnCode, and whose value is a tuple holding SourceValue and ColumnValue.

  2. Assuming each line of the file is always in the format SourceCode ColumnCode SourceValue ColumnValue, I'd read the file line by line, split each line into an array of four strings (let's call the array splitted), and add the tuple (splitted[0], splitted[1]) as the key and (splitted[2], splitted[3]) as the value to the dictionary.

  3. Now you have a dictionary representing the file content with O(1) lookup.

  4. Let's make a second assumption: your input string is in the format anytext[Source1].[anytext:Column1:anytext]anytext. I'd use Regex to get Source1 and Column1 from the string, then get the corresponding values from the dictionary, and finally do the replacement.
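The four steps above can be sketched roughly as follows. This is only one possible reading of them: PairReplacer, LoadMap, and Replace are hypothetical names, the mapping lines are assumed to be whitespace-separated data rows with no header, and value tuples (C# 7 or later) stand in for the tuples mentioned in step 1.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating steps 1-4; names are not from the answer.
public static class PairReplacer
{
    // Matches [Source].[before:Column:after]; groups 1 and 3 are the two codes.
    private static readonly Regex CodePattern =
        new Regex(@"\[([^\[\]]+)\]\.\[([^:\[\]]*):([^:\[\]]+):([^:\[\]]*)\]");

    // Steps 1-2: build the lookup from lines of the form
    // "SourceCode ColumnCode SourceValue ColumnValue".
    public static Dictionary<(string, string), (string, string)> LoadMap(IEnumerable<string> lines)
    {
        var map = new Dictionary<(string, string), (string, string)>();
        foreach (string line in lines)
        {
            string[] splitted = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            if (splitted.Length != 4) continue; // skip blank or malformed lines
            map[(splitted[0], splitted[1])] = (splitted[2], splitted[3]);
        }
        return map;
    }

    // Step 4: extract both codes with the regex and substitute the pair's values.
    public static string Replace(string text, Dictionary<(string, string), (string, string)> map)
    {
        return CodePattern.Replace(text, m =>
        {
            var key = (m.Groups[1].Value, m.Groups[3].Value);
            // Unknown (source, column) pairs are left exactly as they were.
            if (!map.TryGetValue(key, out var value)) return m.Value;
            return $"[{value.Item1}].[{m.Groups[2].Value}:{value.Item2}:{m.Groups[4].Value}]";
        });
    }
}
```

With a mapping file on disk, the lines could come from File.ReadLines(path), where path is whatever location the file actually has. Keying on the (source, column) pair is what allows the same column code to map to different values under different sources.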

Just posting the final code that I got working with the approach suggested by @Youssef13:

var sourcecolumncodeandvalue = new Dictionary<Tuple<string, string>, Tuple<string, string>>();
sourcecolumncodeandvalue.Add(Tuple.Create("Source1", "Column1"), Tuple.Create("ABC", "Col1"));
sourcecolumncodeandvalue.Add(Tuple.Create("Source2", "Column2"), Tuple.Create("DEF", "Col2"));

var codeandvaluereplacementlist = new Dictionary<string, string>();

var pattern = @"\[(.*?)\]\.\[(.*?)\]";
var filetext = "anytext[Source1].[anytext:Column1:anytext]anytext anytext[Source2].[anytext:Column2:anytext]anytext";
var matchesfound = System.Text.RegularExpressions.Regex.Matches(filetext, pattern); // find the pattern [].[]
foreach (System.Text.RegularExpressions.Match m in matchesfound)
{
    // Group 1 is the source code between the first pair of square brackets
    string datasource = m.Groups[1].Value;
    // The column code sits between ':' characters (ex: anytext:Column2:anytext), so split it further
    string[] columnsplit = m.Groups[2].Value.Split(':');
    if (columnsplit.Length != 3) continue; // not in the expected anytext:Column:anytext form
    string columnname = columnsplit[1];

    // We got the source and column codes, now get the corresponding values from the dictionary
    Tuple<string, string> sourceandcolumnvalues;
    if (!sourcecolumncodeandvalue.TryGetValue(Tuple.Create(datasource, columnname), out sourceandcolumnvalues))
        continue; // unknown code pair: leave the text unchanged

    // Construct the replacement string for each code string (the indexer tolerates repeated matches)
    codeandvaluereplacementlist[m.Value] = "[" + sourceandcolumnvalues.Item1 + "].[" + columnsplit[0] + ":" + sourceandcolumnvalues.Item2 + ":" + columnsplit[2] + "]";
}
// Finally loop through all code matches and replace them with values in the file text
foreach (var codeandvalue in codeandvaluereplacementlist)
{
    filetext = filetext.Replace(codeandvalue.Key, codeandvalue.Value);
}
var source = "anytext[Source1].[anytext:Column1:anytext]anytext";
var src1 = "Source1";
var dest1 = "ABC";
var src2 = "Column1";
var dest2 = "Col1";

var result = source
                .Replace("[" + src1 + "]", "[" + dest1 +"]")
                .Replace(":" + src2 + ":", ":" + dest2 +":");

https://dotnetfiddle.net/5cRnYD

Of course, you may use any list, dictionary, or file for the src and dest values.
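As an illustration of that last point, the same chained Replace calls could be driven from two collections of code/value pairs. This is a minimal sketch: SimpleReplacer and ReplaceAll are hypothetical names, and the approach assumes source and column codes are independent and that no replacement value is itself one of the codes (otherwise a later pass would replace it again).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; generalizes the two Replace calls to any number of pairs.
public static class SimpleReplacer
{
    public static string ReplaceAll(
        string source,
        IEnumerable<KeyValuePair<string, string>> sourceCodes,
        IEnumerable<KeyValuePair<string, string>> columnCodes)
    {
        var result = source;
        // Source codes are delimited by square brackets: [Source1] -> [ABC]
        foreach (var pair in sourceCodes)
            result = result.Replace("[" + pair.Key + "]", "[" + pair.Value + "]");
        // Column codes are delimited by colons: :Column1: -> :Col1:
        foreach (var pair in columnCodes)
            result = result.Replace(":" + pair.Key + ":", ":" + pair.Value + ":");
        return result;
    }
}
```

A Dictionary<string, string> can be passed directly for either argument, since it enumerates as key/value pairs. Including the delimiters in the search string keeps Source1 from matching inside a longer code such as Source10.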
