
C# Replace "Whole words only" using RegEx and dictionary

I would like to write code that replaces words in one file, using another text file as a dictionary (structure: key, tab separator, value).

Current code:

var fileDictionary = new Dictionary<string, string>(
   File.ReadLines(dictionaryPath, Encoding.Default)
  .Select(line => line.Split('\t')) // each line is: key, tab, value
  .ToDictionary(data => data[0], data => data[1]), StringComparer.InvariantCultureIgnoreCase); // create the dictionary from the text file

for (int i = 0; i < rowNumber; i++)
{
   var output = fileString[i].ToString(); // current row, taken from the other file
   var replaced = Regex.Replace(output, String.Join("|", fileDictionary.Keys.Select(Regex.Escape)), m => fileDictionary[m.Value], RegexOptions.IgnoreCase);
   var result = replaced.ToString();
   outputFile += result.ToString();
   outputFile += "\r\n";
}

So far everything works: I'm using RegEx to replace the words collected in the dictionary. My problem is with "whole words only" replacement.

I decided to use a pattern like @"\bsomeword\b", but when I implemented it as shown below:

 var replaced = Regex.Replace(output, String.Join("|", 
         String.Format(@"\b{0}\b", 
         fileDictionary.Keys.Select(Regex.Escape))), 
         m => fileDictionary[m.Value], RegexOptions.IgnoreCase);

The code doesn't return any results. The final text file looks like the original file; nothing is replaced. I suspect the problem is the dictionary key: when I use the pattern I actually change the key, and the new one does not exist in the current dictionary. So if the key does not exist, the value is not replaced.

Does anybody have any suggestions on how to fix that? Or does somebody know another way to replace whole words only, using RegEx and a dictionary?

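It looks like the pattern wasn't being built correctly from the dictionary: String.Format is applied once to the whole key sequence rather than to each key, so the keys never make it into the pattern. Wrap each key in \b individually instead: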

 var replaced = Regex.Replace(output, String.Join("|", fileDictionary.Select(m => @"\b" + Regex.Escape(m.Key) + @"\b")), m => fileDictionary[m.Value], RegexOptions.IgnoreCase);
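
As a self-contained illustration of the line above, here is a minimal sketch that builds the pattern once and reuses it. The class and method names (WholeWordReplacer, BuildPattern, ReplaceWholeWords) are hypothetical and not from the question. It assumes the dictionary is non-empty, uses a case-insensitive comparer, and that every key starts and ends with a word character, since \b only matches between a word and a non-word character.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class WholeWordReplacer
{
    // Escapes every key and wraps each one individually in \b word
    // boundaries, then joins them into a single alternation.
    public static Regex BuildPattern(IEnumerable<string> keys)
    {
        var alternatives = keys.Select(k => @"\b" + Regex.Escape(k) + @"\b");
        return new Regex(string.Join("|", alternatives), RegexOptions.IgnoreCase);
    }

    // Replaces each whole-word match with its dictionary value.
    // The map must use a case-insensitive comparer, because m.Value is the
    // text as it appears in the input (for example "Cat"), not the key.
    public static string ReplaceWholeWords(
        string input, Regex pattern, IDictionary<string, string> map)
    {
        return pattern.Replace(input, m => map[m.Value]);
    }
}
```

With a dictionary that maps "cat" to "dog", the input "Cat concatenate cat." should come back with both standalone words replaced while "concatenate" is left alone.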

Using a StringBuilder for your output would also be more efficient than concatenating strings in the loop.
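
A rough sketch of the StringBuilder suggestion follows. The names OutputBuilder and ReplaceAllRows are made up for this example, and it assumes the rows are available as a sequence of strings and that the pattern and case-insensitive dictionary have already been built as in the question.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class OutputBuilder
{
    // Runs the replacement on every row and collects the output in a
    // StringBuilder instead of growing a string with += on each iteration.
    public static string ReplaceAllRows(
        IEnumerable<string> rows, Regex pattern, IDictionary<string, string> map)
    {
        var sb = new StringBuilder();
        foreach (var row in rows)
        {
            sb.Append(pattern.Replace(row, m => map[m.Value]));
            sb.Append("\r\n"); // same line terminator as the original loop
        }
        return sb.ToString();
    }
}
```

Each += on a string copies everything accumulated so far, so the original loop does more work as the output grows; the StringBuilder appends into a buffer and produces the final string once at the end.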
