Removing conditional duplicates with an ASP.NET regular expression

Question

I am looking for a way to remove duplicates from a document using a regular expression or something similar. Given the following:

First Line

<Important text /><Important text />Other random words

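I need to remove the duplicates of <some text/> and leave everything else as it is. The text may or may not span multiple lines.

It will need to work for several different words, but always inside <> tags.

Edit:

I don't know in advance what the words will be. Some will be nested inside <> tags and some won't. I need to remove all of the repeated ones, for example: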

<text/><text/><words/><words/><words/>

The output should be:

<text/><words/>

Answer 1

This regular expression will find the repeated tags: (<.+?\/>)(?=\1). Here is a Regex 101 to prove it.
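As a rough sketch of how that pattern might be applied in C#, each match can be replaced with an empty string so that only the last copy of every run is left. The class and method names below are made up for illustration and are not part of the answer.

```csharp
using System.Text.RegularExpressions;

public static class LookaheadDedup
{
    // Hypothetical helper; only the pattern itself comes from the answer above.
    public static string RemoveRepeatedTags(string input)
    {
        // (<.+?\/>) captures a self-closing tag, and (?=\1) requires the same
        // text to follow immediately without consuming it. Deleting every
        // match therefore leaves just the final copy of each run.
        return Regex.Replace(input, @"(<.+?\/>)(?=\1)", string.Empty);
    }
}
```

Calling LookaheadDedup.RemoveRepeatedTags on the second sample from the question should give <text/><words/>. Two things to be aware of: the lazy .+? can stretch over several tags, so a repeated pair such as <a/><b/><a/><b/> is collapsed to <a/><b/> as well, and by default . does not match a newline.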

Answer 2

You can use this:

Regex.Replace(input, "(<Important text />)+", "<Important text />");

This replaces any run of one or more consecutive <Important text /> instances with a single <Important text />.

Or, more simply:

Regex.Replace(input, "(<Important text />)+", "$1");

For example:

var input = "<Important text /><Important text />Other random words";
var output = Regex.Replace(input, "(<Important text />)+", "$1");

Console.WriteLine(output); // <Important text />Other random words

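If you want to handle several such patterns at once, use an alternation ( | ) that lists each word you want to handle, together with a backreference ( \1 ) to find the repeats: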

Regex.Replace(input, @"(<(?:Important text|Other text) />)\1+", "$1");

For example:

var input = "<text/><text/><words/><words/><words/>";
var output = Regex.Replace(input, @"(<(?:text|words)\s*/>)\1+", "$1");

Console.WriteLine(output); // <text/><words/>
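Since the edit to the question says the words are not known in advance and the text may span several lines, here is a possible variation on the same backreference idea. It is only a sketch under two assumptions of mine: tag names never contain < or >, and any whitespace sitting between two repeated tags may be dropped along with the repeat. The class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class BackreferenceDedup
{
    // Hypothetical helper: collapses runs of identical self-closing tags
    // without listing the tag names up front.
    public static string CollapseRuns(string input)
    {
        // (<[^<>]+/>)  captures any single self-closing tag.
        // (?:\s*\1)+   matches one or more repeats of that exact tag,
        //              optionally separated by whitespace or line breaks.
        return Regex.Replace(input, @"(<[^<>]+/>)(?:\s*\1)+", "$1");
    }
}
```

With this, an input such as "<text/>\n<text/><words/>\r\n<words/><words/>" should come out as <text/><words/>. Only adjacent repeats are touched; a tag that shows up again later in the document is kept.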

Answer 3

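You should build a dictionary of all the tags, i.e. all the text between < and /> (including the angle brackets), along with their counts (this can be done with a regular expression). Then go through the text again, either removing the duplicates or simply not writing them to a new string/data structure.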
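One way this idea might look in C# is sketched below. It folds the counting and the removal into a single pass through a match evaluator rather than two separate passes, which is my simplification and not what the answer literally describes. The names and the tag pattern are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DictionaryDedup
{
    // Hypothetical helper: keeps the first occurrence of every tag and
    // drops all later ones, whether or not they are adjacent.
    public static string RemoveDuplicateTags(string input)
    {
        // Maps each tag (angle brackets included) to how often it was seen.
        var counts = new Dictionary<string, int>();

        return Regex.Replace(input, @"<[^<>]+?/>", match =>
        {
            counts.TryGetValue(match.Value, out int seen); // seen is 0 if absent
            counts[match.Value] = seen + 1;
            // Emit the tag only the first time it appears.
            return seen == 0 ? match.Value : string.Empty;
        });
    }
}
```

Unlike the backreference patterns, this removes a repeated tag anywhere in the document, so it only fits if that is the behaviour you want. Text outside the tags is left untouched.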

Answer 4

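Personally, I don't like using regular expressions on tags.

Split the text on each tag, use Distinct to remove the duplicates, then join the results back together.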

string input1 = "<Important text /><Important text />Other random words";
string input2 = "<text/><text/><words/><words/><words/>";

string result1 = RemoveDuplicateTags(input1); // "<Important text />Other random words"
string result2 = RemoveDuplicateTags(input2); // "<text/><words/>"

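// Requires: using System.Collections.Generic; using System.Linq;
private string RemoveDuplicateTags(string input)
{
    // Splitting on '>' leaves each tag without its closing bracket.
    IEnumerable<string> tagsOrRandomWords = input.Split('>');
    // Distinct drops every repeated piece, adjacent or not.
    tagsOrRandomWords = tagsOrRandomWords.Distinct();

    // Re-joining with '>' restores the closing brackets.
    return string.Join(">", tagsOrRandomWords);
}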

Or, if you prefer a less readable one-liner:

private string RemoveDuplicateTags(string input)
{
    return string.Join(">", input.Split('>').Distinct());
}

4 solutions

Solution 1
1 accepted 2013-08-19 17:32:51

Solution 2
0 2013-08-19 17:19:51

Solution 3
0 2013-08-19 17:22:45

Solution 4
0 2013-08-19 17:35:49