
Recognize pattern to extract words from C# HTML Encoded String

I am looking for help recognizing a pattern in a string that is HTML-encoded.

If I have an HTML Encoded string like:

string strHTMLText=@"&lt;p&gt;Pellentesque habitant [[@Code1]] morbi tristique senectus [[@Code2]] et netus et malesuada fames ac [[@Code3]] turpis egestas.&lt;/p&gt;";

I need to extract the words [[@Code1]], [[@Code2]] and [[@Code3]]. They are dynamic and their count is unknown. These words are placeholders that will be substituted with other values in the provided HTML text.

I want to recognize the pattern [[@ something ]] and collect every occurrence in an array (or similar), so that I can later look up the relevant value for each one in the database.

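// Requires: using System.Linq; using System.Text.RegularExpressions; using System.Web;
string strHTMLText=@"&lt;p&gt;Pellentesque habitant [[@Code1]] morbi tristique senectus [[@Code2]] et netus et malesuada fames ac [[@Code3]] turpis egestas.&lt;/p&gt;";
// Decode the HTML entities first, then capture the name inside each [[@...]] placeholder.
var input = HttpUtility.HtmlDecode(strHTMLText);
var list = Regex.Matches(input, @"\[\[@(.+?)\]\]")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)   // "Code1", "Code2", "Code3"
    .ToList();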
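
Since the question mentions substituting values for the placeholders later, here is a rough sketch of how the same pattern could drive both steps. The names PlaceholderDemo, ExtractCodes and Fill are made up for illustration, and the dictionary is only a stand-in for whatever database lookup you actually use, so treat this as one possible shape rather than the definitive way to do it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class; none of these names come from the question.
public static class PlaceholderDemo
{
    // Matches [[@Name]] and captures Name (without the brackets and the '@').
    private static readonly Regex Placeholder = new Regex(@"\[\[@(.+?)\]\]");

    // Returns every placeholder name found, in the order the matches occur.
    public static List<string> ExtractCodes(string html)
    {
        return Placeholder.Matches(html)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    // Replaces each placeholder with its value from the lookup.
    // Assumption: names missing from the lookup are left in the text unchanged.
    public static string Fill(string html, IDictionary<string, string> values)
    {
        return Placeholder.Replace(html, m =>
            values.TryGetValue(m.Groups[1].Value, out var value) ? value : m.Value);
    }
}
```

You would call ExtractCodes first, fetch the values for those names, put them in a dictionary and pass it to Fill. Leaving unknown placeholders untouched is just one choice; throwing or substituting an empty string may suit you better.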

Until someone comes along with the regex solution, for fun I did this for you:

string strHTMLText=@"&lt;p&gt;Pellentesque habitant [[@Code1]] morbi tristique senectus [[@Code2]] et netus et malesuada fames ac [[@Code3]] turpis egestas.&lt;/p&gt;";

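// Split on '[' and drop the empty pieces produced by the doubled "[[".
IEnumerable<string> arr = strHTMLText.Split(new char[] {'['}, StringSplitOptions.RemoveEmptyEntries);
List<string> output = new List<string>();
foreach (var item in arr)
{
    int end = item.IndexOf(']');
    // Skip pieces with no closing bracket, such as the text before the first placeholder.
    if (end < 0) continue;
    string placeHolder = item.Substring(0, end); // e.g. "@Code1"
    output.Add(placeHolder);
}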

To get the output into an array:

string[] result = output.ToArray();

You can use regular expressions.

Try using this expression

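// Verbatim string (@"...") so the backslashes reach the regex engine unchanged.
Regex exp = new Regex(@"\[\[@.+?\]\]");
MatchCollection mc = exp.Matches(input); // input: your string here
foreach (Match m in mc)
{
    string code = m.Value; // the whole placeholder, e.g. [[@Code1]]
}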

I have not tested this code though and it is a quick and dirty pseudo code so please bear with me.
