
Regex - remove text while replacing text with c#

I am attempting to learn regex by using it to edit some scripts I have.

My scripts contain text like this:

<person name="John">Will be out of town</person><person name="Julie">Will be in town.</person>

I need to replace the name values in the script - the addition to the name is always the same, but I might have names that I don't want to update.

Quick example of what I have:

string[] names = new string[2];
names[0] = "John-Example";
names[1] = "Paul-Example";

string ToFix = "<person name=\"John\">Will be out of town</person><person name=\"Julie\">Will be in town.</person>";

for (int i=0; i<names.Length; i++)
{
    string Name = names[i];
    ToFix = Regex.Replace(ToFix, "(<.*name=\")(" + Name.Replace("-Example", "") + ".*)(\".*>)", "$1" + Name + "$3", RegexOptions.IgnoreCase);
}

This works for the most part, but I have two problems with it. Sometimes it removes too much: if I have multiple persons in the string, it removes everything between the first person and the last person, like so:

Hello <person name="John">This is John</person><person name="Paul">This is Paul</person>

becomes

Hello <person name="John-Example">This is Paul</person>

Also, I would like to remove any extra text after the name value and before the closing angle bracket, so that:

<person name="John" hello>

Should be corrected to:

<person name="John-Example">

I have read several articles on regex and feel that I am just missing something small here. How and why would I go about fixing this?

EDIT: I don't think these scripts that I am working with classify as XML - the entire script may or may not have <> tags. Back to my original goal with this question, can someone explain the behavior of the regex? And how would I remove extra text after the name value before the closing tag?

Your regex is too greedy. Try .*? rather than just .*
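
To make the behavior concrete: in the original pattern, the greedy .* after the name keeps consuming characters up to the last " in the whole string, so everything between the first matching name and that final quote is swallowed and replaced. The sketch below is one possible way to avoid that, not the only one. Instead of relying only on .*?, it uses negated character classes ([^<>] and [^>]) so that a match cannot run past the end of a tag, and it drops whatever sits between the name's closing quote and the closing >. The NameFixer class and Fix method are hypothetical names made up for this example, and the sketch assumes the name must match exactly (the original pattern also accepted names that merely start with the listed text).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original question.
public static class NameFixer
{
    // Appends suffix to every name="..." value that equals one of baseNames
    // and removes any extra text between the name value and the closing '>'.
    public static string Fix(string text, string[] baseNames, string suffix)
    {
        foreach (string baseName in baseNames)
        {
            // (<[^<>]*?\bname=")  group 1: tag start up to the opening quote;
            //                     [^<>] keeps the match inside a single tag
            // baseName + "        the exact name followed by its closing quote
            // [^>]*>              extra text up to the closing '>' (discarded)
            string pattern = "(<[^<>]*?\\bname=\")" + Regex.Escape(baseName) + "\"[^>]*>";

            // ${1} avoids ambiguity if the name starts with a digit;
            // "$$" keeps a literal '$' in a name from being read as a group reference.
            string replacement = "${1}" + (baseName + suffix).Replace("$", "$$") + "\">";

            text = Regex.Replace(text, pattern, replacement, RegexOptions.IgnoreCase);
        }
        return text;
    }

    public static void Main()
    {
        string[] names = { "John", "Paul" };

        Console.WriteLine(Fix(
            "Hello <person name=\"John\">This is John</person><person name=\"Paul\">This is Paul</person>",
            names, "-Example"));
        // Hello <person name="John-Example">This is John</person><person name="Paul-Example">This is Paul</person>

        Console.WriteLine(Fix("<person name=\"John\" hello>", names, "-Example"));
        // <person name="John-Example">
    }
}
```

Names that are not in the list, such as Julie, are left untouched, and running Fix a second time changes nothing because "John-Example" no longer matches the exact name "John".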

Also, please don't use regex to parse XML.


Here's an example of how to do what I think you want, using XDocument:

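// Requires: using System.Xml.Linq;
// XDocument.Parse needs well-formed XML with a single root element,
// so the fragment is wrapped in a temporary <root> element first.
var xdoc = XDocument.Parse("<root>" + ToFix + "</root>");
foreach (var person in xdoc.Root.Elements("person"))
{
    var name = person.Attribute("name");
    if (name != null)
    {
        // Drop every attribute, then restore name with the suffix appended.
        person.RemoveAttributes();
        person.SetAttributeValue(name.Name, name.Value + "-Example");
    }
}
// Note: the output still includes the temporary <root> wrapper.
var output = xdoc.ToString();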
