
Regex.Replace is replacing more than the matched attribute

I'm trying to match a pattern in an XML string, where the pattern covers the various attributes that can appear in any given XML element.

So if the XML string was:

<TEST xlmns="https://www.test.com">
    <XXX>Foo</XXX>
    <YYY>Bar</YYY>
</TEST>

I want to remove the namespaces above using the pattern .*?(?:[a-z][a-z0-9_]*).*?((?:[a-z][a-z0-9_]*))(=)(\".*?\") in the code below:

using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var txt = "<TEST xlmns=\"https://www.test.com\"> <XXX>Foo</XXX> <YYY>Bar</YYY> </TEST>";

            const string pattern = ".*?(?:[a-z][a-z0-9_]*).*?((?:[a-z][a-z0-9_]*))(=)(\".*?\")";    

            var r = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
            var m = r.Match(txt);
            if (m.Success)
            {
                String var1 = m.Groups[1].ToString();
                String c1 = m.Groups[2].ToString();
                String string1 = m.Groups[3].ToString();
                Console.Write( var1.ToString() +  c1.ToString() + string1.ToString()  + "\n");
                Console.WriteLine(RegExReplace(txt,pattern,""));
            }
            Console.ReadLine();
        }

        static String RegExReplace(String input, String pattern, String replacement)
        {
            if (string.IsNullOrEmpty(input))
                return input;

            return Regex.Replace(input, pattern, replacement, RegexOptions.IgnoreCase);
        }
    }
}

But where it matches, the opening tag <TEST xlmns="https://www.test.com"> is turned into > when it should have become <TEST>.

What have I done wrong in the replace method?

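Regex.Replace replaces the entire match, not just the captured groups. Your pattern starts with .*?(?:[a-z][a-z0-9_]*).*?, which consumes <TEST and the space after it as part of the match, so that text is removed along with the attribute.

If you just want to remove the namespace, change your regex to: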

const string pattern = "\\s+xlmns=\"[^\"]*\"";
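
As a rough sketch of the namespace-only case, assuming attribute values are always double-quoted and never contain a double quote themselves: the value is written as [^"]* so it stops at the first closing quote, and a leading \s+ removes the space before the attribute so the tag becomes <TEST>. The class and method names (NamespaceStripper, RemoveNamespace) are hypothetical, not from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamespaceStripper
{
    // \s+      whitespace before the attribute, so "<TEST xlmns=...>" becomes "<TEST>"
    // xlmns    the attribute name exactly as spelled in the sample input
    // "[^"]*"  a double-quoted value that ends at the first closing quote
    private const string Pattern = "\\s+xlmns=\"[^\"]*\"";

    // Hypothetical helper: removes every attribute matched by Pattern.
    public static string RemoveNamespace(string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        return Regex.Replace(input, Pattern, "", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        var txt = "<TEST xlmns=\"https://www.test.com\"> <XXX>Foo</XXX> <YYY>Bar</YYY> </TEST>";
        Console.WriteLine(RemoveNamespace(txt));
        // prints: <TEST> <XXX>Foo</XXX> <YYY>Bar</YYY> </TEST>
    }
}
```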

If you want to remove all attributes, use this regex instead:

const string pattern = "\\s+\\w+=\"[^\"]*\"";
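
The way the quoted value is written matters once a line holds more than one double-quoted string. The sketch below compares a greedy .* value with a [^"]* value on an input that has two attributes and quoted text content. The input string and the names AttributeStripper, GreedyPattern, TightPattern and Strip are made up for illustration. The tight pattern also allows ':', '.' and '-' in attribute names, which is an assumption about the names you expect to see. A regex is not an XML parser, so text content that happens to look like name="value" would be stripped as well.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AttributeStripper
{
    // Greedy value: ".*" runs to the LAST double quote on the line.
    public const string GreedyPattern = "\\w+=\".*\"";

    // Tighter value: "[^\"]*" stops at the first closing quote, and the
    // leading \s+ also removes the whitespace in front of the attribute.
    public const string TightPattern = "\\s+[\\w:.-]+=\"[^\"]*\"";

    // Hypothetical helper: removes every match of the given pattern.
    public static string Strip(string input, string pattern)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        return Regex.Replace(input, pattern, "");
    }

    public static void Main()
    {
        var txt = "<TEST a=\"1\" b=\"2\"> <XXX>say \"hi\"</XXX> </TEST>";

        Console.WriteLine(Strip(txt, GreedyPattern));
        // prints: <TEST </XXX> </TEST>

        Console.WriteLine(Strip(txt, TightPattern));
        // prints: <TEST> <XXX>say "hi"</XXX> </TEST>
    }
}
```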

Full code:

using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var txt = "<TEST xlmns=\"https://www.test.com\"> <XXX>Foo</XXX> <YYY>Bar</YYY> </TEST>";

            // Backslashes must be doubled in a regular C# string literal; "\w" alone does not compile.
            const string pattern = "\\s+\\w+=\"[^\"]*\"";

            var r = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
            var m = r.Match(txt);
            if (m.Success)
            {
                // This pattern has no capture groups, so print the whole match instead of Groups[1..3].
                Console.WriteLine(m.Value);
                Console.WriteLine(RegExReplace(txt, pattern, ""));
            }
            }
            Console.ReadLine();
        }

        static String RegExReplace(String input, String pattern, String replacement)
        {
            if (string.IsNullOrEmpty(input))
                return input;

            return Regex.Replace(input, pattern, replacement, RegexOptions.IgnoreCase);
        }
    }
}
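
To see where the extra text goes with the pattern from the question, it can help to print Match.Value, which is the exact span that Regex.Replace substitutes. The sketch below does that, then shows one possible alternative: capture the tag prefix in group 1 and put it back with $1 in the replacement. The names MatchScopeDemo, KeepTagPattern, WholeMatch and RemoveFirstAttribute are hypothetical, and the alternative pattern only removes the first attribute after each tag name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchScopeDemo
{
    // The pattern from the question, unchanged.
    public const string OriginalPattern =
        ".*?(?:[a-z][a-z0-9_]*).*?((?:[a-z][a-z0-9_]*))(=)(\".*?\")";

    // Hypothetical alternative: group 1 holds "<TAGNAME" and is restored by "$1".
    public const string KeepTagPattern =
        "(<[a-z][a-z0-9_]*)\\s+[a-z][a-z0-9_]*=\"[^\"]*\"";

    // Returns the full text matched by the original pattern.
    public static string WholeMatch(string input)
    {
        var options = RegexOptions.IgnoreCase | RegexOptions.Singleline;
        return Regex.Match(input, OriginalPattern, options).Value;
    }

    // Removes the first attribute of each opening tag but keeps the tag name.
    public static string RemoveFirstAttribute(string input)
    {
        return Regex.Replace(input, KeepTagPattern, "$1", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        var txt = "<TEST xlmns=\"https://www.test.com\"> <XXX>Foo</XXX> <YYY>Bar</YYY> </TEST>";

        Console.WriteLine(WholeMatch(txt));
        // prints: <TEST xlmns="https://www.test.com"

        Console.WriteLine(RemoveFirstAttribute(txt));
        // prints: <TEST> <XXX>Foo</XXX> <YYY>Bar</YYY> </TEST>
    }
}
```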
