Is there a way to get VB.NET RegEx.Replace to use special characters in the REPLACEMENT argument?

I have a VB.NET program that applies user-supplied PATTERN and REPLACEMENT arguments to a collection of input strings using Regex.Replace, but special characters in the REPLACEMENT argument are not interpreted.

Is there a way to make Regex.Replace interpret special characters in the REPLACEMENT string like it does in the PATTERN string? For example, treat "\t" as a tab and "\xAE" or "\u00AE" as ®?

In Linux, I get the correct output from sed

echo Test XXX Replacement | sed 's/XXX/\xAE/'

gives "Test ® Replacement"

But in VB it just gives me the special character pattern as a literal

Regex.Replace("Test XXX Replacement", "XXX", "\t")
Regex.Replace("Test XXX Replacement", "XXX", "\u00AE")

gives "Test \t Replacement" and "Test \u00AE Replacement" respectively.

I've found two somewhat related but not applicable posts. My problem differs from "Escape Regex.replace() replacement string in VB.net" in that I actually want the special characters in my replacement strings to be interpreted.

It also differs from "Regex VB.Net Regex.Replace": that asker had control of the replacement string and sidestepped my issue by using a VB constant instead of a regex special character.

Are there any settings/options/utilities/methods that can make my (user supplied!) RegEx REPLACEMENT strings correctly handle special characters?

Is there a way to make Regex.Replace interpret special characters in the REPLACEMENT string like it does in the PATTERN string? For example, treat "\t" as a tab and "\xAE" or "\u00AE" as ®?

You mean like the Regex.Unescape(String) method?

If you can accept the limitations described in its Remarks section:

  • It reverses the transformation performed by the Escape method by removing the escape character ("\") from each character escaped by the method. These include the \, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space characters. In addition, the Unescape method unescapes the closing bracket (]) and closing brace (}) characters.
  • It replaces the hexadecimal values in verbatim string literals with the actual printable characters. For example, it replaces @"\x07" with "\a", or @"\x0A" with "\n". It converts to supported escape characters such as \a, \b, \e, \n, \r, \f, \t, \v, and alphanumeric characters.

Regex.Unescape("\xAE\t\u00AE") yields the string "®" & vbTab & "®".
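Applied to the original problem, one option is to pass the user-supplied replacement through Regex.Unescape before handing it to Regex.Replace. The following is a minimal sketch of that idea, not a complete solution; the helper name ReplaceWithEscapes and the module name are made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module UnescapeDemo
    ' Hypothetical helper (name is an assumption, not a framework API):
    ' expands backslash escapes in a user-supplied replacement string,
    ' then performs the regex replacement with the expanded text.
    Function ReplaceWithEscapes(input As String, pattern As String, replacement As String) As String
        ' Regex.Unescape turns sequences such as \t, \xAE and \u00AE into
        ' the characters they denote. It leaves "$" untouched, so
        ' substitutions like $1 still reach Regex.Replace intact.
        Dim expanded As String = Regex.Unescape(replacement)
        Return Regex.Replace(input, pattern, expanded)
    End Function

    Sub Main()
        ' The returned string is "Test ® Replacement".
        Console.WriteLine(ReplaceWithEscapes("Test XXX Replacement", "XXX", "\xAE"))
        ' The returned string contains a real tab between the brackets.
        Console.WriteLine(ReplaceWithEscapes("Test XXX Replacement", "XXX", "[\t]"))
    End Sub
End Module
```

According to the documentation, Regex.Unescape throws an ArgumentException when the string contains an escape sequence it does not recognize, so with user-supplied input it is probably worth wrapping the call in a Try/Catch and reporting the bad replacement string to the user.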

VB.NET string literals don't have backslash escape sequences, so "\t" in VB source is just a backslash followed by the letter t.

According to the docs for the Replace method:

Substitutions are the only regular expression language elements that are recognized in a replacement pattern. All other regular expression language elements, including character escapes, are allowed in regular expression patterns only and are not recognized in replacement patterns.

The equivalent to your two lines of code would be:

Regex.Replace("Test XXX Replacement", "XXX", vbTab)
Regex.Replace("Test XXX Replacement", "XXX", ChrW(&H00AE))

You could also use string interpolation with the replacement string if you needed to embed a hex string or character in a longer replacement string:

Regex.Replace("Test XXX Replacement", "XXX", $"{vbTab} yyy {ChrW(&H00AE)}")

Be sure to import the Microsoft.VisualBasic namespace, if not already imported.
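The documented behaviour quoted above can be checked with a short console program. This is only an illustrative sketch; the module name and the sample strings are assumptions chosen for the demonstration.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module SubstitutionDemo
    Sub Main()
        ' "$1" is a substitution, so it is expanded to the first capture group.
        Dim substituted As String = Regex.Replace("Test XXX Replacement", "(XXX)", "[$1]")
        Console.WriteLine(substituted) ' Test [XXX] Replacement

        ' "\t" is a character escape, which replacement patterns do not
        ' recognize, so it is copied through as a backslash followed by t.
        Dim literal As String = Regex.Replace("Test XXX Replacement", "XXX", "\t")
        Console.WriteLine(literal.Contains("\t"))  ' True
        Console.WriteLine(literal.Contains(vbTab)) ' False
    End Sub
End Module
```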
