Regex - How to replace each digit with a single character without erasing the characters around it?

I need to scrub numeric data from some XML-like strings that are being logged to a minimally secure logging tool. The logged data may contain personal information such as a phone number or date of birth, or financial information such as income.

The data is in XML format, but is not guaranteed to be well-formed. The values I want to scrub are the text between the tags.

For instance:

<DOB>1/3/1960</DOB>
<DOB>12/04/1970</DOB>
<DOB>January 4 1988</DOB>

What we would like to do is scrub all the numeric values from the data, so our results would look like:

<DOB>N/N/NNNN</DOB>
<DOB>NN/NN/NNNN</DOB>
<DOB>January N NNNN</DOB>

This would help in identifying issues with the calls made with the data, so we could see (for instance) that a phone number contained 9 digits instead of 10, or that a DOB year had only 3 digits.

So far, I have this:

Regex.Replace(xmlIn, @"(?<=<DOB>)\d+(?=</DOB>)", "N");

But this only matches when the whole value is digits, like all 7 digits of a phone number crammed together, and even then it replaces the entire run with a single N instead of one N per digit.
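
To make the limitation concrete, here is a minimal sketch that wraps the attempt above in a helper and runs it on two inputs. The class and method names (`FirstAttemptDemo`, `Scrub`) are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the attempt from the question.
public static class FirstAttemptDemo
{
    public static string Scrub(string xmlIn)
    {
        // \d+ must span everything between <DOB> and </DOB>,
        // and the whole run is replaced by one "N".
        return Regex.Replace(xmlIn, @"(?<=<DOB>)\d+(?=</DOB>)", "N");
    }
}
```

With this helper, `FirstAttemptDemo.Scrub("<DOB>5551234</DOB>")` returns `<DOB>N</DOB>`, and `FirstAttemptDemo.Scrub("<DOB>1/3/1960</DOB>")` returns the input unchanged, because the `/` right after the first digit makes the lookahead fail.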

Try this regex:

(?<=<DOB>(?:\d*/?\d*/?\d*|(?:January|February|...)\s+\d+\s+\d*))\d(?=.*?</DOB>)

Demo

http://regexhero.net/tester/?id=989d3c5c-4bc2-4604-89ba-b5f89f7cd7a9

Sample code

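using System.Text.RegularExpressions;

// The "..." in the month alternation is a placeholder for the remaining month names.
Regex regex = new Regex(
  "(?<=<DOB>(?:\\d*/?\\d*/?\\d*|(?:January|February|...)\\s+\\d+\\s+\\d*))\\d(?=.*?</DOB>)",
  RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled
);

// Each matched digit is replaced with a single "N".
string regexReplace = "N";

// InputText holds the logged XML-like string.
string result = regex.Replace(InputText, regexReplace);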
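
The pattern above spells out the expected date shapes inside the lookbehind. Because .NET allows variable-length lookbehind, a looser variant is possible. The following is only a sketch under the assumption that the value between the tags never contains a `<`. It is not the pattern from this answer, and `DobMasker` and `Mask` are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: masks every digit that sits between <DOB> and </DOB>.
public static class DobMasker
{
    // Lookbehind: <DOB> followed by any tag-free text up to the digit.
    // Lookahead:  any tag-free text after the digit, then </DOB>.
    private static readonly Regex DigitInDob = new Regex(
        @"(?<=<DOB>[^<]*)\d(?=[^<]*</DOB>)",
        RegexOptions.CultureInvariant);

    public static string Mask(string xmlIn)
    {
        // One "N" per matched digit; all other characters are left alone.
        return DigitInDob.Replace(xmlIn, "N");
    }
}
```

Under that assumption, `DobMasker.Mask("<DOB>January 4 1988</DOB>")` returns `<DOB>January N NNNN</DOB>`, and a `<DOB>` whose closing tag is missing is left unchanged because the lookahead never finds `</DOB>`.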

Sample Input

Given Examples
===============
<DOB>1/3/1960</DOB>
<DOB>12/04/1970</DOB>
<DOB>January 4 1988</DOB>

Additional tests
=================
<!-- This is a valid number: 99 -->
<DOB>14523  <-- Closing DOB tag is missing here...

Sample output

Given Examples
===============
<DOB>N/N/NNNN</DOB>
<DOB>NN/NN/NNNN</DOB>
<DOB>January 4 NNNN</DOB>

Additional tests
=================
<!-- This is a valid number: 99 -->
<DOB>14523  <-- Closing DOB tag is missing here...
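
Note that the sample output leaves the day in `January 4 NNNN` unmasked: the month branch of the lookbehind requires `\s+\d+\s+` before the digit being replaced, so only the year qualifies. If the goal is to mask every digit inside any element, not only `<DOB>`, one alternative is a two-step replace with a `MatchEvaluator`. The sketch below assumes simple elements with no attributes and no nested tags. `DigitScrubber` and `Scrub` are hypothetical names, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: masks digits inside any <Tag>text</Tag> pair.
public static class DigitScrubber
{
    // Group 1: tag name. Group 2: tag-free text between the tags.
    // The backreference \1 requires a closing tag with the same name.
    private static readonly Regex Element = new Regex(
        @"<(\w+)>([^<]*)</\1>",
        RegexOptions.CultureInvariant);

    public static string Scrub(string xmlIn)
    {
        return Element.Replace(xmlIn, m =>
        {
            // Second pass: replace each digit in the element text with "N".
            string masked = Regex.Replace(m.Groups[2].Value, @"\d", "N");
            string tag = m.Groups[1].Value;
            return "<" + tag + ">" + masked + "</" + tag + ">";
        });
    }
}
```

Text outside a matched open/close pair, such as the comment line or the `<DOB>` with the missing closing tag in the additional tests, is not touched by this sketch.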
