Regex removing empty spaces when using replace

My problem is not about removing spaces but about keeping them. I have strings of the form >[database values] that I would like to find in a document. I created a regex to find them and then remove the >, [ and ] characters. The code below takes a string that comes from a document. The first pattern looks for anything of the form >[some stuff]; the second pattern is then applied to each match to remove the >, [ and ].

  string decoded = "document in string format";
  string pattern = @">\[[A-z, /, \s]*\]";
  string pattern2 = @"[>, \[, \]]";
  Regex rgx = new Regex(pattern);
  Regex rgx2 = new Regex(pattern2);
  foreach (Match match in rgx.Matches(decoded))
  {
    string replacedValue = rgx2.Replace(match.Value, "");
    Console.WriteLine(match.Value);
    Console.WriteLine(replacedValue);
  }

The output of my first Console.WriteLine is correct: I get things like >[123 sesame St]. But the second output shows that the replace removes not just those three characters but the spaces as well, so I get something like 123sesameSt. I don't see any space being replaced in my regex. Am I forgetting something? Is it perhaps implicit in Replace?

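The character classes [A-z, /, \s] and [>, \[, \]] in your patterns also match commas and spaces, because every character between the brackets is a member of the class. Just list the characters without delimiting them, like this: [A-Za-z/\s]. Note also that the range A-z includes the ASCII characters that sit between Z and a ([, \, ], ^, _ and the backtick), so use A-Za-z for letters.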

string pattern = @">\[[A-Za-z/\s]*\]";
string pattern2 = @"[>\[\]]";

Edit to include Casimir's tip.
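
To see the difference, here is a minimal sketch that runs the question's second pattern and a class without the delimiters against the sample value from the question. The class name CharClassDemo and its two helper methods are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Class as written in the question: the commas and spaces
    // between the items are members of the class too.
    public static string StripOriginal(string s) =>
        Regex.Replace(s, @"[>, \[, \]]", "");

    // Class without delimiters: only '>', '[' and ']' are members.
    public static string StripFixed(string s) =>
        Regex.Replace(s, @"[>\[\]]", "");

    public static void Main()
    {
        string sample = ">[123 sesame St]"; // sample value from the question
        Console.WriteLine(StripOriginal(sample)); // 123sesameSt
        Console.WriteLine(StripFixed(sample));    // 123 sesame St
    }
}
```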

By writing [>, \[, \]] in pattern2 you define a character class that matches any single one of the characters listed between the square brackets: >, the comma, the space, [ and ]. I guess you don't want to match the space and the comma, so leave them out:

string pattern2 = @"[>\[\]]";

Alternatively, you could use

string pattern2 = @"(>\[|\])";

This matches either >[ or ], which expresses your intent more directly.
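
The two forms are not interchangeable on arbitrary input: the character class removes every >, while the alternation only removes a > that is immediately followed by [. A small sketch of that difference, using a made-up input string and a hypothetical class name AlternationDemo:

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Character class: removes any single '>', '[' or ']'.
    public static string WithCharClass(string s) =>
        Regex.Replace(s, @"[>\[\]]", "");

    // Alternation: removes the two-character sequence ">[" or a lone ']'.
    public static string WithAlternation(string s) =>
        Regex.Replace(s, @"(>\[|\])", "");

    public static void Main()
    {
        string sample = "a > b >[x y]"; // made-up input with a stray '>'
        Console.WriteLine(WithCharClass(sample));   // a  b x y
        Console.WriteLine(WithAlternation(sample)); // a > b x y
    }
}
```

When the pattern is only applied to text already matched by the first regex, as in the question, both give the same result.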

After rereading your question, I realize (if I understand it correctly) that the two-step approach is unnecessary. You only need one replacement that uses a capture group:

string pattern = @">\[([^]]*)]";
Regex rgx = new Regex(pattern);

string result = rgx.Replace(yourtext, "$1");

Pattern details:

>\[         # literals: >[
(           # open the capture group 1
    [^]]*   # all that is not a ]
)           # close the capture group 1
]           # literal ]

The replacement string refers to capture group 1 with $1, so each match is replaced by the text that was between the brackets.
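
As a rough end-to-end check, the single replacement can be tried on a short string. The input text, the class name CaptureGroupDemo and the Unwrap helper are assumptions made up for this sketch; only the pattern and the $1 replacement come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureGroupDemo
{
    // Replaces every ">[...]" with the text captured between the brackets.
    public static string Unwrap(string text) =>
        Regex.Replace(text, @">\[([^]]*)]", "$1");

    public static void Main()
    {
        // Made-up document text containing two bracketed values.
        string doc = "Address: >[123 sesame St] Name: >[Jane Doe]";
        Console.WriteLine(Unwrap(doc)); // Address: 123 sesame St Name: Jane Doe
    }
}
```

Because [^]]* accepts any character except ], this also keeps digits, punctuation and spaces inside the brackets without having to list them.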
