I am trying to write my own Format code for time. This is a class project, but the Format part is an extra I added for myself to get more practice with C# Regex. What I am trying to do is match certain characters:
w, W : weeks. W = weeks preceded by a leading zero if smaller than 10
d, D : days. D = days preceded by a leading zero if smaller than 10
g, G : military hours. G = hours preceded by a leading zero if smaller than 10
h, H : civilian hours. H = hours preceded by a leading zero...
m : minutes
s : seconds
The regex I have so far is this:
(w|W)(?=\b)|(d|D)(?=\b)|(h|H|g|G)(?=\b)|(m)(?=\b)|(s)(?=\b)
(w|W) //match upper- or lowercase w
(?=\b) //positive lookahead: only match if followed by a word boundary
With the s, it's matching every s in the string, so I'm led to believe my regex is wrong. My problem is that I'm not sure how to use lookaheads and lookbehinds correctly. I basically only want the cases of characters I've supplied and only if they are by themselves OR escaped; see the examples below.
Format("w Weeks, D days, h:m:s");
//returns 7 Weeks, 04 days, 10:01:05
Format("[w] weeks [d] days H:m:s");
//returns [7] weeks [4] days 10:01:05
Format("w \Weeks D \days, h:m:s");
//returns 7 07eeks 04 4ays, 10:01:05
As you can see, the last format, with escaped w's and d's, still replaces them, which is what I want. Again, I'm not sure how to write the lookaheads and lookbehinds correctly.
I am using regex101 (https://regex101.com/r/sL9cI2/1) to test this; you can see the regex there and what is going on. Any suggestions, please?
One thing about word boundaries is that they match an empty string. \b matches a position, not a character: a position that has a word character on one side and no word character on the other. E.g., in "This is an example", there are 8 positions matching \b:
|This| |is| |an| |example|
where | denotes a word boundary.
To match words, the regex should check that there is a word boundary on each side: \bword\b (notice there's no need for lookaheads here).
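As a rough illustration of the point above, here is a minimal C# sketch that counts boundary positions and compares the lookahead-only pattern with a pattern bounded on both sides. The class and method names (WordBoundaryDemo, CountBoundaries, CountMatches) are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Counts the zero-width positions at which \b matches.
    public static int CountBoundaries(string input)
    {
        return Regex.Matches(input, @"\b").Count;
    }

    // Counts the matches of an arbitrary pattern, to compare two approaches.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountBoundaries("This is an example"));   // 8

        const string format = "w Weeks, D days, h:m:s";

        // Lookahead only: also hits the trailing "s" of "Weeks" and "days".
        Console.WriteLine(CountMatches(format, @"s(?=\b)"));        // 3

        // Boundary on both sides: only the standalone "s" at the end.
        Console.WriteLine(CountMatches(format, @"\bs\b"));          // 1
    }
}
```

The lookahead version only checks what comes after the letter, so any word ending in s qualifies; \bs\b also checks what comes before it.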
"I basically only want the cases of characters I've supplied and only if they are by themselves OR escaped"
Then you have 2 options to match:
\bw\b : the letter "w" as a word.
\\w : a backslash (you need to escape backslashes in regex) followed by the letter w.
Regex:
(\bw\b|\\w)
Moreover, looking at your attempts, I think you can use a character class to simplify the pattern.
Regex:
(\b[WwDdGgHhms]\b|\\[WwDdGgHhms])
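One way this pattern could be wired into a Format method is with Regex.Replace and a match evaluator. This is only a sketch under assumptions: the Format signature taking the values as plain integers is hypothetical, "civilian hours" is taken to mean a 12-hour clock, and m and s are zero-padded because the expected outputs in the question show 01 and 05.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimeFormatDemo
{
    static readonly Regex Token = new Regex(@"(\b[WwDdGgHhms]\b|\\[WwDdGgHhms])");

    // Assumption: "civilian hours" means a 12-hour clock.
    static int To12Hour(int hours)
    {
        return hours % 12 == 0 ? 12 : hours % 12;
    }

    // Hypothetical signature: values are passed in directly for illustration.
    public static string Format(string format, int weeks, int days,
                                int hours, int minutes, int seconds)
    {
        return Token.Replace(format, m =>
        {
            // The code letter is the last character of the match,
            // whether or not a backslash precedes it.
            char code = m.Value[m.Value.Length - 1];
            switch (code)
            {
                case 'w': return weeks.ToString();
                case 'W': return weeks.ToString("00");
                case 'd': return days.ToString();
                case 'D': return days.ToString("00");
                case 'g': return hours.ToString();
                case 'G': return hours.ToString("00");
                case 'h': return To12Hour(hours).ToString();
                case 'H': return To12Hour(hours).ToString("00");
                case 'm': return minutes.ToString("00"); // padded, per the question's examples
                case 's': return seconds.ToString("00"); // padded, per the question's examples
                default:  return m.Value;
            }
        });
    }

    public static void Main()
    {
        // Sample values chosen to reproduce the outputs listed in the question.
        Console.WriteLine(Format("w Weeks, D days, h:m:s", 7, 4, 10, 1, 5));
        // 7 Weeks, 04 days, 10:01:05
        Console.WriteLine(Format("[w] weeks [d] days H:m:s", 7, 4, 10, 1, 5));
        // [7] weeks [4] days 10:01:05
        Console.WriteLine(Format(@"w \Weeks D \days, h:m:s", 7, 4, 10, 1, 5));
        // 7 07eeks 04 4ays, 10:01:05
    }
}
```

The verbatim string (@"...") in the last call keeps the backslashes intact; in a regular C# string literal, \W and \d would not compile.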
Note that this regex does not account for consecutive backslashes, which means we can't reliably put a literal backslash in front of a format code.
Using \\week as an example, it is interpreted as \ followed by the week format code and then the literal string eek, instead of a literal \ followed by the literal string week.
Use the following regex if you want to support that use case:
\G(?:[^\\]|\\.)*?(\b[WwDdGgHhms]\b|\\[WwDdGgHhms])
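To see the difference between the two patterns, the sketch below lists the format codes each one finds in the same input. The helper names (EscapeAwareDemo, FindSimple, FindEscapeAware) are invented for the example, and it only reports capture group 1; an actual replacement would also need to copy the text that the \G prefix skips over.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EscapeAwareDemo
{
    static readonly Regex Simple =
        new Regex(@"(\b[WwDdGgHhms]\b|\\[WwDdGgHhms])");

    // \G anchors each match where the previous one ended, so the prefix
    // always consumes backslashes in pairs (a backslash plus one character).
    static readonly Regex EscapeAware =
        new Regex(@"\G(?:[^\\]|\\.)*?(\b[WwDdGgHhms]\b|\\[WwDdGgHhms])");

    // Joins the text of capture group 1 from every match with commas.
    static string Codes(Regex regex, string format)
    {
        var codes = new List<string>();
        foreach (Match m in regex.Matches(format))
        {
            codes.Add(m.Groups[1].Value);
        }
        return string.Join(",", codes);
    }

    public static string FindSimple(string format)
    {
        return Codes(Simple, format);
    }

    public static string FindEscapeAware(string format)
    {
        return Codes(EscapeAware, format);
    }

    public static void Main()
    {
        // Input is: backslash, backslash, "week".
        Console.WriteLine(FindSimple(@"\\week"));        // \w
        Console.WriteLine(FindEscapeAware(@"\\week"));   // (empty line: no code found)

        // A single backslash still escapes the code letter.
        Console.WriteLine(FindEscapeAware(@"\week D"));  // \w,D
    }
}
```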