I have strings of the form domain\username in an array. I want to match them and replace them.
The strings have the following pattern:
[, DESKTOP-XXQYY56\Adminaccount, ] [, MB4345XX\adminaccount, ]
The code I am using is as follows:
from pyspark.sql.functions import regexp_replace

df2 = df1.withColumn(
    'str1',
    regexp_replace(
        'str',
        r'^([A-Za-z0-9]+(-[A-Za-z0-9]+)*)+(\\?([A-Za-z0-9])+)*',
        'AB22'
    )
)
I am not able to match the pattern correctly. I want to match the string and replace it. Please suggest.
If you want to match that format and replace the domain\user part with XXXX, you might use 2 capturing groups: one for the opening [, and one for the closing , ]
You could omit the anchor ^ and, in the part ([A-Za-z0-9])+, move the quantifier + inside the group so that it applies to the character class, as in [A-Za-z0-9]+, or else you would repeat a group that matches a single char.
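The two points above (the anchor and the position of the quantifier) can be checked with a small sketch. It uses Python's re module rather than Spark, on the assumption that both engines behave the same for these simple constructs; the variable names are only illustrative.

```python
import re

# One of the sample values from the question (raw string keeps the backslash).
sample = r"[, DESKTOP-XXQYY56\Adminaccount, ]"

# With ^ the match has to start at position 0, where the text has "[",
# so the anchored pattern finds nothing.
anchored = re.search(r"^[A-Za-z0-9]+", sample)
unanchored = re.search(r"[A-Za-z0-9]+", sample)

# Quantifier outside the group: the single-char group is repeated,
# so the group only keeps the last character it matched.
repeated_group = re.search(r"([A-Za-z0-9])+", "Adminaccount")

# Quantifier inside the group: the group captures the whole run.
whole_run = re.search(r"([A-Za-z0-9]+)", "Adminaccount")

print(anchored)                  # None
print(unanchored.group(0))       # DESKTOP
print(repeated_group.group(1))   # t
print(whole_run.group(1))        # Adminaccount
```

Both variants match the same overall text; the difference only shows up in what the group holds afterwards.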
If you are not using the capturing groups separately for further processing, you could turn them into non-capturing groups with (?:
The pattern might look like
(\[, )[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*(?:\\?[A-Za-z0-9]+)*(, \])
In parts:
(\[, )               Capture group 1, matches [, 
[A-Za-z0-9]+         Matches 1+ times any char listed in the character class
(?:                  Non-capturing group
  -[A-Za-z0-9]+      Matches - followed by 1+ times any of the listed chars
)*                   Closes the non-capturing group and repeats it 0+ times
(?:                  Non-capturing group
  \\?[A-Za-z0-9]+    Matches an optional \ followed by 1+ times any of the listed chars
)*                   Closes the non-capturing group and repeats it 0+ times
(, \])               Capture group 2, matches , ]
In the replacement use the 2 capturing groups
$1XXXX$2
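As a quick check of the pattern and the replacement, here is a minimal sketch using Python's re module instead of Spark. The helper name mask_accounts and the mask default are hypothetical. Python has no $1 syntax in replacement strings, so the sketch rebuilds the result from the two groups in a function, which plays the same role as $1XXXX$2.

```python
import re

# The pattern suggested above, compiled once.
PATTERN = re.compile(
    r"(\[, )[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*(?:\\?[A-Za-z0-9]+)*(, \])"
)


def mask_accounts(text: str, mask: str = "XXXX") -> str:
    """Replace every domain\\user value between '[, ' and ', ]' with mask."""
    # Keep group 1 ("[, ") and group 2 (", ]"), swap what is in between.
    return PATTERN.sub(lambda m: m.group(1) + mask + m.group(2), text)


# The sample line from the question.
sample = r"[, DESKTOP-XXQYY56\Adminaccount, ] [, MB4345XX\adminaccount, ]"
print(mask_accounts(sample))  # [, XXXX, ] [, XXXX, ]
```

In PySpark the same pattern would be passed as the second argument of regexp_replace, with '$1XXXX$2' as the third.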