I am trying to replace a certain part of a match that a regex found. The relevant strings have the following format:
"<Random text>[Text1;Text2;....;TextN]<Random text>"
So basically there can be N texts separated by a ";" inside the brackets. My goal is to change each ";" into a "," (but only in the strings that have this format) so that I can keep the ";" as a separator for a CSV file. So the result should be:
"<Random text>[Text1,Text2,...,TextN]<Random text>"
I can match the relevant strings with something like
re.compile(r'\[".*?((;).*?){1,4}"\]')
but if I try to use the sub method it replaces the whole string.
I have searched Stack Overflow and I am pretty sure that "capture groups" might be the solution, but I am not really getting there. Can anyone help me?
I ONLY want to change the ";" in the ["Text1;...;TextN"]-parts of my text file.
Try this regex:
;(?=(?:(?!\[).)*])
Replace each match with a ,
Explanation:
; - matches a ;
(?=(?:(?!\[).)*]) - makes sure that the above ; is followed by a closing ] somewhere later in the string, but before any opening bracket [
(?=....) - positive lookahead
(?:(?!\[).)* - 0+ occurrences of any character that is not a [
] - matches a ]
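As a quick check of this lookahead with Python's built-in re module, here is a minimal sketch. The sample string and the variable names are made up for illustration, and the closing ] is escaped only for readability.

```python
import re

# A ";" that is followed by "]" with no "[" in between.
pattern = re.compile(r';(?=(?:(?!\[).)*\])')

# Hypothetical sample line, not taken from the asker's file.
sample = 'a;b["Text1;Text2;Text3"]c;d'

converted = pattern.sub(',', sample)
print(converted)  # a;b["Text1,Text2,Text3"]c;d
```

Only the ";" characters inside the brackets change; the ones in the surrounding text are left alone.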
If you want to match a ; before a closing ] without matching a [ in between, you could use:
;(?=[^[]*])
; - match literally
(?= - positive lookahead, assert that what is on the right is
[^[]* - negated character class, match 0+ times any char except [
] - match literally
) - close lookahead
Note that this will also match if there is no leading [
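The caveat about a missing leading [ can be seen in a small sketch with the built-in re module. The two input strings are invented for illustration. The [ inside the character class is escaped here, which, as far as I know, avoids the "possible nested set" FutureWarning that recent Python versions emit for [^[]; the meaning of the pattern is unchanged.

```python
import re

# Same idea as the pattern above, with "[" and "]" escaped.
pattern = re.compile(r';(?=[^\[]*\])')

# Well-formed input: only the ";" inside the brackets are replaced.
inside = pattern.sub(',', 'x;y["a;b;c"]z;w')

# No opening "[": the first ";" is still replaced because a "]" follows it.
no_open = pattern.sub(',', 'a;b]c;d')

print(inside)   # x;y["a,b,c"]z;w
print(no_open)  # a,b]c;d
```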
If you also want to make sure that there is a leading [ you could make use of the PyPI regex module and use \G and \K to match a single ;
(?:\[(?=[^[\]]*])|\G(?!^))[^;[\]]*\K;
import regex
pattern = r"(?:\[(?=[^[\]]*])|\G(?!^))[^;[\]]*\K;"
test_str = ("[\"Text1;Text2;....;TextN\"];asjkdjksd;ajksdjksad[\"Text1;Text2;....;TextN\"]\n\n"
".[\"Text1;Text2\"]...long text...[\"Text1;Text2;Text3\"]....long text...[\"Text1;...;TextN\"]...long text...\n\n"
"I ONLY want to change the \";\" in the [\"Text1;...;TextN\"]")
result = regex.sub(pattern, ",", test_str)
print(result)
Output
["Text1,Text2,....,TextN"];asjkdjksd;ajksdjksad["Text1,Text2,....,TextN"]
.["Text1,Text2"]...long text...["Text1,Text2,Text3"]....long text...["Text1,...,TextN"]...long text...
I ONLY want to change the ";" in the ["Text1,...,TextN"]
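If installing the third-party regex module is not an option, a similar result can probably be had with the standard re module by matching each bracketed section as a whole and passing a function as the replacement. This is a different technique from the \G and \K pattern above, not a translation of it. The helper name semicolons_to_commas and the sample line are hypothetical, and the sketch assumes the brackets are never nested.

```python
import re

# Matches one bracketed section with no nested brackets inside it.
BRACKETED = re.compile(r'\[[^\[\]]*\]')

def semicolons_to_commas(text):
    """Replace ';' with ',' only inside [...] sections (hypothetical helper)."""
    # The callback receives each match and returns its replacement text.
    return BRACKETED.sub(lambda m: m.group(0).replace(';', ','), text)

line = '["Text1;Text2;....;TextN"];asjkdjksd;ajksdjksad["Text1;Text2"]'
print(semicolons_to_commas(line))
# ["Text1,Text2,....,TextN"];asjkdjksd;ajksdjksad["Text1,Text2"]
```

Because the whole bracketed section has to match, a ";" is only touched when both the opening [ and the closing ] are present.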
You can try this code sample:
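x = 'anbhb["Text1;Text2;...;TextN"]nbgbyhuyg["Text1;Text2;...;TextN"][]nhj,kji,'
chars = list(x)  # strings are immutable, so edit a list of characters
i = 0
while i < len(chars) - 1:
    # A relevant section starts with [" and runs up to the next quote.
    if chars[i] == '[' and chars[i + 1] == '"':
        i += 2
        while i < len(chars) and chars[i] != '"':
            if chars[i] == ';':
                chars[i] = ','
            i += 1
    i += 1
x = ''.join(chars)
print(x)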