I'm working with text data and having trouble removing multiple backslashes. I found that re.sub works quite well, so I wrote the code below to remove a backslash followed by r, n, t, f or v:
temp_string = re.sub(r"[\t\n\r\f\v]"," ",string)
However, the code above can't deal with the string below.
string = '\\\\r \\\\nLove the filtered water and crushed ice in the door.'
So I wrote this instead:
temp_string = re.sub(r"[\\\\t\\\\n\\\\r\\\\f\\\\v]"," ",string)
temp_string
But it gives an unexpected result: it erases every v, f, n and so on in the text. I don't know why this happens.
I found that .replace("\\\\r"," ") works. However, this way I would have to chain calls like:
.replace("\\\\r"," ")
.replace("\\\r"," ")
.replace("\\r"," ")
.replace("\r"," ")
.replace("\\\\t"," ")
…
I'm pretty sure there is a better way.
You can't define a sequence of characters inside a character class. Character classes are meant to match a single character. So, [\\\\t\\\\n\\\\r\\\\f\\\\v] is equal to [\\tnrfv] and matches either a backslash or one of the letters t, n, r, f or v.
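As a rough illustration of the point above, the sketch below applies the pattern from the question to a short sample sentence (the sample text and variable names are chosen here for demonstration only). It shows that the class removes plain letters and behaves the same as the shorter equivalent class.

```python
import re

# Pattern from the question: inside [...] every item is a single character,
# so this is just the set {backslash, t, n, r, f, v}.
question_class = re.compile(r"[\\\\t\\\\n\\\\r\\\\f\\\\v]")
# The equivalent, shorter class named in the answer.
short_class = re.compile(r"[\\tnrfv]")

sample = "Love the filtered water"  # no backslashes or control characters at all

cleaned = question_class.sub(" ", sample)
cleaned_short = short_class.sub(" ", sample)

print(repr(cleaned))             # ordinary letters t, n, r, f, v are gone
print(cleaned == cleaned_short)  # both classes behave identically
```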
To match a sequence of chars, you need to list them one by one. To match the two-char string \n you need to use the pattern \\n (r'\\n'). If you need to match either the \n or the \v text, you would need to use \\n|\\v, (?:\\n|\\v) or, better, \\[nv].
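A minimal sketch of the three equivalent forms follows, assuming a made-up sample string that contains the literal two-character sequences (a backslash followed by a letter) rather than real control characters.

```python
import re

# Raw string: holds backslash + 'n' and backslash + 'v' as two-character texts.
text = r"first\nsecond\vthird"

by_alternation = re.findall(r"\\n|\\v", text)
by_group = re.findall(r"(?:\\n|\\v)", text)
by_class = re.findall(r"\\[nv]", text)

print(by_alternation)  # each match is a backslash followed by a letter
print(by_alternation == by_group == by_class)

# Replacing each two-character sequence with a space.
replaced = re.sub(r"\\[nv]", " ", text)
print(replaced)
```

The character-class form is usually the easiest to extend, since adding another letter only means adding one character inside the brackets.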
So, if you want to match a backslash followed by a letter from the rtnfv char set, or the "\t" (TAB), "\n" (line feed), "\r" (carriage return), "\f" (form feed) or "\v" (vertical tab) chars, you can use any of:
r'\\[rtnfv]|[\t\n\r\f\v]'
r'(?:\\[rtnfv]|[\t\n\r\f\v])'
r'(?:\\[rtnfv]|[\t\n\r\f\v])+'
The last one matches one or more consecutive occurrences of the patterns that may be mixed with each other.
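The last pattern could be wrapped in a small helper along these lines. This is only a sketch: the names ESCAPES and strip_escapes and the sample strings are hypothetical, and it assumes each letter is preceded by a single literal backslash.

```python
import re

# One or more consecutive escape-like items: either a literal backslash plus
# one of r, t, n, f, v, or a real TAB/LF/CR/FF/VT control character.
ESCAPES = re.compile(r"(?:\\[rtnfv]|[\t\n\r\f\v])+")

def strip_escapes(text):
    """Replace each run of escape-like sequences with a single space."""
    return ESCAPES.sub(" ", text)

# Mixed input: literal backslash sequences next to real control characters.
sample = "\\r\\n\tLove the filtered water\\t\nand crushed ice."
result = strip_escapes(sample)
print(repr(result))

# Items separated by a space form separate runs, so each gets its own space.
separated = strip_escapes("a\\r \\nb")
print(repr(separated))
```

If several literal backslashes can precede the letter, changing `\\` to `\\+` in the first alternative is one possible adjustment.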
Since escape characters are not the same as characters with a backslash before them, you will need to define a mapping for the escape characters you want to replace.
import re

string = '\\\\r \\\\\nLove the \nfiltered \\twater \\and crushed ice in the door.'
esc_map = {'\\n': '\n',
           '\\t': '\t',
           '\\r': '\r'}
# replace two-character sequences (backslash + letter) with the real escape characters
for key, value in esc_map.items():
    string = string.replace(key, value)
# group the escape characters, allowing any number of backslashes in front of them
re_str = r'\\*({})'.format(r'|'.join(esc_map.values()))
# remove extra backslashes
string = re.sub(re_str, r'\1', string)
# replace an escape character with a space
string = re.sub(re_str, r' ', string)