Pandas bad escape %s" % escape, len(escape)

Question

When executing the following line:

df = df[df['Directory'].str.contains("C:\Windows\System32\Tasks")]

I get the following error:

File "/Users/patrickmaynard/Desktop/CSVparser/parse.py", line 80, in parseFoundFiles
    df = df[df['Directory'].str.contains("C:\Windows\System32\Tasks")]
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/strings.py", line 1562, in contains
    regex=regex)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/strings.py", line 249, in str_contains
    regex = re.compile(pat, flags=flags)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/re.py", line 233, in compile
    return _compile(pattern, flags)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/re.py", line 301, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_compile.py", line 562, in compile
    p = sre_parse.parse(p, flags)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 856, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, False)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 415, in _parse_sub
    itemsappend(_parse(source, state, verbose))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 501, in _parse
    code = _escape(source, this, state)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 401, in _escape
    raise source.error("bad escape %s" % escape, len(escape))
sre_constants.error: bad escape \T at position 19

I have tried other file paths and they work just fine. I didn't include more code because I am certain it does not affect this line. Could this be a glitch in pandas or the regex module, or have I made a mistake?

Answer 1

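str.contains treats its pattern as a regular expression by default, so \T is parsed as a regex escape sequence, and it is not a valid one. (\W and \S happen to be valid regex escapes, which is why the error is only reported at \T.) You can tell it not to use regex and to search for the exact string by passing regex=False: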

df[df['Directory'].str.contains(r"C:\Windows\System32\Tasks", regex=False)]
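
To see where the error comes from without pandas involved, the same pattern can be compiled directly with the standard re module. This is only a minimal sketch; the variable names are made up for illustration. It also shows that a shorter path compiles without error but does not match the literal text, and that re.escape is an alternative when regex matching has to stay on.

```python
import re

# The path from the question, written as a raw string so Python itself
# leaves the backslashes alone.
pattern = r"C:\Windows\System32\Tasks"

# Compiling it as a regex fails: \T is not a valid regex escape.
try:
    re.compile(pattern)
    error_message = None
except re.error as exc:
    error_message = str(exc)

# \W (non-word character) and \S (non-whitespace character) ARE valid
# regex escapes, so a shorter path compiles without complaint...
shorter = re.compile(r"C:\Windows\System32")
# ...but it no longer means the literal path, so it does not match it.
matches_literal = shorter.search(r"C:\Windows\System32") is not None

# re.escape turns the path into a pattern that matches it literally.
escaped = re.escape(pattern)
matches_escaped = re.search(escaped, r"C:\Windows\System32\Tasks\123") is not None

print(error_message)
print(matches_literal, matches_escaped)
```

So paths that "work" as regex patterns may be compiling successfully while silently matching something other than the intended literal string.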

Example:

>>> df
                       Directory
0  C:\Windows\System32\Tasks\123
1  C:\Windows\System32\Tasks\456
2                    C:\Windows\
3                            xyz

>>> df[df['Directory'].str.contains(r"C:\Windows\System32\Tasks", regex=False)]
                       Directory
0  C:\Windows\System32\Tasks\123
1  C:\Windows\System32\Tasks\456
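
The example above can be reproduced as a self-contained script. This is a sketch that assumes a DataFrame with the same four rows; the names needle, literal and escaped are illustrative and not from the original post. It compares regex=False with the alternative of keeping regex matching on and escaping the pattern with re.escape.

```python
import re

import pandas as pd

# Same four rows as in the example; the third value ends in a backslash,
# which a raw string literal cannot do, so it is written with doubled backslashes.
df = pd.DataFrame({
    "Directory": [
        r"C:\Windows\System32\Tasks\123",
        r"C:\Windows\System32\Tasks\456",
        "C:\\Windows\\",
        "xyz",
    ]
})

needle = r"C:\Windows\System32\Tasks"

# Option 1: plain substring search, no regex involved.
literal = df[df["Directory"].str.contains(needle, regex=False)]

# Option 2: keep regex matching, but escape the pattern first.
escaped = df[df["Directory"].str.contains(re.escape(needle))]

print(literal)
print(literal.equals(escaped))
```

Both options select rows 0 and 1. regex=False is the simpler choice when the whole search term is meant literally; re.escape is useful when the literal path is only one part of a larger pattern.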

Answer 2

You can follow the solution posted here:

You can also try replacing import re with import regex as re.
