
Pandas bad escape %s" % escape, len(escape)

When executing the following line:

df = df[df['Directory'].str.contains("C:\Windows\System32\Tasks")]

I get the following error:

File "/Users/patrickmaynard/Desktop/CSVparser/parse.py", line 80, in parseFoundFiles
    df = df[df['Directory'].str.contains("C:\Windows\System32\Tasks")]
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/strings.py", line 1562, in contains
    regex=regex)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/strings.py", line 249, in str_contains
    regex = re.compile(pat, flags=flags)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/re.py", line 233, in compile
    return _compile(pattern, flags)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/re.py", line 301, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_compile.py", line 562, in compile
    p = sre_parse.parse(p, flags)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 856, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, False)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 415, in _parse_sub
    itemsappend(_parse(source, state, verbose))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 501, in _parse
    code = _escape(source, this, state)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 401, in _escape
    raise source.error("bad escape %s" % escape, len(escape))
sre_constants.error: bad escape \T at position 19

I have tried other file paths and they work just fine. I didn't include more code because I am certain it has not affected this line. Is this some weird glitch in pandas or regex, or have I made a mistake?

str.contains treats its pattern as a regular expression by default, so the \T in your path is parsed as a regex escape sequence, and \T is not a valid one. You can tell it not to use regex and to search for the exact string by passing regex=False:

df[df['Directory'].str.contains(r"C:\Windows\System32\Tasks", regex=False)]
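
To see why the original call fails, the sketch below reproduces the error with the re module alone, without pandas. It is only an illustration: the helper name compiles is made up for this example. It also shows re.escape, which is an alternative to regex=False when you still want regex matching for the rest of the pattern.

```python
import re

# Raw string so Python itself leaves the backslashes untouched.
path = r"C:\Windows\System32\Tasks"


def compiles(pattern):
    """Return True if `pattern` is a valid regular expression (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# \W and \S happen to be valid regex classes, but \T is not a known escape,
# so compiling the bare path fails with "bad escape \T".
path_is_valid_regex = compiles(path)

# re.escape escapes the backslashes so they are matched literally.
escaped = re.escape(path)
escaped_is_valid_regex = compiles(escaped)
matches_literal = re.search(escaped, r"C:\Windows\System32\Tasks\123") is not None
```

Other paths may appear to "work" simply because their backslash sequences happen to be valid regex escapes, in which case they can silently match something other than the literal path.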

Example:

>>> df
                       Directory
0  C:\Windows\System32\Tasks\123
1  C:\Windows\System32\Tasks\456
2                    C:\Windows\
3                            xyz

>>> df[df['Directory'].str.contains(r"C:\Windows\System32\Tasks", regex=False)]
                       Directory
0  C:\Windows\System32\Tasks\123
1  C:\Windows\System32\Tasks\456
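
For a version of this example you can run as a script, here is a minimal sketch that builds the same four-row frame and filters it two ways: with regex=False, and with the default regex matching on an escaped pattern. The variable names needle, literal and escaped are just for illustration.

```python
import re

import pandas as pd

df = pd.DataFrame({"Directory": [
    r"C:\Windows\System32\Tasks\123",
    r"C:\Windows\System32\Tasks\456",
    "C:\\Windows\\",  # a raw string cannot end in a backslash
    "xyz",
]})

needle = r"C:\Windows\System32\Tasks"

# Plain substring search: no regex parsing at all.
literal = df[df["Directory"].str.contains(needle, regex=False)]

# Regex search on an escaped pattern: backslashes are matched literally.
escaped = df[df["Directory"].str.contains(re.escape(needle))]

print(literal)
```

Both filters should select the same two rows; regex=False is the simpler choice when the whole search term is a literal string.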

You can follow the solution posted here :

You can also try replacing import re with import regex as re.
