I am getting a filename from an API in a format containing a mix of / and \.
infilename = 'c:/mydir1/mydir2\\mydir3\\mydir4\\123xyz.csv'
When I try to parse the directory structure, a \ followed by a character is converted into a single character.
Is there a way to get each component correctly?
What I already tried:
os.path.normpath didn't help.
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
os.path.normpath(infilename)
out:
'c:\\mydir1\\mydir2\\mydir3\\mydir4Sxyz.csv'
That's not visible in your example, but writing this:
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
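isn't a good idea, because some lowercase (and a few uppercase) letters are interpreted as escape sequences when they follow a backslash. Notorious examples are \t and \b, and there are others; a backslash followed by octal digits, such as \123, is read as a character code, which is where the S in your output comes from. For instance: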
infilename = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'
fails twice over, because \t and \b are interpreted as "tab" and "backspace".
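A minimal sketch of the difference, comparing the same text written as a normal literal and as a raw literal (the variable names plain and raw are only illustrative):

```python
# Normal literal: \t, \b and \123 are processed as escape sequences.
plain = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'

# Raw literal: every backslash is kept exactly as typed.
raw = r'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'

print(repr(plain))  # shows a tab, a backspace and 'S' instead of backslashes
print(repr(raw))    # shows three (doubled-in-repr) backslashes

# \123 is an octal escape: 0o123 == 83, which is the code point of 'S'.
print(chr(0o123))
```

The normal literal contains no backslash at all once Python has parsed it, which is why no later call can "see" the original separators.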
When dealing with literal Windows-style paths (or regexes), you have to use the raw prefix and, better still, normalize the path to get rid of the forward slashes.
infilename = os.path.normpath(r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv')
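Note that os.path.normpath only converts / to \ when the code runs on Windows. As a hedged sketch for code that may run elsewhere, the standard-library ntpath module and pathlib.PureWindowsPath apply Windows rules on any platform (the names normalized and parts are only illustrative):

```python
import ntpath
from pathlib import PureWindowsPath

infilename = r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

# ntpath is the Windows flavour of os.path and is importable everywhere.
normalized = ntpath.normpath(infilename)
print(normalized)

# PureWindowsPath treats both / and \ as separators and exposes the components.
parts = PureWindowsPath(infilename).parts
print(parts)
```

Here parts starts with the drive and root ('c:\\') followed by each directory name and finally the file name.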
However, the raw prefix only applies to literals. If the returned string appears, when printing repr(string), as 'the\terrible\\dir', then tab characters are already in the string, and there's nothing you can do except some lousy post-processing.
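If the string really does arrive already damaged, one possible post-processing sketch is to map the common control characters back to a backslash plus letter. The helper name restore_backslashes is hypothetical, and the approach is lossy: an octal escape such as \123 has already become an ordinary 'S' and cannot be told apart from a real one.

```python
# Control characters produced by single-letter escapes, mapped back to
# the two-character text that was probably intended.
_CONTROL_TO_ESCAPE = {
    '\a': r'\a', '\b': r'\b', '\f': r'\f', '\n': r'\n',
    '\r': r'\r', '\t': r'\t', '\v': r'\v',
}

def restore_backslashes(damaged):
    """Best-effort repair (hypothetical helper): re-expand control characters."""
    return ''.join(_CONTROL_TO_ESCAPE.get(ch, ch) for ch in damaged)

damaged = 'c:/mydir1/mydir2\thedir3\bigdir4/file.csv'
repaired = restore_backslashes(damaged)
print(repaired)
```

This only makes sense when a path is known never to contain genuine control characters.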
Use r before the string to make it a raw string (i.e. backslashes are not processed as escape sequences).
e.g.
infilename = r'C:/blah/blah/blah.csv'
More details here: https://docs.python.org/3.6/reference/lexical_analysis.html#string-and-bytes-literals
Instead of parsing by \, try parsing by \\. In a normal string literal the backslash is the escape character, so a literal \ has to be written as \\.
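As a rough sketch of that idea, assuming the string holds real backslashes (as in the question's first example), both separators can be split in one pass with re.split; the name components is only illustrative:

```python
import re

# Each \\ in this normal literal is one real backslash in the string.
infilename = 'c:/mydir1/mydir2\\mydir3\\mydir4\\123xyz.csv'

# Character class matching either separator; the raw prefix keeps the
# regex readable, and \\ inside it means one literal backslash.
components = re.split(r'[\\/]+', infilename)
print(components)
# ['c:', 'mydir1', 'mydir2', 'mydir3', 'mydir4', '123xyz.csv']
```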