
Parsing a mix of backward slashes and forward slashes in a filename

I am getting a filename from an API in this format, containing a mix of / and \ .

infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

When I try to parse the directory structure, a \ followed by a character is converted into a single character.

Is there a way to get each component correctly?

What I already tried:

os.path.normpath didn't help.

import os

infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
os.path.normpath(infilename)

out:
'c:\\mydir1\\mydir2\\mydir3\\mydir4Sxyz.csv'

It isn't visible in your example, but writing this:

infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

isn't a good idea, because some lowercase (and a few uppercase) letters are interpreted as escape sequences when they follow a backslash. Notorious examples are \t and \b ; there are others. For instance:

infilename = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'

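doubly fails, because two of the backslash-letter pairs are interpreted as "tab" and "backspace". The trailing \123 is also read as an octal escape, which is where the S in the question's output comes from.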
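To make the effect visible, here is a small sketch that compares the same text written as a normal literal and as a raw literal. The variable names `plain` and `raw` are only illustrative.

```python
# Normal literal: \t, \b and \123 are processed as escape sequences.
plain = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'

# Raw literal: every backslash is kept as an ordinary character.
raw = r'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'

# repr() shows what is really stored in each string.
print(repr(plain))
print(repr(raw))

# The normal literal is shorter because each escape collapsed to one character.
print(len(raw) - len(plain))
```

Printing with repr() rather than print() alone is the useful habit here, since a tab or backspace is easy to miss in plain output.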

When dealing with literal Windows-style paths (or regexes), you have to use the raw prefix and, better still, normalize your path to get rid of the forward slashes.

infilename = os.path.normpath(r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv')
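As a rough sketch of getting the individual components after normalizing: the snippet below uses `ntpath`, the Windows flavour of `os.path`, so that it behaves the same on any operating system (on Windows, `os.path` is `ntpath`). The names `normalized`, `drive`, `rest` and `components` are assumptions made for this example.

```python
import ntpath  # Windows implementation of os.path, importable on any OS

infilename = r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

# normpath converts forward slashes to backslashes under Windows rules.
normalized = ntpath.normpath(infilename)

# Separate the drive letter from the rest of the path.
drive, rest = ntpath.splitdrive(normalized)

# Split on the single Windows separator and drop the empty leading entry.
components = [part for part in rest.split(ntpath.sep) if part]

print(normalized)
print(drive, components)
```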

However, the raw prefix only applies to literals. If the returned string, when you print repr(string) , appears as 'the\terrible\\dir' , then tab chars have already been put in the string, and there's nothing you can do except some lousy post-processing.
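For what it's worth, one possible shape of that post-processing is sketched below. The helper `unescape_damage` and its lookup table are hypothetical, not part of any library. It only maps the common single-letter control characters back to backslash plus letter. Octal or hex escapes such as \123 (which became S) cannot be recovered this way, because the result is indistinguishable from a real S.

```python
# Hypothetical lookup: control character -> the two characters it came from.
_CONTROL_TO_ESCAPE = {
    '\a': r'\a', '\b': r'\b', '\f': r'\f', '\n': r'\n',
    '\r': r'\r', '\t': r'\t', '\v': r'\v',
}

def unescape_damage(path):
    """Best-effort repair of a path whose backslash-letter pairs were
    already turned into control characters. Lossy by nature."""
    return ''.join(_CONTROL_TO_ESCAPE.get(ch, ch) for ch in path)

# A string that already contains a real tab and a real backspace.
damaged = 'c:/mydir1/mydir2\thedir3\bigdir4\\file.csv'
repaired = unescape_damage(damaged)

print(repr(damaged))
print(repr(repaired))
```

This assumes a genuine path never contains control characters, which is usually but not always true, so fixing the source of the string is preferable when that is possible.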

Use r before the string to make it a raw string (i.e. backslash escape sequences are not processed).

e.g.

infilename = r'C:/blah/blah/blah.csv'

More details here: https://docs.python.org/3.6/reference/lexical_analysis.html#string-and-bytes-literals

Instead of parsing by \ try parsing by \\ . In a normal string literal you have to escape with \ , so a single \ character is actually written \\ .
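A minimal sketch of that idea, assuming the string itself is intact (written here as a raw literal): `'\\'` in a normal literal is one backslash character, and a regular-expression character class can split on either separator at once. The names `by_backslash` and `components` are illustrative.

```python
import re

infilename = r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

# '\\' is a single backslash character in a normal (non-raw) literal.
by_backslash = infilename.split('\\')

# [\\/]+ matches one or more backslashes or forward slashes.
components = re.split(r'[\\/]+', infilename)

print(by_backslash)
print(components)
```

Splitting only on the backslash leaves the forward-slash part in one piece, so the character class is the more useful of the two for mixed separators.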
