
Escape Windows's Path Delimiter

I need to change this string by escaping the Windows path delimiters. I don't define the original string myself, so I can't prepend the raw-string prefix r.

I need this:

s = 'C:\foo\bar'

to be this:

s = 'C:\\foo\\bar'

Everything I can find here and elsewhere says to do this:

s.replace( r'\\', r'\\\\' )

(Why I should have to escape the character inside a raw string I can't imagine)

But printing the string results in this. Obviously something has decided to re-interpret the escapes in the modified string:

C:♀oar

This would be so simple in Perl. How do I solve this in Python?

After a bunch of questions back and forth, the actual problem is this:

You have a file with contents like this:

C:\foo\bar
C:\spam\eggs

You want to read the contents of that file and use the lines as pathnames, and you want to know how to escape things.

The answer is that you don't have to do anything at all.

Backslash sequences are processed in string literals, not in string objects that you read from a file, or from input (in 3.x; in 2.x that's raw_input), etc. So, you don't need to escape those backslash sequences.

If you think about it, you don't need to add quotes around a string to turn it into a string. And this is exactly the same case. The quotes and the escaping backslashes are both part of the string's representation, not the string itself.
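A quick way to see the difference between a literal and the string object it produces is to compare lengths and repr output. This is only a small sketch; the variable names cooked and raw are made up for illustration.

```python
# A normal literal: the parser turns \f into a form feed and \b into a backspace.
cooked = 'C:\foo\bar'
# A raw literal: the backslashes are kept as ordinary characters.
raw = r'C:\foo\bar'

print(len(cooked))  # 8
print(len(raw))     # 10

# repr() shows the escaped representation; print() shows the string itself.
print(repr(raw))    # 'C:\\foo\\bar'
print(raw)          # C:\foo\bar
```

The doubled backslashes appear only in the repr output. The string object still holds one backslash per separator.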


In other words, if you save that example file as paths.txt, and you run the following code:

with open('paths.txt') as f:
    file_paths = [line.strip() for line in f]
literal_paths = [r'C:\foo\bar', r'C:\spam\eggs']
print(file_paths == literal_paths)

… it will print out True.
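To try this without creating paths.txt by hand, the following sketch first writes the two example lines to a file in a temporary directory. The temporary directory and the name same are assumptions made only so the example runs on its own.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'paths.txt')
    # Write the text exactly as it would appear in the file:
    # one real backslash per separator.
    with open(path, 'w') as f:
        f.write('C:\\foo\\bar\nC:\\spam\\eggs\n')
    # Read it back; no escape processing happens here.
    with open(path) as f:
        file_paths = [line.strip() for line in f]

literal_paths = [r'C:\foo\bar', r'C:\spam\eggs']
same = file_paths == literal_paths
print(same)  # True
```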


Of course if your file was generated incorrectly and is full of garbage like this:

C:♀oar

Then there is no way to "escape the backslashes", because they're not there to escape. You can try to write heuristic code to reconstruct the original data that was supposed to be there, but that's the best you can do.

For example, you could do something like this:

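# Map each control character back to the two-character escape that
# most likely produced it in the original source.
backslash_map = { '\a': r'\a', '\b': r'\b', '\f': r'\f',
                  '\n': r'\n', '\r': r'\r', '\t': r'\t', '\v': r'\v' }
def reconstruct_broken_string(s):
    # Heuristic: assumes every such control character was once a
    # backslash followed by a letter.
    for key, value in backslash_map.items():
        s = s.replace(key, value)
    return s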

But this won't help if there were any hex, octal, or Unicode escape sequences to undo. For example, 'C:\foo\x08' and 'C:\foo\b' both represent the exact same string, so if you get that string, there's no way to know which one you're supposed to convert to. That's why the best you can do is a heuristic.
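The ambiguity can be shown directly. The sketch below repeats the heuristic so that it runs on its own, then feeds it two literals that are spelled differently but produce the same string. The names broken and from_hex are illustrative only.

```python
backslash_map = {'\a': r'\a', '\b': r'\b', '\f': r'\f',
                 '\n': r'\n', '\r': r'\r', '\t': r'\t', '\v': r'\v'}

def reconstruct_broken_string(s):
    for key, value in backslash_map.items():
        s = s.replace(key, value)
    return s

# '\b' and '\x08' are two spellings of the same backspace character.
print('\b' == '\x08')  # True

# Deliberately NOT a raw string: \f and \b are interpreted.
broken = 'C:\foo\bar'
print(reconstruct_broken_string(broken) == r'C:\foo\bar')  # True

# The same string written with a hex escape is "repaired" to \b,
# because the information about the original spelling is gone.
from_hex = 'C:\foo\x08ar'
print(reconstruct_broken_string(from_hex))  # C:\foo\bar
```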

Don't do s.replace(anything). Just stick an r in front of the string literal, before the opening quote, so you have a raw string. Anything based on string replacement would be a horrible kludge, since s doesn't actually have backslashes in it; your code has backslashes in it, but those don't become backslashes in the actual string.

If the string actually has backslashes in it, and you want the string to have two backslashes wherever there once was one, you want this:

s = s.replace('\\', r'\\')

That'll replace any single backslash with two backslashes. If the string literally appears in the source code as s = 'C:\foo\bar', though, the only reasonable solution is to change that line. It's broken, and anything you do to the rest of the code won't make it not broken.
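As a rough illustration of why the pattern matters, the sketch below applies both replacements to a string that really contains backslashes. The names doubled and unchanged are made up for this example.

```python
s = r'C:\foo\bar'  # a string that really contains backslashes

# '\\' is one backslash; r'\\' is two backslashes.
doubled = s.replace('\\', r'\\')
print(doubled)     # C:\\foo\\bar

# The pattern r'\\' is TWO backslashes, so it never matches the
# single backslashes in s and the string comes back unchanged.
unchanged = s.replace(r'\\', r'\\\\')
print(unchanged == s)  # True
```

This is also why the raw-string version from the question appeared to do nothing: inside a raw string, each backslash already stands for itself, so doubling it searches for a doubled backslash.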
