
Remove sub-string from beginning of string

I have a string representing the full path to a file:

full_path = '/home/user/fold1/fold2/sub-fold/'

and I need to remove from this string its root path stored in a different variable:

root = '/home/user/fold1/'

The resulting path should thus look like:

new_path = 'fold2/sub-fold/'

The full path (and obviously the root path) keeps changing as my code runs through many files stored in many different locations.

This is the (non-existent) operation I'm after:

new_path = full_path - root

How can I do this?

For path manipulations, prefer os.path:

import os
new_path = os.path.relpath(full_path, root)
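A quick sketch of what relpath returns for the values in the question. It uses posixpath (the POSIX flavour of os.path) only so the output is the same on every platform; on Linux or macOS, os.path.relpath should behave identically. Two things worth noticing: the trailing slash is dropped, and a path outside root is not an error but comes back with leading '..' components.

```python
import posixpath  # the POSIX implementation behind os.path on Linux/macOS

full_path = '/home/user/fold1/fold2/sub-fold/'
root = '/home/user/fold1/'

new_path = posixpath.relpath(full_path, root)
print(new_path)  # fold2/sub-fold  (no trailing slash)

# A path that is not under root does not raise; relpath climbs with '..'
outside = posixpath.relpath('/bar/zap/fong/tang', '/bla/foo')
print(outside)  # ../../bar/zap/fong/tang
```

If you need the trailing slash back, you have to add it yourself.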

FTR: the closest equivalent of a - operator for strings is str.replace(), but as other people pointed out, it replaces all occurrences of the substring, not just the one at the beginning:

 new_path = full_path.replace(root, '') 
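To make the caveat concrete, here is a small sketch with a made-up path (not from the question) in which the root happens to occur a second time further along:

```python
root = '/home/user/'
# Hypothetical path that contains the root text twice
full_path = '/home/user/backup/home/user/file.txt'

new_path = full_path.replace(root, '')
print(new_path)  # backupfile.txt  (the later occurrence was removed too)
```

The intended result would have been 'backup/home/user/file.txt'.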

If you trust that full_path does indeed begin with root, you can use a simple substring by index:

new_path = full_path[len(root):]

If you don't trust it, you can do an if-test first to check, and take appropriate action if it's not as expected.
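A minimal sketch of that check. The function name strip_root and the choice to raise ValueError are my own assumptions; returning the path unchanged or logging a warning may suit your program better.

```python
def strip_root(full_path, root):
    """Return full_path without its leading root; fail loudly on a mismatch."""
    if not full_path.startswith(root):
        # "Appropriate action" here is raising; pick whatever fits your case
        raise ValueError(f"{full_path!r} does not start with {root!r}")
    return full_path[len(root):]

print(strip_root('/home/user/fold1/fold2/sub-fold/', '/home/user/fold1/'))
# fold2/sub-fold/
```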

You can strip as many characters from the beginning as the root is long (bgoldst's answer):

 path[len(root):]

But then you would not notice if that beginning does not match the root you expect. If, for instance, you have /bla/foo as root and /bar/zap/fong/tang as file, you would get /fong/tang as a result, silently masking the underlying bug. I would not recommend doing that.

Blindly replacing the string root anywhere in the given path (Aprillion's answer) could replace later occurrences as well, returning nonsense, as the comments pointed out.

I would suggest replacing only the beginning of the string:

import re

result = re.sub(r'^' + re.escape(root), '', path)

This way you avoid both traps.
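A short sketch of how this behaves in both trap cases. The helper name remove_root_re and the sample paths are made up for illustration. It also shows why re.escape matters: without it, a '.' in the root would match any character.

```python
import re

def remove_root_re(path, root):
    # '^' anchors the match at the start; re.escape makes root a literal string
    return re.sub(r'^' + re.escape(root), '', path)

# Root occurs twice: only the leading occurrence is removed
twice = remove_root_re('/home/user/backup/home/user/file.txt', '/home/user/')
print(twice)  # backup/home/user/file.txt

# Root does not match: the path comes back unchanged
mismatch = remove_root_re('/bar/zap/fong/tang', '/bla/foo')
print(mismatch)  # /bar/zap/fong/tang

# Without re.escape, the '.' in 'a.b/' would match the 'x' in 'axb/'
unescaped = re.sub(r'^' + 'a.b/', '', 'axb/c')
escaped = remove_root_re('axb/c', 'a.b/')
print(unescaped, escaped)  # c axb/c
```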

You might also want to consider just using os.path.relpath(), which strips a given start from a path according to file system logic.

In any case, you should consider how your program should behave when the given root does not match the beginning of the path. The re solution I proposed will then just leave the given path unchanged. In most cases this will be useful behavior, but certainly not in all cases.
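For comparison, two standard-library alternatives that were not part of the answers above, sketched here because they differ exactly in how they treat a mismatch: str.removeprefix (Python 3.9 or later) leaves the string unchanged, while PurePosixPath.relative_to from pathlib raises ValueError. Which one is right depends on whether a mismatch is an expected case or a bug in your program.

```python
from pathlib import PurePosixPath

full_path = '/home/user/fold1/fold2/sub-fold/'
root = '/home/user/fold1/'

# str.removeprefix (Python 3.9+): strips root only at the start and
# returns the string unchanged when it does not start with root
stripped = full_path.removeprefix(root)
untouched = '/bar/zap'.removeprefix('/bla/foo')
print(stripped)   # fold2/sub-fold/
print(untouched)  # /bar/zap

# relative_to works on path components and raises on a mismatch
rel = str(PurePosixPath(full_path).relative_to(root))
print(rel)  # fold2/sub-fold

try:
    PurePosixPath('/bar/zap').relative_to('/bla/foo')
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```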

An addition to https://stackoverflow.com/a/27208635/6769234

The number of occurrences to replace can be controlled with the third argument:

"bbb_ccc_ddd_bbb_eee_bbb".replace("bbb", "", 1) # '_ccc_ddd_bbb_eee_bbb'
"bbb_ccc_ddd_bbb_eee_bbb".replace("bbb", "", 2) # '_ccc_ddd__eee_bbb'
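One caveat when applying this to the original question: a count of 1 limits how many occurrences are replaced, but it does not anchor the match at the beginning. A small sketch with made-up paths:

```python
root = '/home/user/'

# Root at the start: count=1 removes only the leading occurrence
at_start = '/home/user/backup/home/user/file.txt'.replace(root, '', 1)
print(at_start)  # backup/home/user/file.txt

# Root NOT at the start: the first occurrence is still removed, wherever it is
in_middle = 'data/home/user/x'.replace(root, '', 1)
print(in_middle)  # datax
```

So this is only safe if you already know the path starts with the root.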
