
Removing a string in Python without removing repeating characters

I am printing a folder name to a text file containing data, and I want to remove the parent folders from the string. For example, it prints C:\\A3200\\201808101040, but I want 201808101040. When I use str(os.getcwd().strip('C:\\\\A3200\\\\')) to remove the parent folders, the program returns 180810104, which is odd: it removed the leading 20 and the trailing 0, but left the other zeros alone.

I know that this could be done by getting the folder name some other way than os.getcwd(), but I am interested in this kind of string manipulation for future use.

How do I remove a certain string of characters within a full string without affecting the characters that are repeated later in the full string?

strip() treats its argument as a set of characters, not as a substring, and removes characters from both ends until it meets one that is not in the set. That is why it eats your 2 and 0 but stops at the 1. You will probably have better luck with os.getcwd().split(os.sep)[-1].
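To make the behaviour concrete, here is a minimal sketch that reproduces the result from the question. It assumes the working directory is the literal string shown in the question and hard-codes the backslash separator (instead of os.sep) so that it runs the same way on any platform.

```python
path = r"C:\A3200\201808101040"  # stand-in for os.getcwd() on the asker's machine

# strip() uses its argument as a set of characters: {'C', ':', '\\', 'A', '3', '2', '0'}
chars = "C:\\A3200\\"
stripped = path.strip(chars)
print(stripped)  # 180810104

# Splitting on the separator leaves the last component untouched.
last = path.split("\\")[-1]
print(last)  # 201808101040
```

On the left, every character up to the first 1 is in the set, so the leading 20 disappears; on the right, only the final 0 goes, because the 4 before it is not in the set.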

That may work, but for future reference I would like to know how to do it purely with string operations, in case I need to remove something else, like "pear" from "pear tree", where the "e" appears in both words.

You could do 'pear tree'.replace('pear', '', 1).strip(). The third argument limits the replacement to the first occurrence.
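A short sketch of that idea, plus a prefix-only variant. The helper name remove_prefix is made up for illustration; on Python 3.9 and later the built-in str.removeprefix() does the same job.

```python
text = "pear tree"

# replace() works on whole substrings; count=1 touches only the first match.
once = text.replace("pear", "", 1).strip()
print(once)  # tree

# A later "pear" survives because of the count limit.
limited = "pear tree pear".replace("pear", "", 1).strip()
print(limited)  # tree pear


def remove_prefix(s, prefix):
    """Hypothetical helper: drop prefix only if s really starts with it."""
    return s[len(prefix):] if s.startswith(prefix) else s


print(remove_prefix(text, "pear "))  # tree
```

Unlike strip(), neither approach ever looks at individual characters, so the "e" in "tree" is safe.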

a = r"C:\A3200\201808101040"  # raw string, so the backslashes are taken literally
a[a.rindex("\\") + 1:]  # '201808101040'

OR

If you need 'C:\\A3200' and '201808101040' separated:

a = r"C:\A3200\201808101040"
a.rsplit("\\", 1)[1]  # '201808101040'
a.rsplit("\\", 1)[0]  # 'C:\\A3200'

For this specific question, the answer is to use os.path.basename().
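As a rough illustration: on Windows this would simply be os.path.basename(os.getcwd()). The sketch below uses ntpath (the Windows flavour of os.path) and pathlib.PureWindowsPath instead, which is an assumption made only so that the example parses the Windows-style path from the question on any operating system.

```python
import ntpath
from pathlib import PureWindowsPath

path = r"C:\A3200\201808101040"  # stand-in for os.getcwd()

# ntpath.basename applies Windows path rules regardless of the host OS.
folder = ntpath.basename(path)
print(folder)  # 201808101040

# pathlib offers the same information through the .name attribute.
same_folder = PureWindowsPath(path).name
print(same_folder)  # 201808101040
```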

Regarding your broader question: "How do I remove a certain string of characters within a full string without affecting the characters that are repeated later in the full string?"

I would consider using a regular expression (regex). Regexes let you specify positive and negative lookaheads and lookbehinds, among many other useful tricks. In your case, I would search the string rather than replace any characters in it. Here is a regex example for your question:

import re

s = r'C:\A3200\201808101040'

matches = re.findall(r'[0-9]+', s)

print(matches)

Yields:

['3200', '201808101040']

In this case you are interested in the final match in matches, which you can access via matches[-1]; it gives 201808101040.
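If only the last path component matters, one possible variation is to anchor the pattern to the end of the string, so there is no list to index. This is a sketch, not part of the answer above; the pattern assumes backslash is the only separator.

```python
import re

s = r'C:\A3200\201808101040'

# [^\\]+$ matches the run of non-backslash characters at the end of the string.
match = re.search(r'[^\\]+$', s)
last_component = match.group(0) if match else ''

print(last_component)  # 201808101040
```

The guard on match covers strings that end in a backslash, where the pattern finds nothing.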


 