
Excluding a specific string of characters in a str()-function

A small issue I've encountered during coding.

I'm looking to print out the name of a .txt file. For example, the file is named verdata_florida.txt or verdata_newyork.txt. How can I exclude .txt and verdata_ but keep the string between them? It must work for a middle part of any length, but .txt and verdata_ must always be excluded.

This is where I am so far; I've already defined filename to be input():

print("Average TAM at", str(filename[8:????]), "is higher than ")

3 ways of doing it:

using str.split twice:

>>> "verdata_florida.txt".split("_")[1].split(".")[0]
'florida'

using str.partition twice (you won't get an exception if the format doesn't match, and probably faster too):

>>> "verdata_florida.txt".partition("_")[2].partition(".")[0]
'florida'

using re, keeping only the center part (note the raw string for the pattern, so that \. is not treated as a string escape):

>>> import re
>>> re.sub(r".*_(.*)\..*", r"\1", "verdata_florida.txt")
'florida'

All of the above must be tuned if _ or . appears more than once, depending on whether you want to keep the longest or the shortest string between them.
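To make that caveat concrete, here is a small sketch comparing the three approaches on names that contain an extra underscore or dot. The helper names (via_split, via_partition, via_regex, first_underscore_last_dot) and the sample file names are made up for illustration; the last helper is one possible tuning that cuts at the first _ and the last ., assuming that is the behavior you want.

```python
import re

def via_split(name):
    # Keeps only the piece between the first and second "_", up to the first "."
    return name.split("_")[1].split(".")[0]

def via_partition(name):
    # Cuts after the first "_" and before the first "." that follows it
    return name.partition("_")[2].partition(".")[0]

def via_regex(name):
    # Greedy ".*_" runs to the LAST "_"; greedy "(.*)" runs to the LAST "."
    return re.sub(r".*_(.*)\..*", r"\1", name)

def first_underscore_last_dot(name):
    # One possible tuning: first "_" on the left, last "." on the right
    return name.partition("_")[2].rpartition(".")[0]

for name in ("verdata_new_hampshire.txt", "verdata_st.louis.txt"):
    print(name, "->", via_split(name), via_partition(name),
          via_regex(name), first_underscore_last_dot(name))
# verdata_new_hampshire.txt -> new new_hampshire hampshire new_hampshire
# verdata_st.louis.txt -> st st st.louis st.louis
```

Which result is "right" depends on the naming scheme, so it is worth checking the chosen approach against the awkward names you actually expect.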

EDIT: In your case, though, the prefix and the suffix seem to be fixed. If so, just use str.replace twice:

>>> "verdata_florida.txt".replace("verdata_","").replace(".txt","")
'florida'
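One thing to keep in mind is that str.replace removes every occurrence of the substring, not only the one at the start or end. If that matters, a sketch like the following strips the prefix and suffix only where they are anchored. The function name strip_affixes and the odd sample file name are hypothetical; on Python 3.9 and later, str.removeprefix and str.removesuffix do the same job.

```python
def strip_affixes(filename, prefix="verdata_", suffix=".txt"):
    """Remove prefix only at the start and suffix only at the end."""
    if filename.startswith(prefix):
        filename = filename[len(prefix):]
    # Guard against an empty suffix, since filename[:-0] would be empty
    if suffix and filename.endswith(suffix):
        filename = filename[:-len(suffix)]
    return filename

print(strip_affixes("verdata_florida.txt"))  # florida

# A name where the difference shows up (hypothetical example):
odd = "verdata_my.txt_notes.txt"
print(odd.replace("verdata_", "").replace(".txt", ""))  # my_notes
print(strip_affixes(odd))                               # my.txt_notes
```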

Assuming you want to split on the first _ and the last ., you can use slicing together with the index and rindex methods. index searches for the first occurrence of the substring given in the parentheses and rindex for the last occurrence; each returns the position where the match starts. If the substring is not found, they raise a ValueError. If you want the search but not the ValueError, you can use find and rfind instead, which do the same thing but return -1 when there is no match.

s = 'verdata_new_hampshire.txt'
s_trunc = s[s.index('_') + 1: s.rindex('.')]  # or s[s.find('_') + 1: s.rfind('.')]

print(s_trunc)  # new_hampshire

Of course, if you are always going to exclude verdata_ and .txt you could always hardcode the slice as well.

print(s[8:-4])  # new_hampshire
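One caveat with find and rfind: because -1 is also a valid slice index, a missing delimiter does not fail loudly but quietly produces a wrong result. The sketch below shows the pitfall and one way to guard against it. The helper name between and its choice to return None are assumptions for illustration, not part of the answer above.

```python
def between(s, start="_", end="."):
    """Return the text between the first `start` and the last `end`,
    or None if either delimiter is missing or they are out of order."""
    i = s.find(start)
    j = s.rfind(end)
    if i == -1 or j == -1 or j <= i:
        return None
    return s[i + 1:j]

print(between("verdata_new_hampshire.txt"))  # new_hampshire

# Without the guard, -1 from find/rfind is silently used as a slice index:
bad = "nodelimiters"
print(bad[bad.find("_") + 1: bad.rfind(".")])  # nodelimiter
print(between(bad))                            # None
```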

You can just split the string on the dot and the underscore, like this:

filename = "verdata_prague.txt"
name = filename.split(".")[0]  # 'verdata_prague'
name = name.split("_")[1]  # 'prague'

or use the replace function:

filename = "verdata_prague.txt"
name = filename.replace(".txt", "")  # 'verdata_prague'
name = name.replace("verdata_", "")  # 'prague'

You can leverage str.split() on strings. For example:

s = 'verdata_newyork.txt'

s.split('verdata_')
# ['', 'newyork.txt']

s.split('verdata_')[1]
# 'newyork.txt'

s.split('verdata_')[1].split('.txt')
# ['newyork', '']

s.split('verdata_')[1].split('.txt')[0]
# 'newyork'
