
In Python, find the index in a list if a combination of strings exists

I'm writing my first script and trying to learn Python, but I'm stuck and can't get out of this one.

I'm writing a script to change file names.

Let's say I have string = "this.is.tEst3.E00.erfeh.ervwer.vwtrt.rvwrv"

I want the result to be string = "This Is Test3 E00"

This is what I have so far:

l = list(string) 

# Transform the string into a list

for i in l:
    if "E" in l:
        p = l.index("E")
        if isinstance((p+1), int () is True:
            if isinstance((p+2), int () is True:
                delp = p+3
                a = p-3
                del l[delp:]

new = "".join(l)
new = new.replace("."," ")
print (new)

The idea is to get the index of "E" and check whether it is followed by two integers, then delete everything after the second integer.

However, this will not work if there is an "E" anywhere else.

At the moment the result I get is:

this is tEst

because it finds the index of the first "E" in the list and deletes everything from index+3 onward.

I guess my question is: how do I get the index in the list where a combination of strings exists? I can't seem to find out how.

Thanks for everyone's answers. I was going in another direction, but it is also not working. If someone could see why, it would be awesome. It is much better to learn by doing than by just copying what others write :)

This is what I came up with:

for i in l:
    if i=="E" and isinstance((i+1), int ) is True:
        p = l.index(i)
        print (p)

Can anyone tell me why this isn't working? I get an error.

Thank you so much.

Have you ever heard of a Regular Expression?

Check out Python's re module in the official documentation.

Basically, you can define a "regex" that matches "E and then two digits" and gives you the index of the match.

After that, I'd just use Python's slice notation to choose the piece of the string that you want to keep.

Then, check out the string methods: str.replace to swap the periods for spaces, and str.title to put the result in Title Case.

An easy way is to use a regex that matches everything up to and including the first E followed by 2 digits, with s as your string:

import re
# Raw string so that \d reaches the regex engine unchanged
up_until = re.match(r'(.*?E\d{2})', s).group(1)
# this.is.tEst3.E00

Then we replace each . with a space and title-case the result:

output = up_until.replace('.', ' ').title()
# This Is Test3 E00
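
One caveat with the two steps above: re.match returns None when the name contains no "E" followed by two digits, so calling .group(1) on it would raise an AttributeError. Below is a minimal sketch that wraps both steps in one function and falls back to the whole name in that case. The function name clean_name and the fallback behaviour are assumptions made for illustration; they are not part of the original answer.

```python
import re

def clean_name(s):
    """Keep everything up to the first 'E' + two digits, then tidy it up."""
    # Hypothetical helper: the name and the fallback are illustrative choices.
    m = re.match(r'(.*?E\d{2})', s)
    # Fall back to the full string if the pattern is absent.
    kept = m.group(1) if m else s
    return kept.replace('.', ' ').title()

print(clean_name('this.is.tEst3.E00.erfeh.ervwer.vwtrt.rvwrv'))  # This Is Test3 E00
```

Whether an unmatched name should be kept, skipped, or reported is a design choice that depends on the files you expect to see.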

The technique to consider using is Regular Expressions. They allow you to search for a pattern of text in a string, rather than a specific character or substring. Regular Expressions have a bit of a tough learning curve, but are invaluable to learn and you can use them in many languages, not just in Python. Here is the Python resource for how Regular Expressions are implemented:

http://docs.python.org/2/library/re.html

The pattern you are looking to match in your case is an "E" followed by two digits. In Regular Expressions (usually shortened to "regex" or "regexp"), that pattern looks like this:

E\d\d # ('\d' is the specifier for any digit 0-9)
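
To see why the pattern helps, here is a small sketch comparing a plain search for the character "E" with a search for the pattern, using the file name from the question. The variable names are just illustrative.

```python
import re

name = 'this.is.tEst3.E00.erfeh.ervwer.vwtrt.rvwrv'

# A plain character search stops at the first "E", the one inside "tEst3".
first_e = name.index('E')

# The pattern only matches an "E" that is followed by two digits.
m = re.search(r'E\d\d', name)

print(first_e)               # 9
print(m.start(), m.group())  # 14 E00
```

The plain search lands on index 9, which is why the original loop cut the name off at "tEst"; the pattern skips that "E" and lands on index 14.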

In Python, you create a string containing the regex pattern you want to match, and pass that and your file name string to the search() function of the re module. Regex patterns tend to use a lot of backslashes, so it's common in Python to prefix the pattern string with 'r' (a raw string), which tells the Python interpreter not to treat backslashes as the start of escape sequences. All of this together looks like this:

import re
filename = 'this.is.tEst3.E00.erfeh.ervwer.vwtrt.rvwrv'
match_object = re.search(r'E\d\d', filename)
if match_object:
    # Group 0 is the entire match; end(0) is the index just past its last character
    index_of_Exx = match_object.end(0)
    truncated_filename = filename[:index_of_Exx]
    # Now take care of any more processing

Regular expressions can get very detailed (and complex). In fact, you can probably accomplish your entire task of fully changing the file name using a single regex that's correctly put together. But since I don't know the full details about what sorts of weird file names might come into your program, I can't go any further than this. I will add one more piece of information: if the 'E' could possibly be lower-case, then you want to add a flag as a third argument to your pattern search which indicates case-insensitive matching. That flag is 're.I' and your search() method would look like this:

match_object = re.search(r'E\d\d', filename, re.I)
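
As a quick check of what the flag changes, here is a sketch with a made-up file name that uses a lower-case "e" (the sample name is an assumption, not taken from the question):

```python
import re

lower_name = 'this.is.test3.e00.erfeh'  # made-up sample with a lower-case "e"

# Without the flag the search is case-sensitive and finds nothing.
print(re.search(r'E\d\d', lower_name))  # None

# With re.I the same pattern also matches "e00".
m = re.search(r'E\d\d', lower_name, re.I)
print(lower_name[:m.end()])  # this.is.test3.e00
```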

Read the documentation on Python's 're' module for more information, and you can find many great tutorials online, such as this one:

http://www.zytrax.com/tech/web/regex.htm

And before you know it, you'll be a superhero. :-)

The reason why this isn't working:

for i in l:

    if i=="E" and isinstance((i+1), int ) is True:
        p = l.index(i)
        print (p)

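...is because 'i' holds a one-character string taken from the list 'l', not an integer index. Comparing it with 'E' works, but i+1 then tries to add 1 to the string 'E', which raises a TypeError before isinstance() is ever called. To look at the neighbouring characters you need the position (for example from enumerate(l)), and digits inside a string are still strings, so test them with str.isdigit() rather than isinstance(..., int).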
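
For anyone who wants to keep the loop approach rather than switch to a regex, one possible version is sketched below. It assumes the goal stated in the question (cut after the first "E" followed by two digits) and works on the string directly instead of a list; the variable names are illustrative.

```python
name = 'this.is.tEst3.E00.erfeh.ervwer.vwtrt.rvwrv'

cut = None
# enumerate() yields the index p together with the character ch.
for p, ch in enumerate(name):
    # Slicing never raises IndexError, so this is safe near the end of the string.
    nxt = name[p + 1:p + 3]
    if ch == 'E' and len(nxt) == 2 and nxt.isdigit():
        cut = p + 3  # one past the second digit
        break

# Keep the whole name if no "E" + two digits was found.
result = name if cut is None else name[:cut]
print(result.replace('.', ' ').title())  # This Is Test3 E00
```

The loop checks every "E" in turn instead of only the first one that list.index() would return, which is what the original attempt was missing.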


 