
How to extract desired pattern out of line (string)

I am trying to match a pattern against a given string (eventually I will read lines from a file, but for now I use an explicit string just to see how it works), but the script does not behave as I want for this line.

import re

regex = '.+0+[0-9]+.'
string = "Your order number is 0000122995"

print (re.match(regex,string))

What I am trying to achieve is to find this 0000* number and assign it to a variable (which I would like to write to Excel later), but the given regex matches the whole line, which is not what I want (I know that is because of the syntax). Any tips on how to overcome this?

If you want to locate a match anywhere in a string, use re.search() instead of re.match(). re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string.

import re
regex = r'(0{4}\d+)'
string = "Your order number is 0000122995"

print (re.search(regex, string).group(0))

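re.search() and re.match() return a match object if there is a match, and None otherwise. Calling match.group() with no argument (or with 0) returns the entire match; passing one or more group numbers returns the corresponding subgroups.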
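Because both functions return None when nothing matches, calling .group() directly on the result can raise AttributeError. The following is a minimal sketch, using the string from the question, that contrasts the two functions and guards against a missing match. The variable names (text, pattern, at_start, anywhere, order_number) are chosen here for illustration only.

```python
import re

text = "Your order number is 0000122995"
pattern = r"0{4}\d+"  # four zeros followed by one or more digits

# re.match only tries at position 0, so it finds nothing here.
at_start = re.match(pattern, text)

# re.search scans the whole string and finds the number.
anywhere = re.search(pattern, text)

# Guard against None before calling .group().
order_number = anywhere.group(0) if anywhere else None

print(at_start)      # None
print(order_number)  # 0000122995
```

The value stays a string, which preserves the leading zeros if it is later written to a spreadsheet.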

See the re.search() documentation for more information.

In your case, if you expect your input to be as consistent as what you've shown, the following will work (it ignores "Your order number is " and captures everything after it until it hits whitespace or the end of the string):

import re

def findOrder():
    string = "Your order number is 0000122995"
    # Capture the run of non-whitespace characters after the fixed prefix.
    # A raw string keeps \S from being treated as a string escape.
    arrayAnswer = re.findall(r'Your order number is (\S+)', string)
    print('Your number in an Array is:')
    print(arrayAnswer)
    print('')
    print('Your number(s) output as a "string(s)" is/are:')
    for order in arrayAnswer:
        print(order)


Run this by making sure to call findOrder(). If you want to get a little more "regexy", note that what you want consists exclusively of digits; the version below skips the letters and spaces and returns only the digits:

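import re

def findOrder():
    string = "Your order number is 0000122995"
    # Skip a run of letters and whitespace, then capture the digits that follow.
    # A raw string keeps \s and \d from being treated as string escapes.
    arrayAnswer = re.findall(r'[a-zA-Z\s]+(\d+)', string)
    print('Your number in an Array is:')
    print(arrayAnswer)
    print('')
    print('Your number(s) output as a "string(s)" is/are:')
    for order in arrayAnswer:
        print(order)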

Again, run this by making sure to call findOrder().

The output for both should be:

>>> findOrder()
Your number in an Array is:
['0000122995']

Your number(s) output as a "string(s)" is/are:
0000122995

I suspect, though, you might want to work with a query longer than the string you posted. Post that if you need anything further.
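For longer input, such as the lines read from a file mentioned in the question, a sketch along these lines may help. It assumes order numbers always start with four zeros, as in the example, and the sample lines, ORDER_RE, and extract_orders are hypothetical names and data made up for illustration. It returns the numbers rather than printing them, so they can be passed on to whatever writes the spreadsheet.

```python
import re

# Hypothetical sample lines standing in for lines read from a file.
lines = [
    "Your order number is 0000122995",
    "No order here",
    "Your order number is 0000456001 and thanks",
]

# Assumption: an order number is four zeros followed by more digits,
# delimited by word boundaries.
ORDER_RE = re.compile(r"\b0{4}\d+\b")

def extract_orders(lines):
    """Return every order number found, in input order, as strings."""
    found = []
    for line in lines:
        found.extend(ORDER_RE.findall(line))
    return found

orders = extract_orders(lines)
print(orders)  # ['0000122995', '0000456001']
```

With a real file, the list could be replaced by iterating over the open file object, since extract_orders only needs an iterable of strings.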


 