
How do I match a string up to a number in python using regular expressions?

I'm new to regular expressions, but I'd like to match a string up to when the numbers start.

So just say I have:

EEEE1234

Then I would like to extract only:

EEEE

I tried searching, but I find regular expressions confusing and the best way I think is through examples. Any thoughts? Also, any insight into any regex code generators or good tutorials on this?

Use \D to mean "not a digit":

r"^\D+"

Example:

import re

s = "EEEE1234"
print(re.match(r"^\D+", s).group(0))

See it working online: ideone
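One thing the one-liner above does not cover: re.match returns None when the string does not start with a non-digit, so calling .group(0) on it would raise an AttributeError. Here is a minimal sketch that guards against that. The helper name leading_non_digits is just made up for illustration, and returning an empty string on no match is an assumption about what you'd want.

```python
import re

def leading_non_digits(text):
    """Return the run of non-digit characters at the start of text, or '' if there is none."""
    # re.match only ever matches at the start of the string, so no ^ is needed here.
    match = re.match(r"\D+", text)
    return match.group(0) if match else ""

print(leading_non_digits("EEEE1234"))        # EEEE
print(repr(leading_non_digits("1234EEEE")))  # ''
```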

You've already got some recommendations for tutorials, but I'd also add that if you haven't yet seen the documentation for the re module, you should bookmark it and read it after you've worked through a more basic tutorial. The documentation is not beginner level, but it has some very useful tips that are specific to using regular expressions in Python, and there are also some examples near the end.

  • \d = one digit (numbers 0 through 9)
  • \D = one non-digit
  • \D+ = one or more non-digits
  • \D+\d = one or more non-digits followed by one digit
  • (\D+)\d = one or more non-digits captured in a group followed by one digit
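If it helps to see those building blocks in action, here is a small sketch that runs each pattern from the list against the sample string from the question. The variable names (sample, patterns, results) are only for illustration, and re.search is used so every pattern reports its first match.

```python
import re

sample = "EEEE1234"

# The patterns from the list above, from simplest to most specific.
patterns = [r"\d", r"\D", r"\D+", r"\D+\d", r"(\D+)\d"]

results = {}
for pattern in patterns:
    m = re.search(pattern, sample)
    # group(0) is the whole text the pattern matched.
    results[pattern] = m.group(0) if m else None
    print(pattern, "->", results[pattern])

# With a capturing group, group(1) holds only the part inside the parentheses.
captured = re.search(r"(\D+)\d", sample).group(1)
print("group(1):", captured)
```

Note how \D+\d matches EEEE1 as a whole, while the capturing group lets you pull out just EEEE.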

So, if you have a string

s = 'EEEE1234'

then you can import re and use re.match to match the regular expression against the string:

re.match(r'(\D+)\d', s)

This gives you a match object, from which you can extract the contents of the group:

re.match(r'(\D+)\d', s).group(1)

This evaluates to EEEE.

Perhaps one thing that might help is to view regular expressions as a tool that, first of all, performs matching operations. The searching, substitution, and string splitting are all consequences of this ability. One example, depending on how you'd like to extract the desired parts:

r"^(\D+)\d*"

This regex uses a capturing group that you can later reference.
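To make the capturing group concrete, here is a rough sketch comparing this pattern with the stricter (\D+)\d form. The two helper names are hypothetical, and returning None on no match is just one way to handle it. The practical difference is that \d* makes the digits optional, so a string with no digits at all still matches.

```python
import re

def prefix_before_digits(text):
    # ^(\D+)\d* : capture the leading non-digits; trailing digits are optional.
    m = re.match(r"^(\D+)\d*", text)
    return m.group(1) if m else None

def prefix_requiring_digit(text):
    # (\D+)\d : same capture, but at least one digit must follow.
    m = re.match(r"(\D+)\d", text)
    return m.group(1) if m else None

for text in ("EEEE1234", "EEEE", "1234"):
    print(text, prefix_before_digits(text), prefix_requiring_digit(text))
```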

For learning purposes, there are many resources, as has been mentioned. If you're interested in how regexes work, or want to understand them a little better, you may want to read a bit about regular languages.

If we're specifically looking for where the letters meet the numbers, I'd do something like this (with s being the input string):

re.search(r'[a-zA-Z]+(?=\d+)', s)

This matches the letters only when they are followed by digits, but does not include the digits in the match. That way you also get to avoid groups, which can be messy.
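A quick sketch of how the lookahead behaves on a few inputs. The test strings other than EEEE1234 are made up for illustration. Since a lookahead only checks what comes next without consuming it, (?=\d) would behave the same as (?=\d+) here.

```python
import re

# Letters that are immediately followed by at least one digit.
pattern = re.compile(r"[a-zA-Z]+(?=\d+)")

lookahead_results = {}
for text in ("EEEE1234", "EEEE", "ab12cd34"):
    m = pattern.search(text)
    # group(0) contains only the letters; the digits are checked but not consumed.
    lookahead_results[text] = m.group(0) if m else None
    print(text, "->", lookahead_results[text])

# findall returns every letter run that is followed by a digit.
all_runs = pattern.findall("ab12cd34")
print(all_runs)
```

Unlike \D+, the character class [a-zA-Z] only accepts ASCII letters, so punctuation or spaces before the digits would not be part of the match.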
