How To Extract Three Letters Followed By Five Digits Using Regex in Python
I have the following dataframe in Python:
abc12345
abc1234
abc1324
How do I extract only the values that have three letters followed by five digits?
The desired result would be:
abc12345
df.column.str.extract('[^0-9](\d\d\d\d\d)$')
I think this works, but is there a better way to write (\d\d\d\d\d)? What if I had something like 30 digits? Then I would have to type \d 30 times, which is inefficient.
You should be able to use:
'[a-zA-Z]{3}\d{5}'
If the strings don't include capital letters, this can be reduced to:
'[a-z]{3}\d{5}'
Change the values in the {x} quantifiers to adjust the number of characters to match.
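One way the pattern could be applied to the dataframe from the question is sketched below. This is a minimal example under stated assumptions: the column is named `column` (as in the asker's `df.column`), the sample data is made up to mirror the question, and `Series.str.fullmatch` (available in pandas 1.1 and later) is used instead of `str.extract` so that the whole value has to fit the pattern.

```python
import pandas as pd

# Hypothetical data mirroring the question; the column name "column" is an assumption.
df = pd.DataFrame({"column": ["abc12345", "abc1234", "abc1324"]})

# {3} and {5} are repetition counts, so \d{30} would cover the 30-digit case.
pattern = r"[a-zA-Z]{3}\d{5}"

# fullmatch is True only when the entire string fits the pattern,
# so values with extra leading or trailing characters are rejected.
mask = df["column"].str.fullmatch(pattern)

matches = df.loc[mask, "column"]
print(matches.tolist())  # ['abc12345']
```

If only the digits are wanted rather than the whole value, wrapping `\d{5}` in parentheses and passing the pattern to `str.extract` should return just that group.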
Or use code like the following:
import re

s = "abc12345"
p = re.compile(r"\d{5}")  # exactly five digits
c = p.match(s, 3)  # start matching at index 3, skipping the three letters
if c:  # match() returns None when five digits are not found there
    print(c.group())  # 12345
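Note that the snippet above only checks for five digits starting at index 3; it does not verify the three letters, and it would also accept a longer digit run. A stricter variant could look like the sketch below. The helper name `digits_after_prefix` and its parameters are hypothetical, introduced only for illustration.

```python
import re

def digits_after_prefix(text, n_letters=3, n_digits=5):
    """Return the digit run if text is exactly n_letters letters
    followed by n_digits digits, otherwise None."""
    # Doubled braces are literal braces in an f-string, so the defaults
    # build the pattern [a-zA-Z]{3}(\d{5}).
    pattern = re.compile(rf"[a-zA-Z]{{{n_letters}}}(\d{{{n_digits}}})")
    m = pattern.fullmatch(text)  # the whole string must fit the pattern
    return m.group(1) if m else None

print(digits_after_prefix("abc12345"))  # 12345
print(digits_after_prefix("abc1234"))   # None
```

Passing `n_digits=30` covers the 30-digit case without repeating `\d`.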