
How to extract only alphabetical characters using regex

I am trying to extract only the alphabetical portion of a string, excluding anything in parentheses and any alphanumeric tokens. My current code extracts every run of letters, including the letters inside the alphanumeric tokens.

df['desc'] = df['description'].str.findall(r'[a-zA-Z]+')

AERONAUTICAL MOBILE (OR) AUS52 AUS57 AUS58 AUS101

How do I get only AERONAUTICAL MOBILE from this string using regex?

Assuming that the all-alphabetic portion of the description always starts at the beginning of the string, we can use str.extract as follows:

import re  # needed for the re.I (case-insensitive) flag

df["desc"] = df["description"].str.extract(r'^([a-z]+(?: [a-z]+)*)', flags=re.I)
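To see what the two patterns do on the sample string, here is a minimal sketch using only the standard `re` module. The helper `leading_words` and the names `LEADING_WORDS` and `LEADING_WORDS_STRICT` are hypothetical and not part of the original answer. The strict variant is an assumption on my part: it adds a trailing `\b` in case a description has no parenthesised part between the words and the codes, where the original pattern would also pick up the letters of the first code.

```python
import re

sample = "AERONAUTICAL MOBILE (OR) AUS52 AUS57 AUS58 AUS101"

# The question's pattern: every run of letters, including the letters
# inside "(OR)" and inside alphanumeric tokens such as "AUS52".
all_runs = re.findall(r"[a-zA-Z]+", sample)

# The answer's pattern: letter-only words separated by single spaces,
# anchored at the start of the string.
LEADING_WORDS = re.compile(r"^([a-z]+(?: [a-z]+)*)", flags=re.I)

# Hypothetical stricter variant: the trailing \b forces the match to end on a
# word boundary, so it backtracks out of a partial token like "AUS" in "AUS52".
LEADING_WORDS_STRICT = re.compile(r"^([a-z]+(?: [a-z]+)*)\b", flags=re.I)


def leading_words(text, pattern=LEADING_WORDS):
    """Return the leading all-letter words of text, or None if there are none."""
    match = pattern.search(text)
    return match.group(1) if match else None


print(all_runs)
print(leading_words(sample))
print(leading_words("AERONAUTICAL MOBILE AUS52"))
print(leading_words("AERONAUTICAL MOBILE AUS52", LEADING_WORDS_STRICT))
```

On the sample string both variants stop before the opening parenthesis. If your data can contain rows without a parenthesised part, the stricter pattern can be passed to str.extract in the same way.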
