Python - Find words that start and end with a vowel in a dataframe column

Question

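I am trying to find the words in a dataframe column that start and end with a vowel.

First, I cannot find a regex approach that matches all words starting with any vowel. I can only match words that start with one specific vowel.

This is the code I am using: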

import pandas as pd
# import the CSV file
sales_data = pd.read_csv("data/sales-data.csv")

#Words starting with 'A'. This works
Vowels1 = sales_data[sales_data['CUSTOMERNAME'].str.startswith('A')]

#Words starting with vowel. This doesn't work. Why?
Vowels2 = sales_data[sales_data['CUSTOMERNAME'].str.startswith(r'[aeiouAEIOU]')]

Second, how do I add the condition that a word both starts and ends with a vowel (at the same time)?

#This should work, but it doesn't.
Vowels3 = sales_data[sales_data['CUSTOMERNAME'].str.startswith(r'^[aeiou].*[aeiou]$')]

The message I get for Vowels2 and Vowels3 is:
Empty DataFrame
Columns: [ORDERID, ORDERPRICE, ORDERDATE, STATUS, PRODUCTLINE, PRODUCTCODE, CUSTOMERNAME, CITY, COUNTRY]
Index: []

Thanks

Answer 1

You can use str.contains here:

import re
Vowels3 = sales_data[sales_data['CUSTOMERNAME'].str.contains(r'^[aeiou].*[aeiou]\.?$', flags=re.IGNORECASE)]
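The question's attempts return an empty frame because `Series.str.startswith` treats its argument as a literal prefix rather than a regular expression, so it looks for names that literally begin with the characters `[aeiouAEIOU]`. `Series.str.contains` interprets the pattern as a regex by default. A minimal sketch of the difference, using made-up customer names (the sample data and the variable names `literal`, `starts_mask` and `both_mask` are assumptions, not taken from the asker's CSV):

```python
import re

import pandas as pd

# Hypothetical sample data standing in for data/sales-data.csv
sales_data = pd.DataFrame(
    {
        "CUSTOMERNAME": [
            "Atelier graphique",
            "Euro Shopping Channel",
            "Land of Toys Inc.",
            "Osaka Souveniers Co.",
        ]
    }
)
names = sales_data["CUSTOMERNAME"]

# startswith does a literal prefix comparison, so nothing matches
literal = names.str.startswith(r"[aeiouAEIOU]")

# Words starting with any vowel: anchor the character class with ^
starts_mask = names.str.contains(r"^[aeiou]", flags=re.IGNORECASE)

# Words starting and ending with a vowel; \.? tolerates a trailing period
both_mask = names.str.contains(r"^[aeiou].*[aeiou]\.?$", flags=re.IGNORECASE)
Vowels3 = sales_data[both_mask]
```

If the column may contain missing values, passing `na=False` to `str.contains` keeps the mask purely boolean so it can be used for indexing.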

Answer 2

startswith and endswith accept a tuple, so you can use them:

vowels = ('a','e','i','o','u','A','E','I','O','U')
if myword.startswith(vowels) and myword.endswith(vowels):
    print("Yes")
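To run this check over many names rather than a single `myword`, the tuple test can be wrapped in a small helper. A sketch, assuming a hypothetical helper name `starts_and_ends_with_vowel` and made-up sample names. Python's `str.startswith` and `str.endswith` return `False` for an empty string when given non-empty prefixes, so no special case is needed here:

```python
VOWELS = ("a", "e", "i", "o", "u", "A", "E", "I", "O", "U")


def starts_and_ends_with_vowel(word: str) -> bool:
    """Return True if word begins and ends with a vowel (hypothetical helper)."""
    return word.startswith(VOWELS) and word.endswith(VOWELS)


# Made-up sample names, including an empty string
names = ["Atelier graphique", "Euro Shopping Channel", "Anna", ""]
matches = [name for name in names if starts_and_ends_with_vowel(name)]
```

On a dataframe column, the helper could be passed to `Series.apply` to build a boolean mask, for example `sales_data[sales_data['CUSTOMERNAME'].apply(starts_and_ends_with_vowel)]`, provided the column contains only strings.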

Answer 3

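Because you only care about the first and last letters, you don't need the overhead of a regex, or even of startswith, which searches for sequences.

Instead, you can apply the lambda lam to the column: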

v = ('a','e','i','o','u','A','E','I','O','U')
lam = lambda word: word[0] in v and word[-1] in v

Note that this does not handle the empty-string case.
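A sketch of how `lam` might be applied to the column, with a guard added for empty strings. The guard, the non-string check and the sample data are additions for illustration and are not part of the original answer:

```python
import pandas as pd

v = ("a", "e", "i", "o", "u", "A", "E", "I", "O", "U")

# Guarded variant: non-strings (e.g. None/NaN) and empty strings yield False
lam = lambda word: (
    isinstance(word, str) and len(word) > 0 and word[0] in v and word[-1] in v
)

# Hypothetical sample data with an empty string and a missing value
sales_data = pd.DataFrame(
    {"CUSTOMERNAME": ["Atelier graphique", "", None, "Land of Toys Inc."]}
)

# apply calls lam once per cell and returns a boolean mask
mask = sales_data["CUSTOMERNAME"].apply(lam)
Vowels3 = sales_data[mask]
```

Without the `len(word) > 0` check, `word[0]` would raise an `IndexError` on an empty string, which is the case the note above warns about.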

3 answers

Answer 1
1 2021-03-13 15:34:16

Answer 2
0 2021-03-13 15:37:37

Answer 3
0 2021-03-13 15:43:45
