Problem applying str.contains across multiple columns in Python

Question

DataFrame:

col1          col2             col3
132jh.2ad3    34.2             65
298.487       9879.87          1kjh8kjn0
98.47         79.8             90
8763.3        7hkj7kjb.k23l    67
69.3          3765.9           3510

Desired output:

col1          col2             col3
98.47         79.8             90
69.3          3765.9           3510

What I tried (this does not remove all of the rows with alphanumeric values):

df=df[~df['col1'].astype(str).str.contains(r'[A-Ba-b]')] #for col1
df=df[~df['col2'].astype(str).str.contains(r'[A-Ba-b]')] #for col2
df=df[~df['col3'].astype(str).str.contains(r'[A-Ba-b]')] #for col3

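I want to delete every row that contains an alphanumeric value and keep only the rows in which all values are numbers. col1 and col2 have decimal points, but col3 has only integers.
I have tried the suggestions from some other similar threads, but they did not work.

Thank you for your help!!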

Answer 1

You can just use to_numeric:

df[df.apply(pd.to_numeric, errors='coerce').notna().all(axis=1)]

Output:

    col1    col2  col3
2  98.47    79.8    90
4   69.3  3765.9  3510
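
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question. It assumes every value is stored as a string, which the question implies but does not state; the names converted and numeric_rows are only illustrative.

```python
import pandas as pd

# Sample data from the question; every value is assumed to be a string.
df = pd.DataFrame({
    "col1": ["132jh.2ad3", "298.487", "98.47", "8763.3", "69.3"],
    "col2": ["34.2", "9879.87", "79.8", "7hkj7kjb.k23l", "3765.9"],
    "col3": ["65", "1kjh8kjn0", "90", "67", "3510"],
})

# Convert column by column; values that cannot be parsed become NaN.
converted = df.apply(pd.to_numeric, errors="coerce")

# Keep only the rows in which every column was parsed successfully.
numeric_rows = df[converted.notna().all(axis=1)]
print(numeric_rows)
```

The filter is applied to the original df, so the surviving rows keep their string values. One caveat: to_numeric also parses strings such as "1e5" as numbers, so a value like that would not be treated as alphanumeric.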

Answer 2

Run:

df[~df.apply(lambda row: row.str.contains(r'[A-Z]', flags=re.I).any(), axis=1)]

(This requires import re.)

Your regex uses [A-Ba-b], which matches only the letters A and B; it should match all letters (from A to Z).
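
To see the difference between the two character classes, here is a small sketch that runs both patterns against the alphanumeric values from the question plus one clean number. The Series name values is only illustrative.

```python
import re
import pandas as pd

values = pd.Series(["132jh.2ad3", "1kjh8kjn0", "7hkj7kjb.k23l", "98.47"])

# The character class from the question matches only A, B, a and b.
narrow = values.str.contains(r"[A-Ba-b]")

# A full A-Z range combined with re.I matches any ASCII letter.
wide = values.str.contains(r"[A-Z]", flags=re.I)

print(narrow.tolist())  # [True, False, True, False]
print(wide.tolist())    # [True, True, True, False]
```

The second value, "1kjh8kjn0", contains neither a nor b, which is why the row holding it survived the original filter.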

Edit

If you have other columns as well but want to restrict the criterion to only the 3 named columns, and assuming they are adjacent, run:

df[~df.loc[:, 'col1':'col3'].apply(lambda row:
    row.str.contains(r'[A-Z]', flags=re.I).any(), axis=1)]

This way the same function as above is applied to only these 3 columns.
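
As a rough illustration of the column-restricted version, the sketch below adds a hypothetical label column (not part of the question) that contains letters and should not affect the filter. The names has_letter and filtered are also just for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({
    # Hypothetical extra column, not in the question; it must be ignored.
    "label": ["a", "b", "c", "d", "e"],
    "col1": ["132jh.2ad3", "298.487", "98.47", "8763.3", "69.3"],
    "col2": ["34.2", "9879.87", "79.8", "7hkj7kjb.k23l", "3765.9"],
    "col3": ["65", "1kjh8kjn0", "90", "67", "3510"],
})

# Label-based slicing includes both endpoints, so this selects col1..col3.
subset = df.loc[:, "col1":"col3"]

# True for each row in which at least one of the three values has a letter.
has_letter = subset.apply(
    lambda row: row.str.contains(r"[A-Z]", flags=re.I).any(), axis=1
)

filtered = df[~has_letter]
print(filtered)
```

Rows 2 and 4 remain, and the label column is carried along untouched because the mask was computed from the slice only.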

Answer 3

Here is a solution that does not need apply (which can be slow) and uses stack instead:

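# stack into one long Series, strip the decimal points (literal match,
# not a regex), and use isnumeric to check that only digits remain
# then unstack back to the original shape and drop rows with any NaN
df[df.stack().str.replace('.', '', regex=False).str.isnumeric().unstack()].dropna()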

    col1    col2  col3
2  98.47    79.8    90
4   69.3  3765.9  3510
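
One thing to be aware of with this approach: removing every dot before calling isnumeric means a value such as "1.2.3" would also pass, since it becomes "123". If that matters, a stricter variant is sketched below. It keeps the stack/unstack idea but swaps in str.fullmatch with a pattern for digits and an optional decimal part; the pattern and the names mask and numeric_rows are my own assumptions, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["132jh.2ad3", "298.487", "98.47", "8763.3", "69.3"],
    "col2": ["34.2", "9879.87", "79.8", "7hkj7kjb.k23l", "3765.9"],
    "col3": ["65", "1kjh8kjn0", "90", "67", "3510"],
})

# Stack all cells into one long Series so the string check runs once.
stacked = df.stack()

# Assumed pattern: one or more digits, optionally followed by a single
# decimal point and more digits. The whole string must match.
is_number = stacked.str.fullmatch(r"\d+(?:\.\d+)?")

# Back to the original shape: one boolean per cell.
mask = is_number.unstack()

# Keep the rows in which every cell looked like a plain number.
numeric_rows = df[mask.all(axis=1)]
print(numeric_rows)
```

This pattern does not accept signs or exponents; it would need extending if the data can contain negative numbers.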

3 answers

Answer 1
4 2020-04-03 17:17:17

Answer 2
1 Accepted 2020-04-03 17:06:15

Answer 3
0 2020-04-03 17:12:46
