Deleting rows from a Python DataFrame using a condition

Question

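I am trying to remove unwanted rows after importing data from files and concatenating my list of DataFrames. This is what my current DataFrame looks like: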

                            Best Movie
0                        Movie: Orphan
1                                   2.
2                        Movie: Avatar
3                                   3.
4          Movie: Inglourious Basterds
...                                ...
2371  Movie: The Deep End of the Ocean
2372                               49.
2373         Movie: Drop Dead Gorgeous
2374                               50.
2375                         Movie: Go

I need to remove all rows that contain only a number, so that the result looks like this:

                            Best Movie
0                        Movie: Orphan
2                        Movie: Avatar
4          Movie: Inglourious Basterds
...                                ...
2371  Movie: The Deep End of the Ocean
2373         Movie: Drop Dead Gorgeous
2375                         Movie: Go

Thank you for your help.

Answer 1

One solution is to use str.match:

mask = ~df["Best Movie"].str.match(r"^\s*\d+\.$")
res = df[mask]
print(res)

Output

                         Best Movie
0                     Movie: Orphan
2                     Movie: Avatar
4       Movie: Inglourious Basterds
5  Movie: The Deep End of the Ocean
7         Movie: Drop Dead Gorgeous
9                         Movie: Go
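For anyone who wants to run this end to end, here is a minimal self-contained sketch. The ten-row sample frame is an assumption reconstructed from the output above, not the asker's real data; only the column name "Best Movie" comes from the question.

```python
import pandas as pd

# Hypothetical sample mirroring the question's layout (not the real data).
df = pd.DataFrame({
    "Best Movie": [
        "Movie: Orphan", "2.", "Movie: Avatar", "3.",
        "Movie: Inglourious Basterds", "Movie: The Deep End of the Ocean",
        "49.", "Movie: Drop Dead Gorgeous", "50.", "Movie: Go",
    ]
})

# True for rows that are only digits followed by a literal dot, e.g. "49."
is_number_row = df["Best Movie"].str.match(r"^\s*\d+\.$")

# Keep everything else; the original index labels are preserved.
res = df[~is_number_row]
print(res)
```

The mask is inverted with `~` so that the number-only rows are the ones dropped.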

Update

To strip the "Movie:" prefix and reset the index, do the following:

res = df[mask].reset_index(drop=True)
res = res["Best Movie"].str.replace(r"^\s*Movie:\s*", "", regex=True)
print(res)

Output

0                        Orphan
1                        Avatar
2          Inglourious Basterds
3     The Deep End of the Ocean
4            Drop Dead Gorgeous
5                            Go
Name: Best Movie, dtype: object
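Selecting the column as in the update returns a Series. If you would rather keep a DataFrame, something along these lines should work. This is a sketch on a made-up four-row sample; the name `clean` is mine, not from the question.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"Best Movie": ["Movie: Orphan", "2.", "Movie: Avatar", "3."]})

mask = ~df["Best Movie"].str.match(r"^\s*\d+\.$")

# drop=True discards the old index labels instead of adding an "index" column.
clean = df[mask].reset_index(drop=True)

# Remove the prefix together with the spaces that follow it.
clean["Best Movie"] = clean["Best Movie"].str.replace(
    r"^\s*Movie:\s*", "", regex=True
)
print(clean)
```

The trailing `\s*` in the pattern avoids leaving a leading space in front of each title.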

Answer 2

You can do:

df.loc[~df['Best Movie'].str.match(r'^\d+\.$')]
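One detail worth checking in patterns like this is whether the dot is escaped. An unescaped `.` matches any character, so a value made of digits alone would also match. A small sketch on made-up values (the Series `s` is purely illustrative):

```python
import pandas as pd

# Hypothetical values; "25" has no trailing dot.
s = pd.Series(["Movie: Orphan", "2.", "49.", "25"])

loose = s.str.match(r"^\d+.$")    # "." matches any single character
strict = s.str.match(r"^\d+\.$")  # "\." matches only a literal dot

print(loose.tolist())   # [False, True, True, True]
print(strict.tolist())  # [False, True, True, False]
```

With the data shown in the question both patterns give the same result, so the difference only matters if other digit-only strings can appear.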

Answer 3

Sample input

import pandas as pd

df = pd.DataFrame({
    "Best_Movie": ["Movie: Orphan", "2.", "Movie: Avatar", "3."]
})

Apply pd.to_numeric. Rows that contain only a number are converted to floats; all other rows become NaN.

df["nums"] = pd.to_numeric(df['Best_Movie'], errors='coerce')

Extract the rows that contain text (i.e. the rows marked NaN):

df.loc[df.nums.isnull(), "Best_Movie"]

Sample output

0    Movie: Orphan
2    Movie: Avatar
Name: Best_Movie, dtype: object
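If you do not want the helper column to stay in the frame, the same idea can be written with a temporary mask. This is a sketch using the sample input from this answer; the variable name `is_text` is mine.

```python
import pandas as pd

df = pd.DataFrame({"Best_Movie": ["Movie: Orphan", "2.", "Movie: Avatar", "3."]})

# NaN marks values that could not be parsed as numbers, i.e. the title rows.
is_text = pd.to_numeric(df["Best_Movie"], errors="coerce").isna()

# Boolean indexing keeps the DataFrame shape and adds no extra column.
res = df[is_text]
print(res)
```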

Answer 4

Try the following. Here '|' means "or" in the regular expression:

df[~df['Best Movie'].str.contains('|'.join(str(i) for i in range(10)))]
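Be aware that a "contains any digit" test also drops titles that happen to include a number. The sketch below compares it with a stricter whole-string test; the row "Movie: 2012" is a hypothetical value added only to show the difference, and `str.fullmatch` assumes pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical sample; "Movie: 2012" is included to expose the edge case.
s = pd.Series(["Movie: Orphan", "2.", "Movie: 2012", "3."])

# True if any digit 0-9 appears anywhere in the string.
any_digit = s.str.contains("|".join(str(i) for i in range(10)))

# True only if the whole string is digits followed by a dot.
only_number = s.str.fullmatch(r"\s*\d+\.")

print(s[~any_digit].tolist())    # ['Movie: Orphan']
print(s[~only_number].tolist())  # ['Movie: Orphan', 'Movie: 2012']
```

If none of your titles contain digits, both filters give the same result.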

Answer 1
2 2022-07-29 18:28:00

Answer 2
1 2022-07-29 18:32:32

Answer 3
0 2022-07-29 18:29:09

Answer 4
0 2022-07-29 18:29:14
