How to select rows containing a specific substring at a given position - python

Question

I am working with a large DataFrame that looks like this:

     id      time1      time2   data    
0   id1   06:24:00   06:24:00      A
1   id2   07:24:00   07:24:00      A
2   id3   08:24:00   08:24:00      B

I want to select all rows where time1 and/or time2 has the form 23:xx:yy.

I tried the following code, but it is very slow, so I am looking for a more efficient approach:

list_ = list()

for idx in df.index:
    if ('23' in df.time1[:2]) | ('23' in df.time2[:2]):
        list_.append(df.loc[df.index == idx])  ###--- Here I wanted to get a list of indexes so I could do a simple df.loc[] afterward

I also tried the following, but every variant raised an error:

df.loc[df.time1[:2] == '23']
df.loc['23' in df.time1[:2]]
df[df.time1[:2].str.contains('23')]

> IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match)

Is there a way to do this? Any help would be appreciated.

Answer 1

Use Series.str.startswith and combine the resulting masks with | for bitwise OR or & for bitwise AND:

df[df.time1.str.startswith('23') | df.time2.str.startswith('23')]
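As a minimal sketch of the line above, the following builds a small frame in the question's layout and applies the mask. The rows id2 and id4 and their times are made-up sample values, added only so that something matches.

```python
import pandas as pd

# Hypothetical sample data in the question's layout; the 23:xx:yy rows are invented.
df = pd.DataFrame({
    "id": ["id1", "id2", "id3", "id4"],
    "time1": ["06:24:00", "23:05:10", "08:24:00", "22:59:59"],
    "time2": ["06:24:00", "07:24:00", "08:24:00", "23:00:01"],
    "data": ["A", "A", "B", "B"],
})

# Each startswith call returns a boolean Series with the same index as df,
# so the two masks can be combined element-wise with |.
mask = df["time1"].str.startswith("23") | df["time2"].str.startswith("23")

selected = df[mask]
print(selected)
```

Because the mask is computed for the whole column at once, no Python-level loop over the index is needed, which is usually where the speed-up over the loop in the question comes from.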

If you want to compare the first 2 characters of each string, add str[:2] for indexing:

df[df.time1.str[:2].eq('23') | df.time2.str[:2].eq('23')]
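The difference between slicing the Series and slicing its strings is probably what caused the errors in the question: df.time1[:2] returns the first two rows, so the resulting boolean Series is shorter than the frame and cannot be aligned with it. A small sketch with an invented Series called times illustrates this:

```python
import pandas as pd

# Hypothetical values; the last one contains "23" but not at the start.
times = pd.Series(["23:15:00", "06:24:00", "12:23:00"], name="time1")

first_two_rows = times[:2]        # slices the Series: keeps rows 0 and 1
first_two_chars = times.str[:2]   # slices every string: "23", "06", "12"

# Comparing the per-string prefix yields one boolean per row.
mask = first_two_chars.eq("23")

print(len(first_two_rows), len(first_two_chars))
print(mask.tolist())
```

The third value also shows why a plain substring test such as str.contains('23') would be too loose here: it matches the minutes in 12:23:00 as well.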

Answer 2

To add to jezrael's answer: if the column data is datetime, you can do this:

df[(df.time1.dt.hour == 23)|(df.time2.dt.hour == 23)]
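The .dt accessor only works on datetime-like columns. If the times are stored as strings, as they appear to be in the question, one option is to parse them first. The sketch below assumes the HH:MM:SS format shown in the question and uses invented sample rows.

```python
import pandas as pd

# Hypothetical sample data; times are stored as strings.
df = pd.DataFrame({
    "id": ["id1", "id2", "id3"],
    "time1": ["06:24:00", "23:24:00", "08:24:00"],
    "time2": ["06:24:00", "07:24:00", "23:59:59"],
})

# Parse the strings so that the .dt accessor becomes available.
# With only a time in the format, pandas fills in a default date.
t1 = pd.to_datetime(df["time1"], format="%H:%M:%S")
t2 = pd.to_datetime(df["time2"], format="%H:%M:%S")

# Compare the hour component instead of the string prefix.
late = df[(t1.dt.hour == 23) | (t2.dt.hour == 23)]
print(late)
```

Parsing adds some cost, so for a one-off filter on string columns the startswith approach may be simpler; the datetime route pays off when the columns are needed as times elsewhere.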

Answer 1
5 Accepted 2020-10-15 09:59:44

Answer 2
2 2020-10-15 10:02:54
