Finding the first True value of each block in a pandas DataFrame

Question

I have a DataFrame in which one column contains only True or False values, grouped in blocks. For example:

df =   
            b
  0     False
  1      True
  2      True
  3     False
  4      True
  5      True
  6      True
  7      True
  8     False
  9     False
 10     False
 11     False
 12     False
 13      True
 14      True
 15      True

I need to find the start of each True block:

>> find_first_true(df)
>> array([1, 4, 13])

Is there an elegant solution?

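Edit

Thanks for the suggested solutions. What is the simplest way to extract blocks of a given length relative to the indices I found?

For example, I need to take the block of length 4 (number of rows) before each index. So, if my (previously found) indices are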

index = array([1, 4, 13])

then I need the blocks:

[df.loc[0:4], df.loc[9:13]]

or

            b
  0     False
  1      True
  2      True
  3     False
  4      True
  9     False
 10     False
 11     False
 12     False
 13      True

I am currently looping over the indices, but I would like to know a more pandas-idiomatic solution.

Answer 1

In [2]: df = pd.read_clipboard()
In [3]: df
Out[3]:
        b
0   False
1    True
2    True
3   False
4    True
5    True
6    True
7    True
8   False
9   False
10  False
11  False
12  False
13   True
14   True
15   True
In [11]: np.where(((df.b != df.b.shift(1)) & df.b).values)[0]
Out[11]: array([ 1,  4, 13], dtype=int64)
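The expression above works because `df.b != df.b.shift(1)` is True wherever a row differs from the row before it, and `& df.b` keeps only those changes that land on a True row. A minimal sketch of the same idea wrapped in a function follows. The function name `find_first_true` comes from the question; the `col` parameter, the hand-built sample frame (instead of `pd.read_clipboard()`), and the use of `fill_value=False` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Sample column from the question, rebuilt by hand instead of pd.read_clipboard()
df = pd.DataFrame({"b": [False, True, True, False, True, True, True, True,
                         False, False, False, False, False, True, True, True]})

def find_first_true(df, col="b"):
    """Return the positions where a run of True values starts in df[col]."""
    s = df[col].astype(bool)
    # shift(1, fill_value=False) treats the row before the first one as False,
    # so a block that starts at position 0 is reported as well
    starts = s & ~s.shift(1, fill_value=False)
    return np.flatnonzero(starts.to_numpy())

print(find_first_true(df))  # [ 1  4 13]
```

The result holds positions rather than index labels; the two coincide here only because the frame uses the default integer index.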

Answer 2

def find_first_true(df):
    # positions of all True elements in column 'b'
    # (iterating over df itself would yield column names, not values)
    a = [i for i, v in enumerate(df['b']) if v]

    # keep only the positions that do not directly follow another True position
    return [x for k, x in enumerate(a) if k == 0 or x - a[k - 1] != 1]

Answer 3

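find_first = []
for i in range(len(df)):
    # a block starts at a True row whose previous row is False (or that is the first row);
    # looking back instead of ahead avoids a KeyError on the last row
    # (assumes the default integer index, so labels equal positions)
    if df.loc[i, 'b'] and (i == 0 or not df.loc[i - 1, 'b']):
        find_first.append(i)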


Answer 1
3 accepted 2017-07-31 13:35:00

Answer 2
1 2017-07-31 13:43:05

Answer 3
1 2017-07-31 13:43:18
