How to split a DataFrame into multiple DataFrames based on header rows

Question

I need to split a DataFrame into 3 separate DataFrames based on a header row that reappears within it.

My DataFrame looks like this:

        0         1             2     ....   14
0   Alert     Type      Response           Cost
1     w1        x1            y1            z1
2     w2        x2            y2            z3
.      .         .             .             .
.      .         .             .             .
144 Alert     Type      Response           Cost
145   a1        b1            c1             d1
146   a2        b2            c2             d2

I tried to get the index numbers of the rows that contain the word "Alert" and then slice with loc to get the sub-DataFrames.

indexes = df.index[df.loc[df[0] == "Alert"]].tolist()

But this returns:

IndexError: arrays used as indices must be of integer (or boolean) type

Any hints about this error, or perhaps a different approach that I'm not seeing (for example, something like group by)?

Thanks for your help.

Answer 1 (accepted, 2019-06-06 16:07:19)

`np.split`

import numpy as np

dfs = np.split(df, np.flatnonzero(df[0] == 'Alert')[1:])

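Explanation

Find the positions where df[0] equals 'Alert':
```
np.flatnonzero(df[0] == 'Alert')
```
Drop the first position, because splitting at position 0 would produce an empty first element in the list:
```
np.flatnonzero(df[0] == 'Alert')[1:]
```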
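
To see why the first position is dropped, here is a small sketch on a toy column (the data is made up for illustration). Splitting a plain array at position 0 leaves an empty leading piece:

```python
import numpy as np
import pandas as pd

# Toy first column with the header value repeated at positions 0 and 3 (illustrative data).
col = pd.Series(["Alert", "w1", "w2", "Alert", "a1", "a2"])

# Integer positions of the rows that hold the header value.
positions = np.flatnonzero(col == "Alert")
print(positions)

# Splitting at position 0 yields an empty first piece,
# which is why the answer keeps only positions[1:].
with_zero = np.split(np.arange(len(col)), positions)
without_zero = np.split(np.arange(len(col)), positions[1:])
print([len(p) for p in with_zero])     # piece lengths when 0 is kept
print([len(p) for p in without_zero])  # piece lengths when 0 is dropped
```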

Use np.split to get the list of pieces:

 np.split(df, np.flatnonzero(df[0] == 'Alert')[1:])

Display the results:

print(*dfs, sep='\n\n')

      0     1         2     14
0  Alert  Type  Response  Cost
1     w1    x1        y1    z1
2     w2    x2        y2    z3

        0     1         2     14
144  Alert  Type  Response  Cost
145     a1    b1        c1    d1
146     a2    b2        c2    d2
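
Recent pandas releases may emit a deprecation warning when a DataFrame is passed to `np.split`. As a hedged alternative, the same positions can drive plain `iloc` slices. The sketch below uses made-up data and a hypothetical helper name, `split_on_header`; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

def split_on_header(df, column=0, marker="Alert"):
    """Split df into pieces that each start at a row whose `column` equals `marker`.

    Hypothetical helper for illustration; rows before the first marker are dropped.
    """
    starts = np.flatnonzero(df[column] == marker).tolist()
    ends = starts[1:] + [len(df)]
    return [df.iloc[s:e] for s, e in zip(starts, ends)]

# Illustrative data shaped like the frame in the question.
df = pd.DataFrame([
    ["Alert", "Type", "Response", "Cost"],
    ["w1", "x1", "y1", "z1"],
    ["w2", "x2", "y2", "z3"],
    ["Alert", "Type", "Response", "Cost"],
    ["a1", "b1", "c1", "d1"],
    ["a2", "b2", "c2", "d2"],
])

dfs = split_on_header(df)
print(*dfs, sep="\n\n")
```

Each piece keeps its original index labels, just as in the output above.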

Answer 2 (2019-06-06 16:25:02)

@piRSquared's answer works well, so let me explain your error instead.

Here is how to get the index of the rows whose first element is Alert:

indexes = list(df.loc[df[0] == "Alert"].index)

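Your error occurs because df.index is a pandas.RangeIndex object, and it cannot be indexed with the DataFrame that df.loc[...] returns; as the message says, it only accepts integer or boolean indexers.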
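
As a small sketch of that point (toy data, assumed for illustration), the boolean mask can be passed to `df.index` directly, without going through `df.loc` first:

```python
import pandas as pd

# Illustrative frame with the header row repeated at labels 0 and 3.
df = pd.DataFrame([
    ["Alert", "Type"],
    ["w1", "x1"],
    ["w2", "x2"],
    ["Alert", "Type"],
    ["a1", "b1"],
])

mask = df[0] == "Alert"            # boolean Series, one entry per row
indexes = df.index[mask].tolist()  # index labels where the mask is True
print(indexes)
```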

Then you can split the DataFrame with a list comprehension, like this:

listdf = [df.iloc[i:j] for i, j in zip(indexes, indexes[1:] + [len(df)])]
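
A runnable sketch of this comprehension on toy data follows. It assumes the default `RangeIndex`, so that the labels in `indexes` coincide with the integer positions that `iloc` expects; with a custom index, positions would have to be computed separately.

```python
import pandas as pd

# Illustrative data shaped like the frame in the question.
df = pd.DataFrame([
    ["Alert", "Type", "Response", "Cost"],
    ["w1", "x1", "y1", "z1"],
    ["w2", "x2", "y2", "z3"],
    ["Alert", "Type", "Response", "Cost"],
    ["a1", "b1", "c1", "d1"],
    ["a2", "b2", "c2", "d2"],
])

# Labels of the header rows (equal to positions under the default RangeIndex).
indexes = list(df.loc[df[0] == "Alert"].index)

# Pair each start with the next start; the last piece runs to the end of the frame.
listdf = [df.iloc[i:j] for i, j in zip(indexes, indexes[1:] + [len(df)])]

for part in listdf:
    print(part, end="\n\n")
```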
