Slicing a DataFrame into sub-DataFrames when a specific string is found in a column

Question

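Suppose I have a dataframe df and I want to split it into multiple dataframes, storing each one in a list (list_of_dfs).

Each sub-dataframe should contain only the "Result" rows. A sub-dataframe starts where the "Point" column has the value "P1" and the "X_Y" column has the value "X".

I tried to do this by first finding the index of each "P1" and then slicing the whole dataframe in a list comprehension using those indices. But I get back a list containing two empty dataframes. Can anyone advise? Thanks!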

import pandas as pd

df = pd.DataFrame(
    {
        "Step": (
            "1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "Result", "Result", "Result", "Result", "Result",
            "1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "Result", "Result", "Result", "Result", "Result"
        ),
        "Point": (
            "P1", "P2", "P2", "P3", "P3", "P1", "P2", "P2", "P3", "P3", "P1", "P2", "P2", "P3", "P3",
            "P1", "P2", "P2", "P3", "P3", "P1", "P2", "P2", "P3", "P3", "P1", "P2", "P2", "P3", "P3",
        ),
        "X_Y": (
            "X", "X", "Y", "X", "Y",  "X", "X", "Y", "X", "Y", "X", "X", "Y", "X", "Y", 
            "X", "X", "Y", "X", "Y",  "X", "X", "Y", "X", "Y", "X", "X", "Y", "X", "Y",
        ),
        "Value A": (
            70, 68, 66.75, 68.08, 66.72, 70, 68, 66.75, 68.08, 66.72, 70, 68, 66.75, 68.08, 66.72,
            70, 68, 66.75, 68.08, 66.72, 70, 68, 66.75, 68.08, 66.72, 70, 68, 66.75, 68.08, 66.72, 
        ),
        "Value B": (
            70, 68, 66.75, 68.08, 66.72, 70, 68, 66.75, 68.08, 66.72, 70, 68, 66.75, 68.08, 66.72,
            70, 68, 66.75, 68.08, 66.72, 70, 68, 66.75, 68.08, 66.72, 70, 68, 66.75, 68.08, 66.72,
        ),
    }
)

dff = df.loc[df["Step"] == "Result"]

value = "P1"
tuple_of_positions = list()

result = dff.isin([value])

seriesObj = result.any()
columnNames = list(seriesObj[seriesObj == True].index)

for col in columnNames:
    rows = list(result[col][result[col] == True].index)
    for row in rows:
        tuple_of_positions.append((row, col))

length_of_one_df = (len(dff["Point"].unique().tolist()) * 2 ) - 1

list_of_dfs = [dff.iloc[x : x + length_of_one_df] for x in rows]

print(list_of_dfs)

Answer 1

sub    = df.query("Step == \"Result\"")
pivots = sub[["Point", "X_Y"]].eq(["P1", "X"]).all(axis=1)
out    = [fr for _, fr in sub.groupby(pivots.cumsum())]

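Get the subset of the frame where Step equals "Result"
Check which rows have the "P1" and "X" sequence
- this gives a True/False Series
- its cumulative sum determines the groups: each "pivot" (turning) point is True and counts as 1, while False == 0 in a numeric context, so the sum only increases where a new group starts
- iterating over a GroupBy object yields "group_label, sub_frame" pairs, from which we pull out the sub_frames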
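
The steps above can be tried in isolation on a smaller frame. The sketch below is only an illustration: the six-row frame and the name labels are made up for the example, and the index values are chosen to mimic the gaps left after filtering on Step.

```python
import pandas as pd

# Hypothetical miniature of the "Result" rows; not the data from the question
sub = pd.DataFrame(
    {
        "Point": ["P1", "P2", "P2", "P1", "P2", "P2"],
        "X_Y": ["X", "X", "Y", "X", "X", "Y"],
    },
    index=[10, 11, 12, 25, 26, 27],
)

# True only on rows that are exactly ("P1", "X"), i.e. where a block starts
pivots = sub[["Point", "X_Y"]].eq(["P1", "X"]).all(axis=1)

# The running total goes up by one at every True, so all rows of a block
# share the same integer label
labels = pivots.cumsum()

# One sub-frame per label; the group key itself is discarded
out = [frame for _, frame in sub.groupby(labels)]
```

Because the grouping relies on the marker rows rather than on a fixed block length, blocks of different sizes would be split correctly as well.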

to get

>>> out

[      Step Point X_Y  Value A  Value B
 10  Result    P1   X    70.00    70.00
 11  Result    P2   X    68.00    68.00
 12  Result    P2   Y    66.75    66.75
 13  Result    P3   X    68.08    68.08
 14  Result    P3   Y    66.72    66.72,
       Step Point X_Y  Value A  Value B
 25  Result    P1   X    70.00    70.00
 26  Result    P2   X    68.00    68.00
 27  Result    P2   Y    66.75    66.75
 28  Result    P3   X    68.08    68.08
 29  Result    P3   Y    66.72    66.72]

where the intermediates are

>>> sub

      Step Point X_Y  Value A  Value B
10  Result    P1   X    70.00    70.00
11  Result    P2   X    68.00    68.00
12  Result    P2   Y    66.75    66.75
13  Result    P3   X    68.08    68.08
14  Result    P3   Y    66.72    66.72
25  Result    P1   X    70.00    70.00
26  Result    P2   X    68.00    68.00
27  Result    P2   Y    66.75    66.75
28  Result    P3   X    68.08    68.08
29  Result    P3   Y    66.72    66.72

>>> pivots 

10     True
11    False
12    False
13    False
14    False
25     True
26    False
27    False
28    False
29    False
dtype: bool

# groups
>>> pivots.cumsum()

10    1
11    1
12    1
13    1
14    1
25    2
26    2
27    2
28    2
29    2
dtype: int32
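
As a side note on the attempt in the question, the two empty frames most likely come from mixing index labels with positions: rows holds the labels 10 and 25, while dff.iloc slices by position and dff has only 10 rows. The sketch below reproduces this on a made-up six-row frame (the data, block_len and the variable names are assumptions for illustration) and shows one possible fix using positions.

```python
import numpy as np
import pandas as pd

# Hypothetical filtered frame whose index keeps the labels of the original
dff = pd.DataFrame(
    {"Point": ["P1", "P2", "P3", "P1", "P2", "P3"]},
    index=[10, 11, 12, 25, 26, 27],
)
block_len = 3  # assumed fixed block length for this toy frame

# Index *labels* of the "P1" rows: [10, 25]
labels = list(dff.index[dff["Point"] == "P1"])

# iloc is positional; positions 10 and 25 lie past the end, so both slices are empty
empty = [dff.iloc[x : x + block_len] for x in labels]

# Integer *positions* of the "P1" rows: [0, 3]
positions = np.flatnonzero(dff["Point"].eq("P1").to_numpy()).tolist()

# Slicing by position now returns the intended blocks
fixed = [dff.iloc[p : p + block_len] for p in positions]
```

This variant still assumes every block has the same length, which the cumsum and groupby approach does not need.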

Answer 1 (accepted), 2023-01-07 17:16:42