Match column values and copy rows to a new df

Question

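I'm working on an assignment that I'm failing at miserably. I need to iterate over a DataFrame, select rows based on a condition, and copy each selected row to another DataFrame. I tried df.append(), which seems to work but bogs down my machine and emits a deprecation warning for every row. I then tried pd.concat() but can't get the syntax right. The error I get says it can't match other columns that I don't care about.

There are only about 20k rows, so it shouldn't take too long. I'm sure of that much.

Yes, I'm also using iterrows. Let me know if I need to provide more details.

Thanks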

KeyError: "None of [Index([1.0, 'A', '9/1/2004', 'Math', 4, '1'], dtype='object')] are in the [columns]"

Here is what I have:

#get rows that are quantitative and match at least one other row on studentID, classDate and IQ
df_isquant = pd.DataFrame([])


for index, row in df_quant.iterrows():
    if row['IQ']== '1':
        for yndex, roe in df_quant.iterrows():
            if roe['IQ'] == row['IQ'] and roe['StudentID'] == row['StudentID'] and roe['ClassDate'] == row['ClassDate']:
                pd.concat(df_isquant[row])
#             df_isquant.append(row)

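I'm searching for rows where 'IQ' has the value '1' and, for each one, checking whether it matches any other row on 'StudentID', 'IQ', and 'ClassDate'. If so, the row should be copied to another DataFrame. I could also simply create another column and use a boolean to flag rows that fit that description, which might make this easier. But this has given me enough grief that I need an answer now.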

Answer 1

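Given the logic provided ("I'm searching for rows where 'IQ' has the value '1' and, for each one, checking whether it matches any other row on 'StudentID', 'IQ', and 'ClassDate'."), use boolean indexing and concat: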

# condition on IQ
m1 = df_quant['IQ'].eq('1')
# are there other rows matching the 3 columns
m2 = df_quant[['IQ', 'StudentID', 'ClassDate']].duplicated(keep=False)

# concat
df_isquant = pd.concat([df_isquant, df_quant[m1&m2]])
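
For reference, here is a minimal self-contained sketch of the same idea. The sample data is made up (only the column names 'StudentID', 'ClassDate', and 'IQ' come from the question; 'Subject' and all values are assumptions for illustration). It also shows the boolean flag column mentioned in the question. As for the KeyError: df_isquant[row] most likely fails because indexing a DataFrame with a non-boolean Series treats the row's values as column labels.

```python
import pandas as pd

# Hypothetical sample data: column names follow the question, values are invented.
df_quant = pd.DataFrame({
    "StudentID": [1, 1, 2, 3, 3],
    "ClassDate": ["9/1/2004", "9/1/2004", "9/1/2004", "9/2/2004", "9/2/2004"],
    "Subject": ["Math", "Physics", "Math", "Math", "Art"],
    "IQ": ["1", "1", "1", "1", "0"],
})

# Condition on IQ (stored as the string '1' in the question).
m1 = df_quant["IQ"].eq("1")

# True for every row that shares IQ, StudentID and ClassDate with at least one other row.
m2 = df_quant.duplicated(subset=["IQ", "StudentID", "ClassDate"], keep=False)

# Select all matching rows in one step instead of appending row by row.
df_isquant = df_quant[m1 & m2].copy()

# Alternative from the question: flag the rows with a boolean column.
df_quant["is_match"] = m1 & m2

print(df_isquant)
```

Because the masks are computed on whole columns, no iterrows loop is needed, which should be fast for around 20k rows.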
