Python Pandas - Removing rows from a DataFrame based on a previously obtained subset

Question

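I am running Python 2.7 with the Pandas 0.11.0 library installed.

I have been looking around and have not found an answer to this question, so I am hoping someone more experienced than I am has a solution.

Let's say my data, in df1, looks like the following: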

df1=

  zip  x  y  access
  123  1  1    4
  123  1  1    6
  133  1  2    3
  145  2  2    3
  167  3  1    1
  167  3  1    2

For example, using df2 = df1[df1['zip'] == 123] and then df2 = df2.join(df1[df1['zip'] == 133]), I get the following subset of data:

df2=

 zip  x  y  access
 123  1  1    4
 123  1  1    6
 133  1  2    3

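What I would like to do is:

1) Remove the rows from df1 as they are defined/joined into df2

OR

2) After df2 has been created, remove from df1 the rows (difference?) that df2 is made up of

I hope all of this makes sense. Please let me know if more information is needed.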

EDIT:

Ideally, a third dataframe would be created that looks like this:

df3=

 zip  x  y  access
 145  2  2    3
 167  3  1    1
 167  3  1    2

That is, everything in df1 that is not in df2. Thanks!

Answer 1

Two options come to mind. First, use isin and a mask:

>>> df
   zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2
>>> keep = [123, 133]
>>> df_yes = df[df['zip'].isin(keep)]
>>> df_no = df[~df['zip'].isin(keep)]
>>> df_yes
   zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3
>>> df_no
   zip  x  y  access
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2
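
For anyone who wants to run the first option as a script, here is a minimal self-contained sketch. The frame is rebuilt by hand from the sample data in the question, and storing the mask in a variable (`mask`) is an addition for illustration so that `isin` is only evaluated once.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "zip":    [123, 123, 133, 145, 167, 167],
    "x":      [1, 1, 1, 2, 3, 3],
    "y":      [1, 1, 2, 2, 1, 1],
    "access": [4, 6, 3, 3, 1, 2],
})

keep = [123, 133]
mask = df["zip"].isin(keep)   # boolean Series, True where zip is in keep

df_yes = df[mask]             # rows whose zip is in keep
df_no = df[~mask]             # the complement: every other row

print(df_no)
```

Both results keep the original row labels, so they can still be matched back to df later.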

Second, use groupby:

>>> grouped = df.groupby(df['zip'].isin(keep))

and then any of:

>>> grouped.get_group(True)
   zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3
>>> grouped.get_group(False)
   zip  x  y  access
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2
>>> [g for k,g in list(grouped)]
[   zip  x  y  access
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2,    zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3]
>>> dict(list(grouped))
{False:    zip  x  y  access
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2, True:    zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3}
>>> dict(list(grouped)).values()
[   zip  x  y  access
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2,    zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3]
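
The session above is from Python 2, where dict.values() returns a list. On Python 3 it returns a view instead, so a dict comprehension is one straightforward way to get at both pieces by key. A rough sketch, with the frame rebuilt by hand from the question and the name `parts` made up for illustration:

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "zip":    [123, 123, 133, 145, 167, 167],
    "x":      [1, 1, 1, 2, 3, 3],
    "y":      [1, 1, 2, 2, 1, 1],
    "access": [4, 6, 3, 3, 1, 2],
})
keep = [123, 133]

# Group by the boolean mask; iterating yields (key, sub-frame) pairs.
grouped = df.groupby(df["zip"].isin(keep))
parts = {key: group for key, group in grouped}

df_yes = parts[True]    # rows whose zip is in keep
df_no = parts[False]    # all remaining rows

print(df_no)
```

If nothing matches keep (or everything does), only one group exists and looking up the missing key raises KeyError, so parts.get(True) may be the safer lookup.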

Which makes the most sense depends on context, but I think you get the idea.
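
If the subset already exists and the goal is simply "df1 minus df2", dropping by row label is another possibility. This is only a sketch under two assumptions: df2 was produced by selecting rows from df1, so it still carries df1's index labels, and those labels are unique. The name `df3` is used here for the third frame the question asks for.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df1 = pd.DataFrame({
    "zip":    [123, 123, 133, 145, 167, 167],
    "x":      [1, 1, 1, 2, 3, 3],
    "y":      [1, 1, 2, 2, 1, 1],
    "access": [4, 6, 3, 3, 1, 2],
})

# Assumed: the subset was taken from df1 earlier and keeps its row labels.
df2 = df1[df1["zip"].isin([123, 133])]

# Drop every row of df1 whose label also appears in df2.
df3 = df1.drop(df2.index)

# The same result written as a boolean mask on the index.
df3_alt = df1[~df1.index.isin(df2.index)]

print(df3)
```

Note that drop returns a new frame and leaves df1 itself unchanged.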

Solution 1
25  Accepted  2013-05-23 03:02:13
