How to filter values by column name and then extract the rows with the same values to another CSV file? Python / pandas

Question

I have a pandas DataFrame with 4 columns, the first being "ID NUMBER". I am trying to filter on "ID NUMBER" and group together the rows that share the same value. After that, I want to write each group of rows with the same value to its own CSV file, named after that value.

DataFrame:

     ID Number    col2           col3     DATE
0   111            0.5          -0.6    20160104
1   118           -0.1          -0.6    20160104
2   11D            0.3          -1.1    20160104
3   111           -0.7          -0.9    20150102


 ***Output I need:***
     ID Number    col2           col3     DATE
0   111            0.5          -0.6    20160104
1   111           -0.7          -0.9    20150102

I have attempted a few things, but I could not find anything online about how to filter a column and then extract the matching rows. Thank you!

Answer 1

You can use duplicated with the parameter keep=False so that it returns True for every duplicated row, and then use the result to mask the df:

In [16]:
df[df['ID Number'].duplicated(keep=False)]

Out[16]:
  ID Number  col2  col3      DATE
0       111   0.5  -0.6  20160104
3       111  -0.7  -0.9  20150102
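
For reference, here is a minimal self-contained sketch of the same mask. It assumes the sample frame from the question, with "ID Number" stored as strings (an assumption, suggested by the 11D value); the names mask and dupes are only illustrative.

```python
import pandas as pd

# Sample data from the question; IDs are assumed to be strings because of "11D".
df = pd.DataFrame({
    "ID Number": ["111", "118", "11D", "111"],
    "col2": [0.5, -0.1, 0.3, -0.7],
    "col3": [-0.6, -0.6, -1.1, -0.9],
    "DATE": [20160104, 20160104, 20160104, 20150102],
})

# keep=False flags every row whose ID occurs more than once,
# including the first occurrence (the default keep='first' would skip it).
mask = df["ID Number"].duplicated(keep=False)

# Boolean indexing keeps only the flagged rows; the original index is preserved.
dupes = df[mask]
```

Note that the original row labels (0 and 3) are kept; calling reset_index(drop=True) on the result would renumber them from 0 if that matters for the output.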

For the second part, you can group the duplicated rows by ID and write each group to its own file:

gp = df[df['ID Number'].duplicated(keep=False)].groupby('ID Number')
gp.apply(lambda x: x.to_csv(str(x.name) + '.csv'))

EDIT

Actually, if you just want to write all rows with the same ID number to a CSV file named after that ID, then:

df.groupby('ID Number').apply(lambda x: x.to_csv(str(x.name) + '.csv'))

This should do what you want.
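
As an alternative to calling apply purely for its side effect, the same idea can be sketched with an explicit loop over the groups. This is only a sketch under assumptions: the sample data is rebuilt from the question with string IDs, out_dir is a hypothetical temporary output folder, and index=False is an optional choice to leave the row index out of the files.

```python
import os
import tempfile

import pandas as pd

# Sample data from the question; IDs are assumed to be strings because of "11D".
df = pd.DataFrame({
    "ID Number": ["111", "118", "11D", "111"],
    "col2": [0.5, -0.1, 0.3, -0.7],
    "col3": [-0.6, -0.6, -1.1, -0.9],
    "DATE": [20160104, 20160104, 20160104, 20150102],
})

# Hypothetical output location; replace with any existing directory.
out_dir = tempfile.mkdtemp()

written = []
# Iterating over a groupby yields (group key, sub-DataFrame) pairs.
for id_value, group in df.groupby("ID Number"):
    path = os.path.join(out_dir, f"{id_value}.csv")
    # index=False omits the original row labels from the CSV.
    group.to_csv(path, index=False)
    written.append(path)
```

With the sample data this writes one file per distinct ID (111.csv, 118.csv and 11D.csv), and 111.csv holds both rows for that ID. To write only IDs that occur more than once, the loop could run over the frame filtered with duplicated(keep=False) instead.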

Answer 1
2 Accepted 2016-05-20 20:31:42