How to filter values by column name and then extract the rows that have the same value to another CSV file? Python/Pandas
I have a pandas DataFrame with 4 columns, the first being "ID Number". I am trying to filter on "ID Number" and group together the rows that share the same value. After that, I want to write each group of rows with the same value to its own CSV file, named after that value.
DataFrame:
ID Number col2 col3 DATE
0 111 0.5 -0.6 20160104
1 118 -0.1 -0.6 20160104
2 11D 0.3 -1.1 20160104
3 111 -0.7 -0.9 20150102
***Output I need:***
ID Number col2 col3 DATE
0 111 0.5 -0.6 20160104
1 111 -0.7 -0.9 20150102
I have tried a few things, but I could not find anything online about how to filter a column and then extract the matching rows. Thank you!
You can use duplicated with the parameter keep=False so that it returns True for all duplicated rows, and then use the result to mask the df:
In [16]:
df[df['ID Number'].duplicated(keep=False)]
Out[16]:
ID Number col2 col3 DATE
0 111 0.5 -0.6 20160104
3 111 -0.7 -0.9 20150102
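For anyone who wants to run the masking step outside an interactive session, here is a minimal self-contained sketch. It rebuilds the sample frame from the question by hand; storing the IDs as strings is an assumption made here because of the value "11D", and the names mask and dupes are only illustrative.

```python
import pandas as pd

# Sample data copied from the question. IDs are stored as strings
# (an assumption, since "11D" is not numeric).
df = pd.DataFrame({
    "ID Number": ["111", "118", "11D", "111"],
    "col2": [0.5, -0.1, 0.3, -0.7],
    "col3": [-0.6, -0.6, -1.1, -0.9],
    "DATE": [20160104, 20160104, 20160104, 20150102],
})

# keep=False marks every occurrence of a repeated ID as True,
# not just the second and later ones.
mask = df["ID Number"].duplicated(keep=False)

# Boolean indexing keeps only the rows whose ID appears more than once.
dupes = df[mask]
print(dupes)
```

Note that the original index labels (0 and 3) are preserved; calling reset_index(drop=True) on the result would renumber them if that matters for the output.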
For the second part you can do:
gp = df[df['ID Number'].duplicated(keep=False)].groupby('ID Number')
gp.apply(lambda x: x.to_csv(str(x.name) + '.csv'))
EDIT
Actually, if you just want to write all rows with the same ID number to a CSV named after that ID, then:
df.groupby('ID Number').apply(lambda x: x.to_csv(str(x.name) + '.csv'))
That should do what you want.
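The same per-ID export can also be written as an explicit loop over the groups, which some people find easier to read than calling apply only for its side effect. This is a sketch rather than a drop-in replacement: the output directory name by_id and the choice of index=False are assumptions made here, and the sample frame is rebuilt by hand from the question.

```python
from pathlib import Path

import pandas as pd

# Sample data copied from the question; IDs kept as strings because of "11D".
df = pd.DataFrame({
    "ID Number": ["111", "118", "11D", "111"],
    "col2": [0.5, -0.1, 0.3, -0.7],
    "col3": [-0.6, -0.6, -1.1, -0.9],
    "DATE": [20160104, 20160104, 20160104, 20150102],
})

# Hypothetical output directory; change it to wherever the files should go.
out_dir = Path("by_id")
out_dir.mkdir(exist_ok=True)

# Iterating a groupby yields (key, sub-frame) pairs, one per distinct ID.
for id_value, group in df.groupby("ID Number"):
    # One file per ID, e.g. by_id/111.csv; index=False omits the row labels.
    group.to_csv(out_dir / f"{id_value}.csv", index=False)
```

With the sample data this writes three files, one per distinct ID. To export only the IDs that occur more than once, loop over df[df['ID Number'].duplicated(keep=False)].groupby('ID Number') instead.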