Appending a column in .csv with Python/Pandas

Intro Python question: I am working on a program that counts the number of politicians in each political party for each session of the US Congress. I'm starting from a .csv with biographical data, and wish to export my political party membership count as a new .csv. This is what I'm doing:

import pandas as pd

read = pd.read_csv('30.csv', delimiter = ';', names = ['Name', 'Years', 'Position', 'Party', 'State', 'Congress'])

party_count = read.groupby('Party').size()

with open('parties.csv', 'a') as f:
    party_count.to_csv(f, header=False)

This updates my .csv to read as follows:

'Year','Party','Count'
'American Party',1
'Democrat',162
'Independent Democrat',3
'Party',1
'Whig',145

I next need to include the date under my first column ('Year'). This is contained in the 'Congress' column in my first .csv. What do I need to add to my final line of code to make this work?

Here is a snippet from the original .csv file I am drawing from:

'Name';'Years';'Position';'Party';'State';'Congress'
'ABBOTT, Amos';'1786-1868';'Representative';'Whig';'MA';'1847'
'ADAMS, Green';'1812-1884';'Representative';'Whig';'KY';'1847'
'ADAMS, John Quincy';'1767-1848';'Representative';'Whig';'MA';'1847'

You can merge the per-party counts back into your original dataframe (called df here; it is the dataframe returned by pd.read_csv, named read in the question):

party_count = df.groupby('Party').size().reset_index(name='Count')
df = df.merge(party_count, on='Party', how='left')
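
To see what the merge does, here is a minimal sketch that builds df in memory from the three rows shown in the question's excerpt instead of reading a file. The variable name merged is only for illustration; with the real data, each party would get its own count.

```python
import pandas as pd

# Three rows taken from the question's excerpt of the source file.
df = pd.DataFrame(
    [
        ("ABBOTT, Amos", "1786-1868", "Representative", "Whig", "MA", "1847"),
        ("ADAMS, Green", "1812-1884", "Representative", "Whig", "KY", "1847"),
        ("ADAMS, John Quincy", "1767-1848", "Representative", "Whig", "MA", "1847"),
    ],
    columns=["Name", "Years", "Position", "Party", "State", "Congress"],
)

# One row per party with its member count.
party_count = df.groupby("Party").size().reset_index(name="Count")

# Left merge: every original row gains the Count of its own party.
merged = df.merge(party_count, on="Party", how="left")
print(merged[["Name", "Party", "Count"]])
```

Every row keeps its original columns and gains a Count column, so the row count does not change.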

Once you have the party counts, you can select the columns you need. For example, if you need [Congress, Party, Count], you can use:

out_df = df[['Congress', 'Party', 'Count']].drop_duplicates()
out_df.columns = ['Year', 'Party', 'Count']
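
Note that groupby('Party') counts members across the whole dataframe, which is fine when a file holds a single Congress. As an alternative (a different technique from the merge above, not part of the original answer), grouping on both columns should give the same three-column table directly and keeps the counts separate if several Congress values ever appear in one file. A sketch on in-memory data based on the question's excerpt:

```python
import pandas as pd

# Only the two columns needed here, with values from the question's excerpt.
df = pd.DataFrame({"Party": ["Whig", "Whig", "Whig"], "Congress": ["1847"] * 3})

# Count rows per (Congress, Party) pair, then rename Congress to Year.
out_df = (
    df.groupby(["Congress", "Party"])
    .size()
    .reset_index(name="Count")
    .rename(columns={"Congress": "Year"})
)
print(out_df)
```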

Here, out_df is the dataframe you can write to the my.csv file:

out_df.to_csv('my.csv', index=False)
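
Two details in the question are worth handling when putting this together. The 'Party',1 row in the question's output appears to come from the header line being counted as data, since passing names= without header=0 makes pandas treat the first line as a regular row. The single quotes around values also survive because the default quote character is the double quote. The sketch below assumes the file looks like the excerpt, reads it from a string instead of '30.csv', and appends to a temporary path (both are stand-ins for illustration); the header is written only when the output file does not exist yet.

```python
import io
import os
import tempfile

import pandas as pd

# Stand-in for '30.csv': the excerpt shown in the question.
raw = """'Name';'Years';'Position';'Party';'State';'Congress'
'ABBOTT, Amos';'1786-1868';'Representative';'Whig';'MA';'1847'
'ADAMS, Green';'1812-1884';'Representative';'Whig';'KY';'1847'
'ADAMS, John Quincy';'1767-1848';'Representative';'Whig';'MA';'1847'
"""

# sep=';' and quotechar="'" match the excerpt. The first line is used as the
# header, so it is not counted as a data row.
df = pd.read_csv(io.StringIO(raw), sep=";", quotechar="'")

party_count = df.groupby("Party").size().reset_index(name="Count")
df = df.merge(party_count, on="Party", how="left")
out_df = df[["Congress", "Party", "Count"]].drop_duplicates()
out_df.columns = ["Year", "Party", "Count"]

# Temporary, hypothetical output location for this sketch.
out_path = os.path.join(tempfile.mkdtemp(), "parties.csv")

# Append twice to simulate processing two input files; the header is
# written only on the first pass, when the file does not exist yet.
for _ in range(2):
    out_df.to_csv(out_path, mode="a", index=False, header=not os.path.exists(out_path))

with open(out_path) as f:
    print(f.read())
```

In a real run, the loop would iterate over the per-Congress input files and out_path would point at parties.csv.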
