Get unique counts from one data frame as values in another data frame in Pandas
I have two pandas dataframes and I want to get some unique row counts from one dataframe (responses) as column values in the other dataframe (contacts).
import pandas as pd
contacts = pd.read_csv('contacts.csv', encoding='ISO-8859-1')
responses = pd.read_csv('campaign_responses.csv', encoding='ISO-8859-1')
contacts.head()
contact_id job_title country Email Webinar
0 0031B00002cPLuFQAW manager US 0 0
1 0031B00002Z2zMYQAZ admin UK 0 0
2 003a000001nHioCAAS manager DE 0 0
Note: Email and Webinar will be 0 for all rows. They're placeholder values for the moment.
responses.head()
campaign_type contact_id
0 Email 0031B00002cPLuFQAW
1 Webinar 0031B00002Z2zMYQAZ
2 Webinar 0031B00002cPLuFQAW
3 Webinar 0031B00002cPLuFQAW
4 Email 003a000001nHioCAAS
5 Email 003a000001nHioCAAS
I'd like to get a count of how many times each contact has responded to each campaign type as an attribute in the contacts data frame.
The final contacts data frame should look like this (based on the data above):
contact_id job_title country Email Webinar
0 0031B00002cPLuFQAW manager US 1 2
1 0031B00002Z2zMYQAZ admin UK 0 1
2 003a000001nHioCAAS manager DE 2 0
Seems like you need pd.crosstab (df here is the responses data frame):
pd.crosstab(df.contact_id,df.campaign_type)
Out[37]:
campaign_type Email Webinar
contact_id
0031B00002Z2zMYQAZ 0 1
0031B00002cPLuFQAW 1 2
003a000001nHioCAAS 2 0
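The crosstab result above is indexed by contact_id, so it still has to be attached to contacts to get the frame the question asks for. A minimal sketch of one way to do that, assuming the sample rows shown in the question (rebuilt inline here instead of read from the CSV files) and assuming the placeholder Email and Webinar columns can simply be dropped and replaced:

```python
import pandas as pd

# Sample data copied from the question, built inline so the sketch is self-contained.
contacts = pd.DataFrame({
    "contact_id": ["0031B00002cPLuFQAW", "0031B00002Z2zMYQAZ", "003a000001nHioCAAS"],
    "job_title": ["manager", "admin", "manager"],
    "country": ["US", "UK", "DE"],
    "Email": 0,    # placeholder column
    "Webinar": 0,  # placeholder column
})
responses = pd.DataFrame({
    "campaign_type": ["Email", "Webinar", "Webinar", "Webinar", "Email", "Email"],
    "contact_id": [
        "0031B00002cPLuFQAW", "0031B00002Z2zMYQAZ", "0031B00002cPLuFQAW",
        "0031B00002cPLuFQAW", "003a000001nHioCAAS", "003a000001nHioCAAS",
    ],
})

# One row per contact, one column per campaign type, values are response counts.
counts = pd.crosstab(responses["contact_id"], responses["campaign_type"])

# Drop the placeholders, then left-merge so contacts without responses are kept.
result = contacts.drop(columns=["Email", "Webinar"]).merge(
    counts.reset_index(), on="contact_id", how="left"
)

# Contacts with no responses at all get NaN from the merge; turn those into 0.
count_cols = list(counts.columns)
result[count_cols] = result[count_cols].fillna(0).astype(int)

print(result)
```

The left merge keeps the row order of contacts. A campaign type that never appears in responses would not produce a column at all, so it would need to be added separately if it is expected in the output.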
Short and simple:
df.groupby(['contact_id', 'campaign_type']).size().unstack('campaign_type', fill_value=0)
Edit: neither short nor simple, see other answer.
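For comparison, here is a small sketch of the groupby route next to crosstab, assuming the sample responses rows from the question (rebuilt inline). The level passed to unstack has to be the grouping column's name, campaign_type; on this data both approaches should give the same table of counts.

```python
import pandas as pd

# Sample responses copied from the question.
responses = pd.DataFrame({
    "campaign_type": ["Email", "Webinar", "Webinar", "Webinar", "Email", "Email"],
    "contact_id": [
        "0031B00002cPLuFQAW", "0031B00002Z2zMYQAZ", "0031B00002cPLuFQAW",
        "0031B00002cPLuFQAW", "003a000001nHioCAAS", "003a000001nHioCAAS",
    ],
})

# Count rows per (contact, campaign type) pair, then pivot campaign types into columns.
by_group = (
    responses.groupby(["contact_id", "campaign_type"])
    .size()
    .unstack("campaign_type", fill_value=0)
)

# The crosstab one-liner from the other answer, for comparison.
by_crosstab = pd.crosstab(responses["contact_id"], responses["campaign_type"])

print(by_group)
print(by_crosstab)
```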