pandas: populate df column with values matching index and column in another df
Pandas: Retrieving an Index from a dataframe to populate another df
I have tried to find a solution to this but have failed.
I have a main df with transaction data, in particular the credit card names:
transactionId, amount, type, person
1 -30 Visa john
2 -100 Visa Premium john
3 -12 Mastercard jenny
I group by person, then aggregate by the number of records and the total amount.
person numbTrans Amount
john 2 -130
jenny 1 -12
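For reference, the aggregation described above could be sketched as follows. The question does not show how the summary was produced, so the use of `groupby` with named aggregation here is an assumption; the column names `numbTrans` and `Amount` are taken from the table above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "transactionId": [1, 2, 3],
    "amount": [-30, -100, -12],
    "type": ["Visa", "Visa Premium", "Mastercard"],
    "person": ["john", "john", "jenny"],
})

# Count the transactions and sum the amounts per person.
# sort=False keeps the people in order of first appearance.
summary = (
    df.groupby("person", sort=False)
      .agg(numbTrans=("transactionId", "count"), Amount=("amount", "sum"))
      .reset_index()
)
print(summary)
```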
That works fine, but I need to add the credit card type as a dimension in my df. I have built a df of the credit cards in use:
index CreditCardName
0 Visa
1 Visa Premium
2 Mastercard
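What I cannot figure out is how to create a new column in my main dataframe called "CreditCard_id" that uses the string "Visa / Visa Premium / Mastercard" to pull in the index from that df.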
transactionId, amount, type, CreditCardId, person
1 -30 Visa 0 john
2 -100 Visa Premium 1 john
3 -12 Mastercard 2 jenny
I need this because I am doing some simple kmeans clustering, which requires ints, not strings (or at least I think it does).
Thanks in advance
Rob
If you set "CreditCardName" as the index of the second df, you can just call map:
In [80]:
# setup dummydata
import io
import pandas as pd
temp = """transactionId,amount,type,person
1,-30,Visa,john
2,-100,Visa Premium,john
3,-12,Mastercard,jenny"""
temp1 = """index,CreditCardName
0,Visa
1,Visa Premium
2,Mastercard"""
df = pd.read_csv(io.StringIO(temp))
# crucially, set the index column to be the credit card name
df1 = pd.read_csv(io.StringIO(temp1), index_col=[1])
df
Out[80]:
transactionId amount type person
0 1 -30 Visa john
1 2 -100 Visa Premium john
2 3 -12 Mastercard jenny
In [81]:
df1
Out[81]:
index
CreditCardName
Visa 0
Visa Premium 1
Mastercard 2
In [82]:
# now we can call map, passing the Series; map looks up each 'type' value in the Series index and returns the matching 'index' value for our new column
df['CreditCardId'] = df['type'].map(df1['index'])
df
Out[82]:
transactionId amount type person CreditCardId
0 1 -30 Visa john 0
1 2 -100 Visa Premium john 1
2 3 -12 Mastercard jenny 2
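If the credit card df already exists in memory instead of being read from CSV, the same lookup can be built directly from it. The following is a minimal sketch under that assumption; the names `cards` and `lookup` are hypothetical and do not appear in the answer above.

```python
import pandas as pd

# Main transaction data, copied from the question
df = pd.DataFrame({
    "transactionId": [1, 2, 3],
    "amount": [-30, -100, -12],
    "type": ["Visa", "Visa Premium", "Mastercard"],
    "person": ["john", "john", "jenny"],
})

# Credit card df with its default integer index 0, 1, 2 (hypothetical name `cards`)
cards = pd.DataFrame({"CreditCardName": ["Visa", "Visa Premium", "Mastercard"]})

# Build a Series whose index is the card name and whose values are the integer ids
lookup = pd.Series(cards.index, index=cards["CreditCardName"])

# map looks up each 'type' value in the Series index
df["CreditCardId"] = df["type"].map(lookup)

# Any card name missing from `cards` would come back as NaN, so check before clustering
assert not df["CreditCardId"].isna().any()
print(df)
```

Checking for NaN after the map is a cheap guard: a card name that is spelled differently in the two frames would otherwise silently produce a missing id.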