
What is the proper way of combining time series data with metadata in pandas?

I have two CSV files:

customers.csv

id  name     birthday
1   Martin   28.04.1990
2   Twain    30.11.1835
....

purchases.csv

purchase_id    customer_id    item                            price
1              1              About the ugly German language  3.14
2              1              Food                            15.92
3              1              Book                            65.35
4              2              Stone                           89.79

I can load both files into data frames with

import pandas as pd

df_customers = pd.read_csv('customers.csv')
df_purchases = pd.read_csv('purchases.csv')

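But how do I combine the two so that I can easily answer questions like these:

  • How many items did each customer buy?
  • What is the average item price per customer?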

Use merge with a right join, so that every purchase row is kept and the matching customer columns are attached to it:

df = pd.merge(df_customers, df_purchases, left_on='id', right_on='customer_id', how='right')
print(df)
   id    name    birthday  purchase_id  customer_id  \
0   1  Martin  28.04.1990            1            1   
1   1  Martin  28.04.1990            2            1   
2   1  Martin  28.04.1990            3            1   
3   2   Twain  30.11.1835            4            2   

                             item  price  
0  About the ugly German language   3.14  
1                            Food  15.92  
2                            Book  65.35  
3                           Stone  89.79  
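
With the merged frame in hand, the two questions from the post reduce to a group-by. This is a minimal sketch, assuming the column names shown in the question. The frames are rebuilt in memory so the snippet runs without the CSV files, and `summary`, `n_items` and `mean_price` are names made up for illustration:

```python
import pandas as pd

# In-memory copies of the two tables from the question (assumption:
# the real CSV files parse into exactly these columns).
df_customers = pd.DataFrame({
    "id": [1, 2],
    "name": ["Martin", "Twain"],
    "birthday": ["28.04.1990", "30.11.1835"],
})
df_purchases = pd.DataFrame({
    "purchase_id": [1, 2, 3, 4],
    "customer_id": [1, 1, 1, 2],
    "item": ["About the ugly German language", "Food", "Book", "Stone"],
    "price": [3.14, 15.92, 65.35, 89.79],
})

# Same right join as above: one row per purchase plus customer columns.
df = pd.merge(df_customers, df_purchases,
              left_on="id", right_on="customer_id", how="right")

# One row per customer: number of purchases and mean item price.
summary = (
    df.groupby(["customer_id", "name"], as_index=False)
      .agg(n_items=("purchase_id", "count"), mean_price=("price", "mean"))
)
print(summary)
```

The counts and means only need `customer_id` and `price`, so the same aggregation also works on `df_purchases` alone. The merge matters when the customer's name or birthday should appear next to the result.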
