
Data transformation: calculate total cost based on AWS services and percentage in pandas

I have a dataframe like this, df1:

Product    line_item_product_code       Account    percentage
COMMON      AWSCloudTrail                AU-LOG     20  
COMMON      AWSGlue                      AU-LOG     30
COMMON      AWSQueueService              AU-LOG     50
COMMON      AWSSecretsManager            AU-PRD     40
COMMON      AmazonDynamoDB               AU-PRD     60

Second dataframe, df2:

Account             Product               cost
AU-LOG       COMMON-PROD1                  10
AU-LOG       COMMON-PROD2                  12
AU-PRD       COMMON-PROD1                  14
AU-PRD       COMMON-PROD2                  16

Here, the total cost for a given account in df1 will match the total cost for that account in df2.

The calculation is to split the total cost for a given account and product in df2 across the various AWS services, based on the percentage column in df1 and the number of AWS services used by a particular product.

Example: the AU-LOG account has 3 different line_item_product_code values, with percentages 20, 30 and 50. In df2, the cost of COMMON-PROD1 for the AU-LOG account is $10, so this $10 is split across the 3 line_item_product_code values using the percentages from df1, giving $2, $3 and $5.

The dataframe I want is:

Product          line_item_product_code  cost      Account    percentage 
COMMON-PROD1     AWSCloudTrail           2         AU-LOG       20  
COMMON-PROD1     AWSGlue                 3         AU-LOG       30
COMMON-PROD1     AWSQueueService         5         AU-LOG       50
COMMON-PROD2     AWSCloudTrail           2.4       AU-LOG       20  
COMMON-PROD2     AWSGlue                 3.6       AU-LOG       30
COMMON-PROD2     AWSQueueService         6         AU-LOG       50
COMMON-PROD1     AWSSecretsManager       5.6       AU-PRD       40
COMMON-PROD1     AmazonDynamoDB          8.4       AU-PRD       60
COMMON-PROD2     AWSSecretsManager       6.4       AU-PRD       40
COMMON-PROD2     AmazonDynamoDB          9.6       AU-PRD       60

How can I achieve this using pandas?

I think a simple merge should do: merge on Account, then multiply the cost column (which at this point contains the total product cost from df2) by percentage / 100:

>>> import pandas as pd
>>> df = pd.merge(df1.drop(columns=['Product']), df2, on='Account', how='left')
>>> df['cost'] = df['cost'] * df['percentage'] / 100
>>> df
  line_item_product_code Account  percentage       Product  cost
0          AWSCloudTrail  AU-LOG          20  COMMON-PROD1   2.0
1          AWSCloudTrail  AU-LOG          20  COMMON-PROD2   2.4
2                AWSGlue  AU-LOG          30  COMMON-PROD1   3.0
3                AWSGlue  AU-LOG          30  COMMON-PROD2   3.6
4        AWSQueueService  AU-LOG          50  COMMON-PROD1   5.0
5        AWSQueueService  AU-LOG          50  COMMON-PROD2   6.0
6      AWSSecretsManager  AU-PRD          40  COMMON-PROD1   5.6
7      AWSSecretsManager  AU-PRD          40  COMMON-PROD2   6.4
8         AmazonDynamoDB  AU-PRD          60  COMMON-PROD1   8.4
9         AmazonDynamoDB  AU-PRD          60  COMMON-PROD2   9.6
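
If you want to run this end to end, here is a minimal sketch that rebuilds df1 and df2 from the tables in the question, applies the same merge, and adds two sanity checks. The names pct_totals and roundtrip are only illustrative, and the checks assume the percentages in df1 are meant to add up to 100 within each account.

```python
import pandas as pd

# df1: percentage split per AWS service within each account (from the question)
df1 = pd.DataFrame({
    "Product": ["COMMON"] * 5,
    "line_item_product_code": [
        "AWSCloudTrail", "AWSGlue", "AWSQueueService",
        "AWSSecretsManager", "AmazonDynamoDB",
    ],
    "Account": ["AU-LOG", "AU-LOG", "AU-LOG", "AU-PRD", "AU-PRD"],
    "percentage": [20, 30, 50, 40, 60],
})

# df2: total cost per account and product (from the question)
df2 = pd.DataFrame({
    "Account": ["AU-LOG", "AU-LOG", "AU-PRD", "AU-PRD"],
    "Product": ["COMMON-PROD1", "COMMON-PROD2", "COMMON-PROD1", "COMMON-PROD2"],
    "cost": [10, 12, 14, 16],
})

# Assumption: percentages are expected to total 100 per account
pct_totals = df1.groupby("Account")["percentage"].sum()

# Pair every service row with every product row of the same account,
# then scale the product's total cost by the service's percentage
df = pd.merge(df1.drop(columns=["Product"]), df2, on="Account", how="left")
df["cost"] = df["cost"] * df["percentage"] / 100

# Summing the split costs per account and product should give back df2's totals
roundtrip = df.groupby(["Account", "Product"], as_index=False)["cost"].sum()

print(pct_totals)
print(df)
print(roundtrip)
```

If pct_totals shows anything other than 100 for an account, the split costs will not sum back to the totals in df2, so it is worth checking that before trusting the result.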
