
How to execute sum() on a column of a df concatenated with other dfs

I am using concat to merge 5 identical DataFrames into one and get the total sum() of cost.

These values are not real; they are just an example of what the df looks like.

What I tried:

import pandas as pd

g = {"id": "1515", "cost": "100"}
b = {"id": "1515", "cost": "100"}
f = {"id": "1515", "cost": "100"}
c = {"id": "1515", "cost": "100"}
o = {"id": "1515", "cost": "100"}

all_vendors = pd.concat([g, b, f, c, o])

Data types

all_vendors.dtypes

Campaign          object
campaignid       float64
Campaign_name     object
Cost              object
Month             object
Year & month      object
dtype: object

Attempts

Attempt #1:

all_vendors.Cost.sum()

Results in:

TypeError: can only concatenate str (not "float") to str

Attempt #2:

all_vendors.Cost.astype(str)
all_vendors.Cost.sum()

Results in:

TypeError: can only concatenate str (not "float") to str

Attempt #3:

all_vendors.Cost.astype(float)
all_vendors.Cost.sum()

Results in:

ValueError: could not convert string to float: '100'

Your problem is that you're not assigning the result of your astype call back to your DataFrame:

import pandas as pd

data = {
  "id": ['1,515','1,515','1,515','1,515','1,515'],
  "cost": ['1,000','1,000','1,000','1,000','1,000']
}

all_vendors = pd.DataFrame.from_dict(data)

all_vendors['cost'] = all_vendors.cost.str.replace(',','').astype(float)
print(all_vendors.cost.sum())
# Output: 5000.0

As mentioned in the comments, use str.replace to remove any commas you have in your strings.
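To make the reassignment point concrete, here is a minimal sketch, assuming a small made-up frame (the name costs and its values are illustrative, not from the question). It shows that astype returns a new Series and leaves the original column untouched until the result is assigned back.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
costs = pd.DataFrame({"cost": ["1,000", "2,500"]})

# astype returns a new Series; without assignment the column is unchanged.
costs["cost"].str.replace(",", "").astype(float)
dtype_without_assignment = costs["cost"].dtype  # still a string column

# Assigning the result back replaces the string column with floats.
costs["cost"] = costs["cost"].str.replace(",", "").astype(float)
dtype_with_assignment = costs["cost"].dtype  # float64

total = costs["cost"].sum()
print(dtype_without_assignment, dtype_with_assignment, total)
```

The same applies to most pandas methods: they return a new object rather than modifying the existing one.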

You first need to convert the DataFrame to float so that the values, including any with decimals, can be added; for this, use DataFrame.astype:

import pandas as pd
g = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
b = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
f = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
c = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
o = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
all_vendors = pd.concat([g, b, f, c, o])

If you have ',' in your strings, then you need:

all_vendors['cost'] = all_vendors['cost'].str.replace(',', '')

Then you calculate the sum:

all_vendors.astype(float).cost.sum()

Output:

500.0

If you want to keep working with the float-typed DataFrame, you need to assign it:

all_vendors2 = all_vendors.astype(float)
all_vendors2.cost.sum()

Output:

500.0
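As an alternative to casting the whole frame, the conversion can be limited to the cost column so that id stays a string identifier. The following is only a sketch under that assumption; it swaps in pd.to_numeric with errors="coerce" (not used in the answers here), which turns unparseable entries into NaN instead of raising.

```python
import pandas as pd

# Rebuild the example frame from five one-row DataFrames.
frames = [pd.DataFrame({"id": ["1515"], "cost": ["100"]}) for _ in range(5)]
all_vendors = pd.concat(frames, ignore_index=True)

# Convert only the cost column; strip thousands separators first.
# errors="coerce" yields NaN for values that cannot be parsed,
# and sum() skips NaN by default.
all_vendors["cost"] = pd.to_numeric(
    all_vendors["cost"].str.replace(",", ""), errors="coerce"
)

total = all_vendors["cost"].sum()
print(total)
```

Checking all_vendors["cost"].isna().sum() afterwards is one way to see how many values failed to parse.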

I got this to work on my end with a value of 500:

import pandas as pd

df_list = [pd.DataFrame(data={"id": ["1515"], "cost": ["100"]}) for i in range(5)]
pd.concat(df_list).cost.astype(float).sum()

So long as they're DataFrames and you convert the strings to floats, it looks good.
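The "so long as they're DataFrames" condition matters because pd.concat only accepts Series and DataFrame objects. A minimal sketch, assuming a plain dict shaped like the ones in the question (the variable names are illustrative):

```python
import pandas as pd

record = {"id": "1515", "cost": "100"}

# pd.concat rejects plain dicts inside the list with a TypeError.
try:
    pd.concat([record] * 5)
    concat_error = None
except TypeError as exc:
    concat_error = exc

# Building one DataFrame from the list of records works directly.
all_vendors = pd.DataFrame([record] * 5)
total = all_vendors["cost"].astype(float).sum()
print(type(concat_error).__name__, total)
```

When the inputs are plain records rather than existing DataFrames, building a single DataFrame from the list avoids the concat step entirely.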

Check if this helps. This will give the total cost per id.

import pandas as pd

g = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
b = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
f = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
c = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
o = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
all_vendors = pd.concat([g, b, f, c, o])

# all_vendors is already a DataFrame, so the from_records wrapper is unnecessary
a = all_vendors.astype(float).groupby('id').sum().T.to_dict()
print(a)
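With a single id the grouping is hard to see, so here is a small sketch with two made-up ids (the values are hypothetical and not from the question). It converts only cost, which keeps id as a string key in the result.

```python
import pandas as pd

# Hypothetical data with two distinct ids.
all_vendors = pd.DataFrame(
    {"id": ["1515", "1515", "2020"], "cost": ["100", "100", "50"]}
)

# Convert cost to float, then sum it within each id group.
all_vendors["cost"] = all_vendors["cost"].astype(float)
totals = all_vendors.groupby("id")["cost"].sum().to_dict()
print(totals)
# → {'1515': 200.0, '2020': 50.0}
```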
