How to apply sum() to a column of a df concatenated with other dfs

Question

I am using concat to merge five identically structured DataFrames into one, and I want the total sum() of the cost column.

These values are not real; they are only an example of what each df looks like.

What I tried:

import pandas as pd

g = {"id": "1515", "cost": "100"}
b = {"id": "1515", "cost": "100"}
f = {"id": "1515", "cost": "100"}
c = {"id": "1515", "cost": "100"}
o = {"id": "1515", "cost": "100"}

all_vendors = pd.concat([g, b, f, c, o])

Data types

all_vendors.dtypes

Campaign          object
campaignid       float64
Campaign_name     object
Cost              object
Month             object
Year & month      object
dtype: object

Attempts

Attempt #1:

all_vendors.Cost.sum()

Results in:

TypeError: can only concatenate str (not "float") to str

Attempt #2:

all_vendors.Cost.astype(str)
all_vendors.Cost.sum()

Results in:

TypeError: can only concatenate str (not "float") to str

Attempt #3:

all_vendors.Cost.astype(float)
all_vendors.Cost.sum()

Results in:

ValueError: could not convert string to float: '100'

Answer 1

Your problem is that you're not assigning the result of your astype call back to your DataFrame:

import pandas as pd

data = {
  "id": ['1,515','1,515','1,515','1,515','1,515'],
  "cost": ['1,000','1,000','1,000','1,000','1,000']
}

all_vendors = pd.DataFrame.from_dict(data)

all_vendors['cost'] = all_vendors.cost.str.replace(',','').astype(float)
print(all_vendors.cost.sum())
# Output: 5000.0

As mentioned in the comments, use str.replace to remove any commas from your strings.
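To make the "assign it back" point concrete, here is a minimal sketch. The sample values are made up for illustration; the only claim is that astype and str.replace return new objects rather than modifying the column in place.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical sample data, shaped like the cost column in this answer.
all_vendors = pd.DataFrame({"cost": ["1,000", "2,500", "300"]})

# Without assignment the converted Series is discarded and the column is unchanged.
all_vendors["cost"].str.replace(",", "").astype(float)
numeric_before = is_numeric_dtype(all_vendors["cost"])  # still a text column

# Assigning the result back is what actually changes the column.
all_vendors["cost"] = all_vendors["cost"].str.replace(",", "").astype(float)
numeric_after = is_numeric_dtype(all_vendors["cost"])

total = all_vendors["cost"].sum()
print(numeric_before, numeric_after, total)
```

This is why Attempts #2 and #3 in the question had no effect on the later sum() call: the converted column was never stored.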

Answer 2

You first need to convert the DataFrame to float to be able to add numbers with decimals. For that, use DataFrame.astype:

import pandas as pd
g = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
b = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
f = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
c = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
o = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
all_vendors = pd.concat([g, b, f, c, o])

If you have ',' in your strings, then you need:

all_vendors['cost']=all_vendors['cost'].str.replace(',','')

Then calculate the sum:

all_vendors.astype(float).cost.sum()

Output:

500.0

If you want to keep working with the float-typed DataFrame, you need to assign it:

all_vendors2=all_vendors.astype(float)
all_vendors2.cost.sum()

Output:

500.0
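If astype(float) still raises a ValueError, it can help to find out which values are not parseable. The following is a rough sketch using pd.to_numeric with errors="coerce", a different technique from the one in this answer; the sample values are invented and are not the asker's data.

```python
import pandas as pd

# Hypothetical column: one clean value, one with a thousands separator,
# and one that is not a number at all.
cost = pd.Series(["100", "1,000", "n/a"])

# Strip commas literally (no regex), then convert; unparseable values become NaN.
cleaned = cost.str.replace(",", "", regex=False)
numeric = pd.to_numeric(cleaned, errors="coerce")

# Original values that failed to convert, useful for inspecting bad input.
bad_rows = cost[numeric.isna()]

# sum() skips NaN by default.
total = numeric.sum()
print(bad_rows.tolist(), total)
```

Whether silently skipping bad values is acceptable depends on the data; inspecting bad_rows first is usually the safer choice.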

Answer 3

I got this to work on my end, with a result of 500:

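import pandas as pd

# Build five identical one-row DataFrames, concatenate them, then sum cost as floats.
df_list = [pd.DataFrame(data={"id": ["1515"], "cost": ["100"]}) for i in range(5)]
pd.concat(df_list).cost.astype(float).sum()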

As long as they are DataFrames and you convert the strings to floats, it works.
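For comparison, this sketch shows what sum() does when the strings are not converted. It assumes the column holds plain Python strings (object dtype, as in the question's dtypes output); the explicit astype(object) is only there to make that assumption hold on any pandas version.

```python
import pandas as pd

df_list = [pd.DataFrame({"id": ["1515"], "cost": ["100"]}) for _ in range(5)]
combined = pd.concat(df_list, ignore_index=True)

# On an object column of strings, sum() joins the strings end to end.
as_text = combined["cost"].astype(object).sum()

# After conversion, sum() adds the numbers.
as_number = combined["cost"].astype(float).sum()

print(repr(as_text), as_number)
```

The TypeError in the question is consistent with a column that mixes strings and floats (for example missing values), since Python cannot add a str and a float; that is a guess about the asker's data, not something the question confirms.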

Answer 4

Check if this helps. This gives the total per id.

import pandas as pd

g = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
b = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
f = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
c = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
o = pd.DataFrame({"id": ["1515"], "cost": ["100"]})
all_vendors = pd.concat([g, b, f, c, o])

a=pd.DataFrame.from_records(all_vendors).astype(float).groupby('id').sum().T.to_dict()
print(a)
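A possibly simpler variant of the same idea, sketched under the assumption that id should stay a text label and only cost needs converting. The extra "2020" row is hypothetical, added only so the grouping has more than one key.

```python
import pandas as pd

frames = [pd.DataFrame({"id": ["1515"], "cost": ["100"]}) for _ in range(5)]
# Hypothetical extra vendor row so there are two groups.
frames.append(pd.DataFrame({"id": ["2020"], "cost": ["50"]}))
all_vendors = pd.concat(frames, ignore_index=True)

# Convert only the cost column; id remains a string.
all_vendors["cost"] = all_vendors["cost"].astype(float)

# Sum cost within each id and return a plain dict.
totals = all_vendors.groupby("id")["cost"].sum().to_dict()
print(totals)
```

Selecting the cost column before sum() avoids converting id to a number, which the version above does as a side effect of calling astype(float) on the whole frame.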


Answer 1
2 accepted 2019-10-18 22:10:13

Answer 2
1 2019-10-18 22:08:00

Answer 3
1 2019-10-18 22:08:34

Answer 4
1 2019-10-18 22:12:20
