How to calculate the sum of conditional cells in Excel and populate another column with the results

Question

EDIT: Using Excel's Advanced Filter (under the Data tab), I have been able to create a list of unique company names, and I can now SUMIF based on the cell containing the company's name!

Disclaimer: Any Python solutions would be greatly appreciated as well, pandas specifically!

I have 60,000 rows of data containing information about grants awarded to companies.

I am planning to create a Python dictionary that stores each unique company name along with its total grant $ (agreemen_2) and location coordinates. Then I want to display this using Dash (Plotly) on a live MapBox map of Canada.

First things first: how do I calculate and store the total value that was awarded to each company?

I have seen SUMIF in other solutions, but I am unsure how to output the result to a new column, if that makes sense.

One potential solution I thought of was to create a new column of unique company names and, next to it, SUMIF all the matching cells in column D.

PYTHON STUFF SO FAR

With the code below, I take a much messier-looking spreadsheet, drop duplicates, sort by company name, and create a new pandas DataFrame with the relevant data columns:

corp_df is the cleaned-up new DataFrame that I want to work with.

recipien_4 is the company's unique ID number; as you can see, it repeats with each grant awarded. Folia Biotech in the screenshot shows a duplicate grant, as confirmed by a column I did not include in the screenshot. There are quite a few duplicates, as seen in the screenshot.

import pandas as pd

in_file = '2019-20 Grants and Contributions.csv'

# create dataframe
df = pd.read_csv(in_file)

# sort by company name (recipien_2)
df.sort_values("recipien_2", inplace=True)

# remove rows with a duplicate agreemen_1, keeping the first occurrence
df.drop_duplicates(subset='agreemen_1', keep='first', inplace=True)

# full name, id, grant $, longitude, latitude
corp_df = df[['recipien_2', 'recipien_4', 'agreemen_2', 'longitude', 'latitude']]

corp_dict = {}

# create a dict with one entry per corporation name, all values 0
for name in corp_df['recipien_2']:
    if name not in corp_dict:
        corp_dict[name] = 0

Any tips or tricks would be greatly appreciated. .itertuples() didn't seem like a good solution, as I am unsure how to filter and compare data, or whether datatypes are preserved. But feel free to prove me wrong, haha.

I thought perhaps there was a better way to tackle this problem directly in Excel, rather than iterating through the rows of a pandas DataFrame. This is a pretty open question, so thank you for any help or direction you think is best!
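
On the .itertuples() concern above: the following is only a minimal sketch with made-up sample rows (the company names and amounts are hypothetical; the column names follow the question's CSV). It accumulates a per-company total in a dictionary. itertuples() yields one namedtuple per row, so each value keeps the type of its own column.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real 60,000-row CSV.
df = pd.DataFrame({
    "recipien_2": ["Acme Corp", "Acme Corp", "Folia Biotech"],
    "agreemen_2": [1000.0, 2500.0, 400.0],
})

corp_dict = {}

# Each row is a namedtuple; fields are accessed by column name.
for row in df.itertuples(index=False):
    # Start from 0 the first time a company is seen, then add the grant.
    corp_dict[row.recipien_2] = corp_dict.get(row.recipien_2, 0) + row.agreemen_2

print(corp_dict)
```

This works, but a loop over 60,000 rows is usually slower and more verbose than a grouped sum, so it is shown mainly to answer the datatype question.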

Answer 1

Using groupby followed by sum may work best for you:

corp_df = df.groupby(['recipien_2', 'longitude', 'latitude'])['agreemen_2'].sum()

#if you want to transform the index into columns you can add this after as well:
corp_df=corp_df.reset_index()
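
To get from a grouped sum to the dictionary described in the question, one possible sketch is below. The sample rows and the names total_grant and summary are assumptions for illustration, and it assumes every row for a given company carries the same coordinates, so taking the first value per group is safe.

```python
import pandas as pd

# Hypothetical sample rows; column names follow the question's CSV.
df = pd.DataFrame({
    "recipien_2": ["Acme Corp", "Acme Corp", "Folia Biotech"],
    "agreemen_2": [1000.0, 2500.0, 400.0],
    "longitude": [-75.7, -75.7, -71.2],
    "latitude": [45.4, 45.4, 46.8],
})

# One row per company: summed grants plus the first coordinates seen.
summary = df.groupby("recipien_2").agg(
    total_grant=("agreemen_2", "sum"),
    longitude=("longitude", "first"),
    latitude=("latitude", "first"),
)

# Nested dict keyed by company name:
# {name: {"total_grant": ..., "longitude": ..., "latitude": ...}}
corp_dict = summary.to_dict(orient="index")
print(corp_dict["Acme Corp"])
```

Grouping on the company name alone avoids splitting one company into several groups if its coordinates differ slightly between rows.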

Answer 2

I can see that you are using pandas to read the CSV file, so you can use the groupby method.

You can create a new DataFrame that groups the rows by company name and sums the grant amounts, like this:

dfnew = df.groupby('recipien_2')['agreemen_2'].sum().reset_index()

Then dfnew has the values.
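
If the goal is closer to Excel's SUMIF filled down a new column, a rough pandas equivalent is groupby followed by transform. This is a sketch on hypothetical sample rows; company_total is an assumed column name, not something from the original data.

```python
import pandas as pd

# Hypothetical sample rows; column names follow the question's CSV.
df = pd.DataFrame({
    "recipien_2": ["Acme Corp", "Acme Corp", "Folia Biotech"],
    "agreemen_2": [1000.0, 2500.0, 400.0],
})

# Like SUMIF filled down a column: every row gets its company's total.
df["company_total"] = df.groupby("recipien_2")["agreemen_2"].transform("sum")

# Like "unique names with SUMIF next to them": one row per company.
totals = df.groupby("recipien_2", as_index=False)["agreemen_2"].sum()

print(df)
print(totals)
```

transform keeps the original row count, while the plain grouped sum collapses to one row per company, so which one to use depends on whether the per-grant rows are still needed.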

pandas groupby documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html

Answer 1
1 2020-07-06 15:29:50

Answer 2
1 Accepted 2020-07-06 15:32:15
