Conditional sum of a column in Python pandas

I'm very new to Python and pandas and was looking for some help. I'm working off of a CSV, and trying to use pandas to calculate totals for each name based on the value of the middle column. I want the sum of 'Count' for each name for 'GEN' and 'NPR' added together. This is my dataset as a CSV:

StartingCSV.csv:

Name, Specialty, Count
Smith, GEN, 1
Smith, INT, 2
Smith, NPR, 5
Smith, PSC, 4
Zane, GEN, 3
Zane, PSC, 4
Zane, NPR, 4
Charles, NPR, 4
Charles, AUD, 4

Desired output:

Smith: 6
Zane: 7
Charles: 4

This is what I have so far:

import csv
import pandas as pd
df = pd.read_csv("StartingCSV.csv")
newdf = df.groupby("Name")
newdf.apply(lambda x: x[x['Specialty'] == 'NPR']['Count'].sum())

This is the output I get:

Smith: 5
Zane: 4
Charles: 4

This returns the NPR total for each name, but I can't figure out a way to ADD the NPR values to the GEN values for each name, to create the desired output as listed above. Trying to add an "or" after 'NPR' in the final line returns an error like this:

newdf.apply(lambda x: x[x['Specialty'] == 'NPR' or 'GEN']['Count'].sum())

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Any help is appreciated! (Sorry if I'm way off base or if this is unclear.)

You can do it this way:

df[df['Specialty'].isin(['NPR','GEN'])].groupby('Name')['Count'].sum().reset_index()

With df[df['Specialty'].isin(['NPR','GEN'])] we keep only those rows of the dataframe which have the value 'NPR' or 'GEN' in column 'Specialty'. After that it is the usual groupby and sum. Selecting ['Count'] before sum() restricts the total to the numeric column, so the 'Specialty' strings are not summed as well.

Output:

Name    Count
Charles   4
Smith     6
Zane      7
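
For anyone who wants to reproduce this end to end, here is a minimal sketch using the sample data from the question. Two things in it are assumptions rather than part of the original post: the CSV is read from an in-memory string via io.StringIO instead of StartingCSV.csv, and skipinitialspace=True is passed because the sample file has a space after every comma (without it the column would be named " Specialty" and the values would be " GEN", " NPR", and so on). It also shows why the attempt with `or` failed: Python's `or` asks for a single truth value of the whole Series, which pandas refuses to give, whereas `|` and isin() compare row by row.

```python
import io
import pandas as pd

# Sample data copied from the question; io.StringIO stands in for StartingCSV.csv.
csv_text = """Name, Specialty, Count
Smith, GEN, 1
Smith, INT, 2
Smith, NPR, 5
Smith, PSC, 4
Zane, GEN, 3
Zane, PSC, 4
Zane, NPR, 4
Charles, NPR, 4
Charles, AUD, 4
"""

# skipinitialspace=True drops the blank after each comma, so the column is
# named "Specialty" (not " Specialty") and the values are "GEN" (not " GEN").
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Element-wise membership test: one True/False per row.
mask_isin = df["Specialty"].isin(["NPR", "GEN"])

# The same mask written with the element-wise OR operator `|`.
# Each comparison needs its own parentheses because `|` binds tighter than `==`.
mask_or = (df["Specialty"] == "NPR") | (df["Specialty"] == "GEN")

# Filter first, then sum only the numeric column per name.
totals = df[mask_isin].groupby("Name")["Count"].sum()
print(totals)
```

isin() is usually the more convenient choice once there are more than two values to match; the `|` form is shown only to make the connection to the original `or` attempt explicit.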
