Given this data frame:
import pandas as pd
d = {'SITE': ['AB', 'ON', 'YO', 'YO', 'AB'],
     'MARK': ['ss', 'ss', 'tt', 'ss', 'tt'],
     'SIZE': [4, 5, 2, 3, 4]}
ex_df = pd.DataFrame(data=d)
To get the SIZE sum for only the rows where SITE == 'AB', one can slice out the AB-only rows using AB_df = ex_df[ex_df.SITE == 'AB'] and then call AB_df.SIZE.sum(), which is 8.
However, suppose a similar data frame has 10,000+ rows, 12 columns, and over 40 unique SITE strings.
Q1: How can you get the SIZE sum for each SITE without having to write 40 lines of the same code as above (changing only the SITE name)?
Q2: How can you add more conditions, such as checking that a condition matches two columns (SITE and MARK), and then get the SIZE sum, again without writing 40 lines of repetitive code?
I'd like to save the result in a list containing the sums, in a dictionary mapping each site to its sum ({AB: 8, ON: 5, ...}), or even in a new data frame with that information.
I've tried to use a list of the 40 unique sites to iterate through the data frame column, but without success, given length differences, etc.
I'm looking to make this pythonic ideally. Thanks!
Q1 can be accomplished with a groupby in pandas:
grouped_df = ex_df.groupby('SITE').agg({'SIZE': 'sum'})
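Since the question asks for the result as a list, a dictionary, or a new data frame, here is a minimal sketch of how the grouped sums could be converted into each of those. The variable names (site_sums, sums_dict, sums_list, sums_df) are made up for illustration and are not from the original post.

```python
import pandas as pd

ex_df = pd.DataFrame({
    'SITE': ['AB', 'ON', 'YO', 'YO', 'AB'],
    'MARK': ['ss', 'ss', 'tt', 'ss', 'tt'],
    'SIZE': [4, 5, 2, 3, 4],
})

# One sum per SITE; the result is a Series indexed by SITE.
site_sums = ex_df.groupby('SITE')['SIZE'].sum()

sums_dict = site_sums.to_dict()    # {'AB': 8, 'ON': 5, 'YO': 5}
sums_list = site_sums.tolist()     # [8, 5, 5], in sorted SITE order
sums_df = site_sums.reset_index()  # data frame with columns SITE and SIZE
```

The same calls should work unchanged with 40 or more unique sites, because groupby discovers the groups from the data rather than from a hand-written list.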
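To accomplish Q2, you likely need a custom function. Note that .agg hands the function one column at a time as a Series, so pass it to .apply instead, which hands over each group as a data frame; something like:
def my_filter(df: pd.DataFrame):
    # Filters can be modified as needed; each comparison needs its own
    # parentheses because & binds more tightly than ==
    mask = df['SITE'].str.startswith('A') & (df['MARK'] == 'tt')
    return df.loc[mask, 'SIZE'].sum()
# Note: pandas 2.2+ deprecates passing the grouping column (SITE) to the applied function
grouped_df = ex_df.groupby('SITE').apply(my_filter)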
However, if your goal for Q2 is simply to group the rows by SITE AND MARK, you can do:
grouped_df = ex_df.groupby(['SITE', 'MARK']).agg({'SIZE': 'sum'})
Then you don't have to worry about writing a custom filtering function.
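As a rough sketch of how the two-column grouping could feed the dictionary or data frame the question asks for, the example below sums SIZE per (SITE, MARK) pair. It also shows one possible alternative to a custom function: filter the rows with a boolean mask first, then group. The names pair_sums, pair_dict, pair_df, and tt_sums are illustrative only.

```python
import pandas as pd

ex_df = pd.DataFrame({
    'SITE': ['AB', 'ON', 'YO', 'YO', 'AB'],
    'MARK': ['ss', 'ss', 'tt', 'ss', 'tt'],
    'SIZE': [4, 5, 2, 3, 4],
})

# Sum SIZE for every (SITE, MARK) pair that occurs in the data.
pair_sums = ex_df.groupby(['SITE', 'MARK'])['SIZE'].sum()

pair_dict = pair_sums.to_dict()    # keys are (SITE, MARK) tuples
pair_df = pair_sums.reset_index()  # flat data frame: SITE, MARK, SIZE

# If only one MARK value matters, filter the rows first, then group by SITE.
tt_sums = ex_df[ex_df['MARK'] == 'tt'].groupby('SITE')['SIZE'].sum().to_dict()
```

Filtering before grouping keeps the condition in ordinary boolean indexing, so sites with no matching rows simply do not appear in tt_sums.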
IIUC, this should add a column that holds, on each row, the SIZE sum for that row's SITE:
ex_df['Max'] = ex_df.groupby(['SITE'])['SIZE'].transform('sum')
If not, please clarify so I can help further.
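To illustrate how transform differs from agg, the sketch below runs it on the example data. Unlike agg, transform returns one value per original row, so the result can be assigned straight back as a column. The column name SITE_TOTAL is a hypothetical choice used here instead of the name in the answer above.

```python
import pandas as pd

ex_df = pd.DataFrame({
    'SITE': ['AB', 'ON', 'YO', 'YO', 'AB'],
    'MARK': ['ss', 'ss', 'tt', 'ss', 'tt'],
    'SIZE': [4, 5, 2, 3, 4],
})

# Each row receives the SIZE total of the SITE it belongs to.
ex_df['SITE_TOTAL'] = ex_df.groupby('SITE')['SIZE'].transform('sum')

print(ex_df['SITE_TOTAL'].tolist())  # [8, 5, 5, 5, 8]
```

This form is useful when the per-site total is needed alongside the original rows, for example to compute each row's share of its site; for a one-row-per-site summary, a plain groupby sum is the simpler fit.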