I have a dataframe as follows:
     c1  c2  c3  c4  c5  c6  c7
0    li   1   2   1   3   2   4
1  qian   2   3   3   5   4   2
2  qian   3   5   4   3   2   4
3    li   5  23  23   2   5   2
4    li   2   5   1   4   2   4
5  zhou   3   5   1   1   1   2
I am trying to create a new column c8 that returns the grouped mean. The group method is:
groupby('c1')['c2'].transform('mean') ---c2 can be replaced by c3 to c7
My current code looks as below:
lst = [c1, c2, c3, c4,c5, c6, c7]
for i in range(len(lst)):
    res = df.groupby(df['c1'])[i].transform('mean')
    return res
df['c8'] = df[res]
The error says it cannot find c1. Can anyone tell me how to generate the grouped mean and make this loop work?
There are a few problems here:
The error you're receiving is because you've put variables in your list lst. These should be strings (surrounded by quotes).
You're iterating over the indices of lst, not the items of lst itself, e.g. on each iteration of your for-loop, your iterator i is 0, then 1, then 2, not "c1", "c2", "c3".
You have a return statement inside your for-loop. A return exits the enclosing function immediately, so the loop stops on its first iteration (and outside a function it is a syntax error). There is rarely a reason to put an unconditional return in a for-loop.
You can simply update the dataframe on each iteration of the loop, instead of storing the result in a temporary res variable.
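To make the second point concrete, here is a small sketch of the difference between looping over positions and looping over items. The shortened list and the names positions and names are only for illustration and are not part of the original code.

```python
lst = ["c1", "c2", "c3"]

# Looping over range(len(lst)) yields integer positions, not column names.
positions = [i for i in range(len(lst))]

# Looping over the list itself yields the column names.
names = [column for column in lst]

print(positions)  # [0, 1, 2]
print(names)      # ['c1', 'c2', 'c3']
```

Passing one of those integers to the groupby selection asks for a column literally named 0, 1 or 2, which this dataframe does not have.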
A working example of your for-loop method would look like this (note that it overwrites each column with its group mean):
lst = ["c2", "c3", "c4", "c5", "c6", "c7"]
for column in lst:
    df[column] = df.groupby("c1")[column].transform('mean')
print(df)
     c1        c2  c3        c4  c5  c6        c7
0    li  2.666667  10  8.333333   3   3  3.333333
1  qian  2.500000   4  3.500000   4   3  3.000000
2  qian  2.500000   4  3.500000   4   3  3.000000
3    li  2.666667  10  8.333333   3   3  3.333333
4    li  2.666667  10  8.333333   3   3  3.333333
5  zhou  3.000000   5  1.000000   1   1  2.000000
Even better though, you can supply all of the columns you want to calculate the mean of at once without having to explicitly loop:
lst = ["c2", "c3", "c4", "c5", "c6", "c7"]
average_df = df.groupby("c1")[lst].transform("mean")
print(average_df)
         c2    c3        c4   c5   c6        c7
0  2.666667  10.0  8.333333  3.0  3.0  3.333333
1  2.500000   4.0  3.500000  4.0  3.0  3.000000
2  2.500000   4.0  3.500000  4.0  3.0  3.000000
3  2.666667  10.0  8.333333  3.0  3.0  3.333333
4  2.666667  10.0  8.333333  3.0  3.0  3.333333
5  3.000000   5.0  1.000000  1.0  1.0  2.000000
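If you would rather keep the original values and add the grouped means as new columns, one possible approach is to suffix the transformed columns and join them back on the index. This is only a sketch: the "_mean" suffix and the names value_columns, group_means and result are my own choices, and the dataframe is rebuilt from the table in the question.

```python
import pandas as pd

# Rebuild the example dataframe from the question.
df = pd.DataFrame({
    "c1": ["li", "qian", "qian", "li", "li", "zhou"],
    "c2": [1, 2, 3, 5, 2, 3],
    "c3": [2, 3, 5, 23, 5, 5],
    "c4": [1, 3, 4, 23, 1, 1],
    "c5": [3, 5, 3, 2, 4, 1],
    "c6": [2, 4, 2, 5, 2, 1],
    "c7": [4, 2, 4, 2, 4, 2],
})

value_columns = ["c2", "c3", "c4", "c5", "c6", "c7"]

# transform("mean") returns one row per original row, so the index lines up with df.
group_means = df.groupby("c1")[value_columns].transform("mean")

# Rename the mean columns (c2 -> c2_mean, ...) and attach them next to the originals.
result = df.join(group_means.add_suffix("_mean"))
print(result)
```

Because transform keeps the original index, the join needs no key column, and df itself is left unchanged.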