For example, I have a dataframe of four people, divided into groups A and B. I want to select the rows in group B, halve their Point, and assign the result to a new column named 'Point_new'.
I am using code like the following:
import pandas as pd
data = {'Name':['Tom', 'Nick', 'Krish', 'Jack'],
'Group':['A', 'B', 'B', 'A'],
'Point':[20, 21, 19, 18]}
df = pd.DataFrame(data)
df['Point_new'] = ''
df[df['Group']=='B']['Point_new'] = df[df['Group']=='B']['Point'] / 2
In the output of the above code, the Point_new column is not filled with the calculated result. Why is that, and how can I do it properly?
You can do it like this; the key is to use loc:
import pandas as pd
data = {'Name':['Tom', 'Nick', 'Krish', 'Jack'],
'Group':['A', 'B', 'B', 'A'],
'Point':[20, 21, 19, 18]}
df = pd.DataFrame(data)
df.loc[(df['Group'] == 'B'),'Point_new'] = df.loc[df['Group']=='B','Point'] / 2
print(df)
    Name Group  Point  Point_new
0    Tom     A     20        NaN
1   Nick     B     21       10.5
2  Krish     B     19        9.5
3   Jack     A     18        NaN
It's because df[df['Group']=='B'] returns a copy of the selected rows, so the chained assignment sets the value on that temporary copy rather than on df itself. See more info here: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
Try this:
df.loc[df['Group']=='B', 'Point_new'] = df[df['Group']=='B']['Point'] / 2
Which yields:
| | Name | Group | Point | Point_new |
|---|---|---|---|---|
| 0 | Tom | A | 20 | |
| 1 | Nick | B | 21 | 10.5 |
| 2 | Krish | B | 19 | 9.5 |
| 3 | Jack | A | 18 | |
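To make the difference concrete, here is a minimal sketch that runs both forms on the question's data. The names `mask` and `chained_created_column` are introduced only for illustration, and the empty-string initialisation of Point_new is left out so that the effect of each assignment is easy to see. The exact warning pandas emits for the chained form depends on the pandas version, so it is simply silenced here.

```python
import warnings

import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'Nick', 'Krish', 'Jack'],
    'Group': ['A', 'B', 'B', 'A'],
    'Point': [20, 21, 19, 18],
})

# Hypothetical helper name: boolean mask selecting the group B rows.
mask = df['Group'] == 'B'

# Chained indexing: df[mask] builds a new DataFrame, and the column
# assignment lands on that temporary object, so df is left unchanged.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # silence the chained-assignment warning
    df[mask]['Point_new'] = df[mask]['Point'] / 2
chained_created_column = 'Point_new' in df.columns

# A single .loc call selects rows and column in one step and writes to df.
df.loc[mask, 'Point_new'] = df.loc[mask, 'Point'] / 2

print(chained_created_column)
print(df)
```

After the chained form, df has no Point_new column at all; after the .loc form, the group B rows hold the halved points and the group A rows are NaN.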
You can try:
df = df.assign(Point_New = df[df.Group.eq('B')].Point.div(2))
    Name Group  Point  Point_New
0    Tom     A     20        NaN
1   Nick     B     21       10.5
2  Krish     B     19        9.5
3   Jack     A     18        NaN
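The group A rows come out as NaN because assign aligns the new values on the index: the halved Series only carries the labels of the group B rows. The sketch below spells that out; `halved`, `result` and `via_where` are illustrative names, and the `Series.where` variant is an alternative not taken from the answers above.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'Nick', 'Krish', 'Jack'],
    'Group': ['A', 'B', 'B', 'A'],
    'Point': [20, 21, 19, 18],
})

# Halved points for group B only; this Series keeps index labels 1 and 2.
halved = df.loc[df['Group'].eq('B'), 'Point'].div(2)

# assign returns a new DataFrame and aligns `halved` on the index,
# so rows without a matching label (group A) become NaN.
result = df.assign(Point_New=halved)

# Alternative sketch: halve every row, then keep values only where the
# condition holds; the other rows are replaced with NaN.
via_where = df['Point'].div(2).where(df['Group'].eq('B'))

print(result)
```

Note that assign does not modify df in place, which is why the answer reassigns the result back to df.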