
Divide dataframe column value by the total of the column

My question may be too easy for many of you, but I'm a beginner with Python.

I want the percentage of 1s in a column that can contain three different values (1, 0, -1), while excluding one of those values (-1) from the calculation.

I did this: df['col_name'].sum() / len(df['col_name'])

However, this also counts the -1 rows. I only want the number of 1s divided by the total number of rows, without the -1 rows in that total.

Thank you for your help.

To exclude values, replace -1 with missing values (NaN). Note that len() still counts the NaN rows, so use mean(), which skips NaN in both the sum and the count:

df['col_name'].replace(-1, np.nan).mean()
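
As a quick check of how the NaN replacement affects the denominator, here is a minimal sketch on a small made-up column (the data and the variable names `masked`, `share_all_rows`, `share_valid` and `share_mean` are assumptions for illustration, not from the question). It compares dividing by `len()`, which counts every row, with dividing by `count()`, which counts only non-missing rows:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three 1s, two 0s, two -1s
df = pd.DataFrame({'col_name': [1, 1, 0, -1, -1, 1, 0]})

# Replace -1 with NaN so pandas treats those rows as missing
masked = df['col_name'].replace(-1, np.nan)

# len() counts all 7 rows, including the NaN rows
share_all_rows = masked.sum() / len(masked)

# count() counts only the 5 non-missing rows
share_valid = masked.sum() / masked.count()

# mean() skips NaN in both the sum and the count
share_mean = masked.mean()

print(share_all_rows, share_valid, share_mean)
```

On this sample, only the last two expressions give 3 out of 5; the first one divides by all 7 rows.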

Or filter out the -1 values first, so that the length of the filtered Series is used as the denominator:

import numpy as np
import pandas as pd

# Build a reproducible sample column containing 0, 1 and -1
np.random.seed(123)
df = pd.DataFrame({'col_name':np.random.choice([0,1,-1], size=10)})

print (df)
   col_name
0        -1
1         1
2        -1
3        -1
4         0
5        -1
6        -1
7         1
8        -1
9         1

s = df.loc[df['col_name'] != -1, 'col_name']
print (s)
1    1
4    0
7    1
9    1
Name: col_name, dtype: int32

print (s.sum()/len(s))
0.75

print (s.mean())
0.75
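
If the share of every remaining value is wanted, not just the share of 1s, `value_counts(normalize=True)` on the filtered Series is one possible approach. This is a sketch only: the column values below are typed in by hand to match the printed sample, and `shares` and `pct_ones` are names introduced here for illustration.

```python
import pandas as pd

# Same values as the printed sample, entered explicitly
df = pd.DataFrame({'col_name': [-1, 1, -1, -1, 0, -1, -1, 1, -1, 1]})

# Keep only the rows that are not -1
s = df.loc[df['col_name'] != -1, 'col_name']

# Relative frequency of each remaining value (fractions sum to 1)
shares = s.value_counts(normalize=True)
print(shares)

# Share of 1s as a percentage; defaults to 0.0 if no 1 is present
pct_ones = shares.get(1, 0.0) * 100
print(pct_ones)
```

Because the column only holds 0 and 1 after filtering, the share of 1s here equals `s.mean()`.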

Assuming you have this DataFrame:

df = pd.DataFrame({
    'col_name': [1,1,0,-1,-1,1,0]
    })

   col_name
0         1
1         1
2         0
3        -1
4        -1
5         1
6         0

You would like the number of 1s divided by the number of rows that are not -1, which is 3 out of 5, correct?

# Count the rows equal to 1
numerator = (df['col_name'] == 1).sum()
# Count the rows that are not -1
denominator = (df['col_name'] != -1).sum()
print(numerator/denominator)

Output: 0.6
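
The same calculation could be wrapped in a small helper that also handles a column with nothing left after the exclusion. This is a sketch under assumptions: `share_of_value` and its parameters `target` and `exclude` are hypothetical names, and returning NaN for an empty denominator is just one reasonable choice.

```python
import pandas as pd

def share_of_value(col, target=1, exclude=-1):
    """Fraction of rows equal to `target` among rows not equal to `exclude`."""
    valid = col[col != exclude]
    if valid.empty:
        # Nothing left to divide by; report NaN instead of raising
        return float('nan')
    # Mean of a boolean Series is the fraction of True values
    return float((valid == target).mean())

df = pd.DataFrame({'col_name': [1, 1, 0, -1, -1, 1, 0]})
result = share_of_value(df['col_name'])
print(result)

# A column holding only excluded values yields NaN
empty_result = share_of_value(pd.Series([-1, -1]))
print(empty_result)
```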


 