
Pandas DataFrame: CDF of a column's values with a condition

OK I have the following dataframe.

import pandas as pd
import numpy as np
import seaborn as sns

df = pd.DataFrame(np.random.randint(0,100,size=(100,)), columns=['marks'])

Then I can plot the distribution of marks overall like this:

sns.displot(data=df, x='marks', kind='ecdf')

Output:

[image: plot output]

Now I want a CDF of marks 45 and above.

sns.displot(data=df, x=df['marks']>45, kind='ecdf')

[image: plot output]

I must be doing this wrong. What am I missing?

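The expression df["marks"] > 45 does not filter the DataFrame. It returns a boolean Series with one True/False value per row, and that is what seaborn ends up plotting: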

0      True
1      True
2      True
3     False
4     False
      ...  
95    False
96     True
97    False
98     True
99    False
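
To make the difference between the mask and the filtered rows concrete, here is a minimal sketch. The seeded generator and the names mask and filtered are only illustrative choices, not part of the original question.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption for illustration).
rng = np.random.default_rng(0)
df = pd.DataFrame({"marks": rng.integers(0, 100, size=100)})

mask = df["marks"] > 45   # boolean Series, one True/False per row
filtered = df[mask]       # keeps only the rows where the mask is True

print(mask.dtype)                  # dtype of the mask itself
print(len(mask), len(filtered))    # the mask has as many entries as df; filtered has fewer or equal
```

Passing mask as x plots the distribution of True/False values, while passing filtered plots the distribution of the marks themselves.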

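Instead, use the boolean Series to select the matching rows first, then plot the result (use >= 45 if a mark of exactly 45 should be included):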

new_df = df[(df["marks"] > 45)]
sns.displot(data=new_df, kind='ecdf')
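
If you want to check the numbers behind the plot, the empirical CDF of the filtered marks can be computed by hand. This is a rough sketch with a small hand-written sample; the helper name ecdf is hypothetical and is not a pandas or seaborn function.

```python
import numpy as np
import pandas as pd

def ecdf(values):
    """Return the sorted values and the cumulative proportion at each one."""
    x = np.sort(np.asarray(values))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

# Small illustrative sample instead of random data.
df = pd.DataFrame({"marks": [10, 50, 46, 90, 45, 70]})

# Keep only marks strictly above 45, then compute the ECDF of what is left.
above = df.loc[df["marks"] > 45, "marks"]
x, y = ecdf(above)

print(x)  # sorted marks above 45
print(y)  # cumulative proportions, ending at 1.0
```

Note that the proportions are relative to the filtered rows only, so the curve still runs from 0 to 1 even though lower marks were dropped.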

