For each distinct value in a given column, count the null and non-null values in another column

Suppose I have the following dataframe:

import numpy as np
import pandas as pd
df = pd.DataFrame({'col1': ['x','y','z','x','x','x','y','z','y','y'],
                   'col2': [np.nan,'n1',np.nan,np.nan,'n3','n2','n5',np.nan,np.nan,np.nan]})

For each distinct element in col1, I want to count how many null and non-null values there are in col2 and summarise the result in a new dataframe. So far I have used df1 = df[df['col1']=='x'] and then

print(df1[df1['col2'].isna()].shape[0],
df1[df1['col2'].notna()].shape[0])

I then manually changed the filter to df1 = df[df['col1']=='y'] and df1 = df[df['col1']=='z'] and reran the count each time. This does not scale to many distinct values. The table I want should look like the following:

  col1  value  no value
0    x      2         2
1    y      2         2
2    z      0         2

I have also tried df.groupby('col1').col2.nunique(), but that only counts the distinct non-null values in each group, so it says nothing about the nulls.

Let us try crosstab to create a frequency table where the index is the unique values in column col1 and columns represent the corresponding counts of non-nan and nan values in col2 :

out = pd.crosstab(df['col1'], df['col2'].isna())
out.columns = ['value', 'no value']

>>> out

      value  no value
col1                 
x         2         2
y         2         2
z         0         2
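One caveat with assigning out.columns directly: crosstab only creates a column for each value that actually occurs, so if col2 happened to contain no nulls at all (or only nulls), the two-name assignment would fail with a length mismatch. A minimal sketch of a more defensive variant follows; the helper name null_summary and its parameters are hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd

def null_summary(frame, key, target):
    """Hypothetical helper: per-key counts of non-null / null values in target."""
    return (pd.crosstab(frame[key], frame[target].isna())
              # force both columns to exist, even if one never occurs
              .reindex(columns=[False, True], fill_value=0)
              .rename(columns={False: 'value', True: 'no value'})
              .rename_axis(None, axis=1)   # drop the 'col2' label on the columns
              .reset_index())              # turn the key back into a column

df = pd.DataFrame({
    'col1': ['x', 'y', 'z', 'x', 'x', 'x', 'y', 'z', 'y', 'y'],
    'col2': [np.nan, 'n1', np.nan, np.nan, 'n3', 'n2', 'n5', np.nan, np.nan, np.nan],
})
res = null_summary(df, 'col1', 'col2')
print(res)

# A frame without any nulls still gets a 'no value' column of zeros.
full = pd.DataFrame({'col1': ['a', 'b'], 'col2': ['p', 'q']})
res_full = null_summary(full, 'col1', 'col2')
print(res_full)
```

Renaming by the False/True labels instead of by position also means the result does not depend on the column order.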

Use Series.isna with SeriesGroupBy.value_counts for the counts, then reshape with Series.unstack and do some data cleaning:

df = (df['col2'].isna()
        .groupby(df['col1'])
        .value_counts()
        .unstack(fill_value=0)
        .reset_index()
        .rename_axis(None, axis=1)
        .rename(columns={False: 'value', True: 'no value'}))
print(df)
  col1  value  no value
0    x      2         2
1    y      2         2
2    z      0         2
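For comparison, the same table can probably also be built from two plain group aggregations: count() skips NaN while size() includes it, so their difference is the number of nulls per group. This is only a sketch under the assumption that the frame looks like the one in the question; the variable names grouped, non_null, total and summary are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['x', 'y', 'z', 'x', 'x', 'x', 'y', 'z', 'y', 'y'],
    'col2': [np.nan, 'n1', np.nan, np.nan, 'n3', 'n2', 'n5', np.nan, np.nan, np.nan],
})

grouped = df.groupby('col1')['col2']
non_null = grouped.count()   # number of non-null entries per group
total = grouped.size()       # number of rows per group, nulls included

summary = non_null.rename('value').to_frame()
summary['no value'] = total - non_null   # whatever is left must be null
summary = summary.reset_index()
print(summary)
```

Unlike nunique(), which counts distinct non-null values, count() counts every non-null entry, so repeated values in col2 are still counted individually.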

