
How to calculate percentage of missing values

Edit: Apologies, I left out an important grouping in my data. Thanks to those who have already helped.

I have a data set with missing data, and I have filled the missing values with 0. Using Python and pandas, I am trying to get a metric for each team: the % of apps they are working on that are complete. My thought was to group by Col A and then do counts on Col C, but I can't figure out how to get the count of complete rows and the count of total rows needed for the calculation. Any ideas are much appreciated.

So I want something that looks like this:

  Team A  App1 High 0%
  Team A  App3 Med  100%
  Team B  App2 Med  0%
  And so on. 

My df looks like the following:

  +--------+-------+-------+----------+
  | Col A  | Col B | Col C |  Col D   |
  +--------+-------+-------+----------+
  | Team A | App1  | High  | 0        |
  | Team A | App1  | High  | 0        |
  | Team A | App3  | Med   | Complete |
  | Team B | App2  | Med   | 0        |
  | Team B | App2  | High  | Complete |
  | Team C | App1  | Low   | Complete |
  +--------+-------+-------+----------+
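
# Number of rows in each (Col A, Col B, Col C) group where Col D is 0 (missing)
df['count'] = df.groupby(['Col A', 'Col B', 'Col C'])['Col D'].transform(lambda x: (x==0).sum())
# Share of such rows in each group, formatted as a percentage string
df['share'] = df.groupby(['Col A', 'Col B', 'Col C'])['Col D'].transform(lambda x: '{:.2f}%'.format((x==0).sum()/len(x)*100))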

yields:

      Col A    Col B    Col C       Col D count    share
0   Team A    App1     High             0     2  100.00%
1   Team A    App1     High             0     2  100.00%
2   Team A    App3     Med      Complete      0    0.00%
3   Team B    App2     Med              0     1  100.00%
4   Team B    App2     High     Complete      0    0.00%
5   Team C    App1     Low      Complete      0    0.00%

or just:

df.groupby(['Col A', 'Col B', 'Col C'])['Col D'].apply(lambda x: '{:.2f}%'.format((x==0).sum()/len(x)*100))

Col A     Col B    Col C  
 Team A    App1     High      100.00%
           App3     Med         0.00%
 Team B    App2     High        0.00%
                    Med       100.00%
 Team C    App1     Low         0.00%
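
Note that the snippets above report the share of rows equal to 0, i.e. the missing share. If what is needed is the share that is complete, as in the desired output of the question, a minimal sketch could look like the following. It assumes the filled-in zeros are the integer 0 and completed rows hold the string 'Complete'; the names `is_complete` and `pct_complete` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question; missing values were filled with 0.
df = pd.DataFrame({
    'Col A': ['Team A', 'Team A', 'Team A', 'Team B', 'Team B', 'Team C'],
    'Col B': ['App1', 'App1', 'App3', 'App2', 'App2', 'App1'],
    'Col C': ['High', 'High', 'Med', 'Med', 'High', 'Low'],
    'Col D': [0, 0, 'Complete', 0, 'Complete', 'Complete'],
})

keys = ['Col A', 'Col B', 'Col C']

# Flag completed rows, then average the flags per group:
# the mean of a boolean column is the fraction of True values.
result = (
    df.assign(is_complete=df['Col D'].eq('Complete'))  # hypothetical helper column
      .groupby(keys)['is_complete']
      .mean()
      .mul(100)
      .reset_index(name='pct_complete')                # hypothetical column name
)
print(result)
```

Keeping the result numeric, rather than formatting it as a string inside the lambda, leaves it sortable and usable in further calculations. The percent sign can be added at display time if needed.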
