I am very new to both pandas and python. I did find this
Pandas Count Unique Occurances by Month
on stackoverflow that is similar to my question, but I can't seem to manipulate the answer to address my problem so here goes:
I have a list of servers, and each server has a number of patches that apply to that server. The excel looks a bit like this:
Server Name - Missing Patch
Server 1 ---- file
Server 1 ---- reg
Server 1 ---- file
Server 2 ---- reg
Server 2 ---- file
Server 2 ---- file
Server 2 ---- file
Server 3 ---- reg
Server 3 ---- reg
Server 3 ---- file
Server 3 ---- reg
Server 3 ---- unicorns
I would like to make a dataframe that has one column of unique servers and one column counting how many missing patches there are for each server. It should look like this:
Server Name - Missing Patch Count
Server 1 ---- 3
Server 2 ---- 4
Server 3 ---- 5
I can't seem to make nunique work, and value_counts returns a series, which isn't quite what I want. I managed to get df.drop_duplicates('Server Name')
to work to only display unique instances of server names, but I can't get it to sort the column by that data.
I realise there have been a lot of similar questions on here, but I am new to pandas and can't seem to apply the right syntax to my problem using the documentation - can anyone help?
You can groupby
'Server Name', call count
and then reset_index
to restore the server name column:
In [330]:
df.groupby('Server Name').count().reset_index()
Out[330]:
Server Name Missing Patch
0 Server 1 3
1 Server 2 4
2 Server 3 5
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.