For a pandas DataFrame, is there a way to display rows of the same category together as one group while retaining all the other values as strings?
Assume I have the following DataFrame:
import pandas as pd

df = pd.DataFrame({"category": ['Associates', 'Manager', 'Associates', 'Associates', 'Engineer', 'Engineer', 'Manager', 'Engineer'],
                   "name": ['Abby', 'Jenny', 'Thomas', 'John', 'Eve', 'Danny', 'Kenny', 'Helen'],
                   "email": ['Abby@email.com', 'Jenny@email.com', 'Thomas@email.com', 'John@email.com', 'Eve@email.com', 'Danny@email.com', 'Kenny@email.com', 'Helen@email.com']})
How can I display the DataFrame in this way?
Output:
category    name    email
Associates  Abby    Abby@email.com
            Thomas  Thomas@email.com
            John    John@email.com
Manager     Jenny   Jenny@email.com
            Kenny   Kenny@email.com
Engineer    Eve     Eve@email.com
            Danny   Danny@email.com
            Helen   Helen@email.com
Any advice? Can it be done with groupby functions? Thanks!
It's not really clear to me what you mean by "display". To get a printout similar to (though not exactly like) the one you are showing, you don't need .groupby(). Just do
df = df.set_index(["category", "name"]).sort_index()
and get
                              email
category   name
Associates Abby      Abby@email.com
           John      John@email.com
           Thomas  Thomas@email.com
Engineer   Danny    Danny@email.com
           Eve        Eve@email.com
           Helen    Helen@email.com
Manager    Jenny    Jenny@email.com
           Kenny    Kenny@email.com
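A minimal sketch of why this works, for anyone worried about lost data: the blank cells in the printout above are only how pandas renders repeated outer labels of a MultiIndex. The category is still stored for every row. The variable name indexed is just for illustration, and the display.multi_sparse option is used here only to show the unsparsified view.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ['Associates', 'Manager', 'Associates', 'Associates',
                 'Engineer', 'Engineer', 'Manager', 'Engineer'],
    "name": ['Abby', 'Jenny', 'Thomas', 'John', 'Eve', 'Danny', 'Kenny', 'Helen'],
    "email": ['Abby@email.com', 'Jenny@email.com', 'Thomas@email.com',
              'John@email.com', 'Eve@email.com', 'Danny@email.com',
              'Kenny@email.com', 'Helen@email.com'],
})

# Same step as in the answer: move both columns into a sorted MultiIndex.
indexed = df.set_index(["category", "name"]).sort_index()

# Default rendering hides repeated outer labels (display only).
print(indexed)

# The underlying index still holds one category label per row.
categories = indexed.index.get_level_values("category").tolist()
print(categories)

# Temporarily turn sparsification off to print every label.
with pd.option_context("display.multi_sparse", False):
    print(indexed)
```

Because nothing is overwritten, lookups such as indexed.loc["Engineer"] keep working after this step.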
If you really want to modify the columns, then you could try something like
df = df.sort_values(["category", "name"], ignore_index=True)
df.loc[df["category"] == df["category"].shift(), "category"] = ""
to get
     category    name             email
0  Associates    Abby    Abby@email.com
1                John    John@email.com
2              Thomas  Thomas@email.com
3    Engineer   Danny   Danny@email.com
4                 Eve     Eve@email.com
5               Helen   Helen@email.com
6     Manager   Jenny   Jenny@email.com
7               Kenny   Kenny@email.com
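Note that the sorted result orders categories and names alphabetically, whereas the output in the question keeps categories in order of first appearance (Associates, Manager, Engineer) and names in their original order. The following is one possible sketch of a variant that preserves that order. It is not part of the answer above; it assumes an ordered Categorical plus a stable sort is acceptable, and the names first_seen and out are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ['Associates', 'Manager', 'Associates', 'Associates',
                 'Engineer', 'Engineer', 'Manager', 'Engineer'],
    "name": ['Abby', 'Jenny', 'Thomas', 'John', 'Eve', 'Danny', 'Kenny', 'Helen'],
    "email": ['Abby@email.com', 'Jenny@email.com', 'Thomas@email.com',
              'John@email.com', 'Eve@email.com', 'Danny@email.com',
              'Kenny@email.com', 'Helen@email.com'],
})

# Categories in the order they first appear in the data.
first_seen = df["category"].drop_duplicates().tolist()

out = df.copy()
# An ordered Categorical makes sort_values follow first_seen, not the alphabet.
out["category"] = pd.Categorical(out["category"], categories=first_seen, ordered=True)
# mergesort is stable, so rows keep their original order within each category.
out = out.sort_values("category", kind="mergesort", ignore_index=True)

# Back to plain strings, then blank every repeat of a category label.
out["category"] = out["category"].astype(str)
out.loc[out["category"].duplicated(), "category"] = ""

print(out.to_string(index=False))
```

As with the approach above, this overwrites the category values, so it is best applied to a copy used only for display.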
This takes two lines of code. First, set both category and name as the index:
df.set_index(['category','name'],inplace=True)
Next, use groupby(...).sum() on both index levels to get your desired output:
df.groupby(level=[0,1]).sum()
Out[67]:
                              email
category   name
Associates Abby      Abby@email.com
           John      John@email.com
           Thomas  Thomas@email.com
Engineer   Danny    Danny@email.com
           Eve        Eve@email.com
           Helen    Helen@email.com
Manager    Jenny    Jenny@email.com
           Kenny    Kenny@email.com
You can use the groupby() function for this. Sample code is shown below.
df.groupby(['category','name']).max()
Now the data is indexed by category and name and prints in the format you mentioned. If you want to remove the index, use the code below:
df.groupby(['category','name']).max().reset_index()
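For comparison, here is a small sketch suggesting that on this particular data the groupby version and the plain set_index/sort_index version produce the same rows. This holds because every (category, name) pair occurs exactly once, so max() has nothing to aggregate. The names by_index, by_group and flat are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ['Associates', 'Manager', 'Associates', 'Associates',
                 'Engineer', 'Engineer', 'Manager', 'Engineer'],
    "name": ['Abby', 'Jenny', 'Thomas', 'John', 'Eve', 'Danny', 'Kenny', 'Helen'],
    "email": ['Abby@email.com', 'Jenny@email.com', 'Thomas@email.com',
              'John@email.com', 'Eve@email.com', 'Danny@email.com',
              'Kenny@email.com', 'Helen@email.com'],
})

# Index-only approach: no aggregation involved.
by_index = df.set_index(["category", "name"]).sort_index()

# groupby approach: one group per (category, name) pair, reduced with max().
by_group = df.groupby(["category", "name"]).max()

# Both yield the same sorted MultiIndex and the same email values here.
print(by_index.index.tolist() == by_group.index.tolist())
print(by_index["email"].tolist() == by_group["email"].tolist())

# reset_index() turns the index levels back into ordinary columns.
flat = by_group.reset_index()
print(flat)
```

If a (category, name) pair appeared more than once, max() would collapse those rows into one, while set_index would keep them all, so the two approaches are only interchangeable when the pairs are unique.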