How do I create a list of values in a column from several values from another column in a pandas dataframe?

I have a dataframe with these values:

filename, keyword, page
A, red, 1
A, red, 2
A, green, 1
B, red, 1
B, green, 1
C, green, 2

How can I transform this to the following format?

filename, keywords, pages
A, [red, green], [1,2]
B, [red, green], [1]
C, [green], [2]

Is there an easy way to do this in Pandas? If a list isn't allowed as a cell value, is there another datatype that I could use that Pandas would allow? Or an alternative to a Pandas dataframe that I could store this in and then save it to a csv?

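You could use df.groupby("filename")[["keyword", "page"]].agg(set). Note the double brackets: selecting several columns from a groupby with a bare tuple of names (['keyword','page']) was deprecated and raises an error in recent pandas versions. The result is: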

               keyword    page
filename
A         {green, red}  {1, 2}
B         {green, red}     {1}
C              {green}     {2}
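
For reference, here is a minimal runnable sketch of the set-based approach, rebuilding the question's sample data by hand. The variable names df and grouped are just illustrative. Keep in mind that sets are unordered, so the element order shown when printing may differ from run to run.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "filename": ["A", "A", "A", "B", "B", "C"],
    "keyword": ["red", "red", "green", "red", "green", "green"],
    "page": [1, 2, 1, 1, 1, 2],
})

# Double brackets select both columns; agg(set) collapses each group
# into a set, which drops duplicates but does not keep any order
grouped = df.groupby("filename")[["keyword", "page"]].agg(set)
print(grouped)
```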

(PS: updated based on Ch3steR's answer; I was originally using list instead of set.)
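
If you want actual lists in first-appearance order (as in the desired output in the question) and a file that reads back cleanly from CSV, something along these lines might work. This is only a sketch: the helper unique_list, the name csv_ready, and the ";" delimiter are my own choices, not anything pandas prescribes. Lists are allowed as cell values, but to_csv writes them as their string representation, so joining them into a delimited string first is usually easier to parse later.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "filename": ["A", "A", "A", "B", "B", "C"],
    "keyword": ["red", "red", "green", "red", "green", "green"],
    "page": [1, 2, 1, 1, 1, 2],
})

def unique_list(values):
    # Hypothetical helper: Series.unique() keeps first-appearance order,
    # and tolist() converts the result to a plain Python list
    return values.unique().tolist()

# Named aggregation gives the output columns the names from the question
result = (
    df.groupby("filename")
      .agg(keywords=("keyword", unique_list), pages=("page", unique_list))
      .reset_index()
)
print(result)

# For CSV, flatten each list into a ";"-separated string (delimiter is
# an arbitrary choice; pick one that cannot occur in your keywords)
csv_ready = result.assign(
    keywords=result["keywords"].str.join(";"),
    pages=result["pages"].apply(lambda p: ";".join(map(str, p))),
)
csv_text = csv_ready.to_csv(index=False)
print(csv_text)
```

When reading the file back, the columns can be split on the same delimiter, for example with the str.split(";") accessor method.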
