I have a dataframe with these values:
filename, keyword, page
A, red, 1
A, red, 2
A, green, 1
B, red, 1
B, green, 1
C, green, 2
How can I transform this to the following format?
filename, keywords, pages
A, [red, green], [1,2]
B, [red, green], [1]
C, [green], [2]
Is there an easy way to do this in Pandas? If a list isn't allowed as a cell value, is there another datatype that I could use that Pandas would allow? Or an alternative to a Pandas dataframe that I could store this in and then save it to a csv?
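You could use df.groupby("filename")[["keyword", "page"]].agg(set). Note the double brackets: selecting several columns from a groupby needs a list, and the older ['keyword','page'] tuple form is no longer accepted by recent pandas versions. This gives: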
               keyword    page
filename
A         {green, red}  {1, 2}
B         {green, red}     {1}
C              {green}     {2}
(PS: updated based on Ch3steR's answer; I was originally using list instead of set, which keeps the duplicates.)
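Since a set does not keep the original order, here is a minimal sketch of a variant that returns de-duplicated lists in first-seen order, matching the layout asked for in the question. It also shows one way to write the result to a CSV. The helper name unique_in_order and the ";" separator are my own assumptions for illustration, not anything pandas prescribes. Lists are allowed as cell values (the column simply becomes object dtype), but to_csv would write them as their string representation, so joining them into a delimited string first tends to give a cleaner file.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "filename": ["A", "A", "A", "B", "B", "C"],
    "keyword": ["red", "red", "green", "red", "green", "green"],
    "page": [1, 2, 1, 1, 1, 2],
})


def unique_in_order(values):
    """Hypothetical helper: drop duplicates but keep first-seen order."""
    return list(dict.fromkeys(values))


# Named aggregation: one output column per (input column, function) pair
grouped = (
    df.groupby("filename")
      .agg(
          keywords=("keyword", unique_in_order),
          pages=("page", unique_in_order),
      )
      .reset_index()
)

# For CSV output, flatten each list into a ";"-separated string
# (the separator is an arbitrary choice; pick one absent from your data)
csv_ready = grouped.assign(
    keywords=grouped["keywords"].map(";".join),
    pages=grouped["pages"].map(lambda ps: ";".join(str(p) for p in ps)),
)
csv_text = csv_ready.to_csv(index=False)
print(csv_text)
```

When reading the file back, the columns can be turned into lists again with something like df["keywords"].str.split(";").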