pandas dataframe with comma separated string entries, change to unique comma separated entries
I have a pandas dataframe as such:
import pandas as pd
data = [["a,a,a", "b,b", "c,c,c"], ["d,d","e","fd"],["g,h,i", "g", "fg,h,a"]]
df = pd.DataFrame(data, columns = ["ColA","ColB","ColC"])
df
ColA ColB ColC
0 a,a,a b,b c,c,c
1 d,d e fd
2 g,h,i g fg,h,a
I would like to reformat this table to:
ColA ColB ColC
0 a b c
1 d e fd
2 g,h,i g fg,h,a
So each cell should keep only its unique entries after splitting the string on commas.
df.applymap(lambda elements: ','.join(set(elements.split(','))))
applymap() applies a function to every element (cell) of a dataframe. The lambda function here first splits each string on ",", then creates a set of the resulting elements, and finally concatenates them back into one string with the string .join() method. Note that a set is unordered, so the order of the entries within a cell (for example "g,h,i") is not guaranteed to be kept. Also, applymap() is deprecated since pandas 2.1 in favor of DataFrame.map().
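Since a set does not keep the original order of the entries, a variation that preserves first-seen order may be closer to the desired output. The following is only a minimal sketch under that assumption: the helper name unique_in_order is made up for illustration, and it swaps set() for dict.fromkeys(), which keeps insertion order. It also goes column by column with Series.map() instead of applymap(), so it should run on both older and newer pandas versions.

```python
import pandas as pd

data = [["a,a,a", "b,b", "c,c,c"], ["d,d", "e", "fd"], ["g,h,i", "g", "fg,h,a"]]
df = pd.DataFrame(data, columns=["ColA", "ColB", "ColC"])


def unique_in_order(cell):
    # Hypothetical helper: split on commas, drop repeats, keep first-seen order.
    # dict.fromkeys() keeps only the first occurrence of each key, in insertion order.
    return ",".join(dict.fromkeys(cell.split(",")))


# Apply the helper to every cell, one column (Series) at a time.
result = df.apply(lambda column: column.map(unique_in_order))
print(result)
```

This assumes every cell is a string; cells holding NaN or numbers would need to be handled separately before calling .split().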