pandas DataFrame with comma-separated string entries: reduce each entry to its unique comma-separated values

I have a pandas DataFrame like this:

import pandas as pd
data = [["a,a,a", "b,b", "c,c,c"], ["d,d","e","fd"],["g,h,i", "g", "fg,h,a"]]
df = pd.DataFrame(data, columns = ["ColA","ColB","ColC"])

df

    ColA    ColB    ColC
0   a,a,a   b,b     c,c,c
1   d,d     e       fd
2   g,h,i   g       fg,h,a

I would like to reformat this table to:

    ColA    ColB    ColC
0   a       b       c
1   d       e       fd
2   g,h,i   g       fg,h,a

In other words, each cell should keep only its unique values after the string is split on commas.

df.applymap(lambda elements: ','.join(set(elements.split(','))))

applymap() applies a function to every element (cell) of a DataFrame. The lambda here first splits each string on ",", then builds a set from the resulting pieces, and finally joins them back into one string with the string method .join().
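One caveat with the set-based approach: a Python set has no guaranteed order, so a cell such as "g,h,i" may come back as "h,i,g". If the original order matters, as it does in the desired output above, a minimal sketch along these lines may help. It assumes every cell is a plain string; the helper name unique_csv is only illustrative and not part of pandas. It also uses apply with Series.map column by column, since applymap is, as far as I know, deprecated in pandas 2.1 and later in favour of DataFrame.map.

```python
import pandas as pd


def unique_csv(cell: str) -> str:
    """Drop duplicate comma-separated values, keeping first-seen order."""
    # dict.fromkeys preserves insertion order (Python 3.7+), unlike set().
    return ",".join(dict.fromkeys(cell.split(",")))


data = [["a,a,a", "b,b", "c,c,c"], ["d,d", "e", "fd"], ["g,h,i", "g", "fg,h,a"]]
df = pd.DataFrame(data, columns=["ColA", "ColB", "ColC"])

# Apply the helper to every cell, one column (Series) at a time.
result = df.apply(lambda col: col.map(unique_csv))
print(result)
```

With the sample data, the cell "g,h,i" stays "g,h,i" and "fg,h,a" stays "fg,h,a", because nothing is reordered.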
