Iterating over string columns and sorting the cell values in Pandas

Question

Suppose we have the following dataframe:

import pandas as pd

d = {'col1': ['cat; banana', 'kiwi; orange; apple', 'melon'],
     'col2': ['a; d; c', 'p; u; c', 'm; a'],
     'col3': [4, 1, 4]}
df = pd.DataFrame(d)

For all the string columns, I want to sort the values inside each cell alphabetically. I know how to do this column by column, namely:

df['col1'] = df['col1'].map(lambda x: '; '.join(sorted(x.split('; '))))

and similarly for col2. How can this be done for the whole dataframe? I tried selecting the string columns and calling the map method on them, but it didn't work. Namely:

df.select_dtypes(include='object').map(lambda x: '; '.join(sorted(x.split('; '))))

Update: So an inefficient way of doing this would be:

v = df.select_dtypes(include='object').applymap(lambda x: '; '.join(sorted(x.split('; '))))
w = df.select_dtypes(exclude='object')
pd.concat([v, w], axis=1)

But I am sure there are better ways.

Answer 1

I would do this in an inefficient for loop, with a test to make sure that you are not applying it to the ints:

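for col in df.columns:
    # Only process string columns; `dtypes is 'str'` never matches, so test the dtype with pandas' helper
    if pd.api.types.is_string_dtype(df[col]):
        df[col] = df[col].map(lambda x: '; '.join(sorted(x.split('; '))))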

There may be a better vectorized way.
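For reference, the loop idea can be sketched end to end as below. This is only a minimal sketch: sort_cell is a hypothetical helper name introduced here, and it assumes every string cell uses '; ' as the separator and that the string columns contain no missing values.

```python
import pandas as pd
from pandas.api.types import is_string_dtype


def sort_cell(cell, sep="; "):
    """Hypothetical helper: sort the sep-delimited items of one cell."""
    return sep.join(sorted(cell.split(sep)))


df = pd.DataFrame({
    "col1": ["cat; banana", "kiwi; orange; apple", "melon"],
    "col2": ["a; d; c", "p; u; c", "m; a"],
    "col3": [4, 1, 4],
})

for col in df.columns:
    # Only touch columns that hold strings; col3 (integers) is skipped.
    if is_string_dtype(df[col]):
        df[col] = df[col].map(sort_cell)

print(df)
```

Testing the dtype with is_string_dtype rather than comparing against a literal keeps the check independent of how a given pandas version labels string columns.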

Answer 2

You can use this trick (unpacking a dataframe into keyword arguments and using pd.DataFrame.assign):

df.assign(**df.select_dtypes(include='object').applymap(lambda x: '; '.join(sorted(x.split('; ')))))

Output:

                  col1     col2  col3
0          banana; cat  a; c; d     4
1  apple; kiwi; orange  c; p; u     1
2                melon     a; m     4
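To make the unpacking step more explicit, here is a hedged variant that builds the keyword arguments as a plain dict instead of unpacking a dataframe. It avoids applymap, which, as far as I know, is deprecated in favour of DataFrame.map from pandas 2.1 onward. The names sort_cell and sorted_cols are made up for this sketch, and it assumes the column names are strings, since they become keyword arguments.

```python
import pandas as pd
from pandas.api.types import is_string_dtype


def sort_cell(cell, sep="; "):
    """Hypothetical helper: sort the sep-delimited items of one cell."""
    return sep.join(sorted(cell.split(sep)))


df = pd.DataFrame({
    "col1": ["cat; banana", "kiwi; orange; apple", "melon"],
    "col2": ["a; d; c", "p; u; c", "m; a"],
    "col3": [4, 1, 4],
})

# Build {column name: sorted column} for the string columns only.
sorted_cols = {
    col: df[col].map(sort_cell)
    for col in df.columns
    if is_string_dtype(df[col])
}

# assign() returns a new DataFrame; df itself is left unchanged.
result = df.assign(**sorted_cols)
print(result)
```

Because assign replaces existing columns in place, the column order of the original dataframe is kept, unlike the concat approach from the question's update.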
