Rename multiple columns of a pandas DataFrame based on a condition

I have a df in which I need to rename 40 columns to an empty string. This can be done with .rename(), but then I have to list every column to be renamed in a dict. I am looking for a better way to rename columns by pattern matching: wherever a column name contains NULL or UNNAMED, replace it with an empty string.

df1: original df (the actual df has about 20 columns named NULL1-NULL20 and 20 columns named UNNAMED1-UNNAMED20)

    NULL1   NULL2   C1  C2  UNNAMED1    UNNAMED2
0   1   11  21  31  41  51
1   2   22  22  32  42  52
2   3   33  23  33  43  53
3   4   44  24  34  44  54

Desired output df:

            C1  C2      
0   1   11  21  31  41  51
1   2   22  22  32  42  52
2   3   33  23  33  43  53
3   4   44  24  34  44  54

This can be achieved by:

df.rename(columns={'NULL1':'', 'NULL2':'', 'UNNAMED1':'', 'UNNAMED2':''}, inplace=True)

But I don't want to write out a long dictionary of 40 elements.

It is possible, but be careful: the column names become duplicated, so selecting one empty-named column returns all of them:

print (df[''])

0  1  11  41  51
1  2  22  42  52
2  3  33  43  53
3  4  44  44  54
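
A minimal, self-contained sketch of this caveat, rebuilt from the sample data in the question. The variable names `empty_named` and `first_col` are only for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "NULL1": [1, 2, 3, 4],
    "NULL2": [11, 22, 33, 44],
    "C1": [21, 22, 23, 24],
    "C2": [31, 32, 33, 34],
    "UNNAMED1": [41, 42, 43, 44],
    "UNNAMED2": [51, 52, 53, 54],
})

# Blank out the NULL*/UNNAMED* names; four columns now share the name ''.
df.columns = ["" if c.startswith(("NULL", "UNNAMED")) else c for c in df.columns]

# Selecting '' returns a DataFrame holding all four columns, not one Series.
empty_named = df[""]
print(type(empty_named).__name__, empty_named.shape)

# Positional access still reaches one specific column unambiguously.
first_col = df.iloc[:, 0]
```

Once names are duplicated, positional access with `iloc` is the dependable way to address a single one of those columns.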

Use startswith with a tuple of prefixes in a list comprehension to rename every matching column:

df.columns = ['' if c.startswith(('NULL','UNNAMED')) else c for c in df.columns]

Your rename solution can be changed to build the dictionary automatically:

d = dict.fromkeys(df.columns[df.columns.str.startswith(('NULL','UNNAMED'))], '')
print (d)
{'NULL1': '', 'NULL2': '', 'UNNAMED1': '', 'UNNAMED2': ''}
df = df.rename(columns=d)
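
Since the question asks for pattern matching, the same dictionary can also be built from a regular expression. This is a sketch under the assumption that the unwanted names are exactly NULL or UNNAMED followed by digits; the pattern and the names `mask`, `mapping` and `renamed` are illustrative, not from the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 11, 21, 31, 41, 51], [2, 22, 22, 32, 42, 52]],
    columns=["NULL1", "NULL2", "C1", "C2", "UNNAMED1", "UNNAMED2"],
)

# Boolean mask: names that are NULL or UNNAMED followed only by digits.
# str.match anchors at the start of each name; '$' anchors the end.
mask = df.columns.str.match(r"(?:NULL|UNNAMED)\d+$")

# Build the rename mapping from the matching names only.
mapping = dict.fromkeys(df.columns[mask], "")
renamed = df.rename(columns=mapping)
print(list(renamed.columns))
```

A stricter pattern like this avoids renaming a column that merely starts with the same letters, such as a hypothetical `NULLABLE_FLAG`.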

If you want to stick with rename:

def renaming_fun(x):
    if "NULL" in x or "UNNAMED" in x:
        return "" # or None
    return x

df = df.rename(columns=renaming_fun)

This can be handy if the mapping function gets more complex. Otherwise, a list comprehension will do:

df.columns = [renaming_fun(col) for col in df.columns]

Another possibility:

df.columns = map(renaming_fun, df.columns)

But as already mentioned, renaming columns to empty strings is not something you would usually do.

If only a few columns need to keep their names, use a list comprehension such as one of the following:

df.columns = [col if col in ('C1','C2') else "" for col in df.columns]
df.columns = [col if "NULL" not in col else "" for col in df.columns]

This should work because you can change the column names by assigning a list to the DataFrame's columns attribute.

You can use a dict comprehension inside df.rename():

import numpy as np

# dtype=int keeps the index array usable even when no column matches
idx_filter = np.asarray([i for i, col in enumerate(df.columns) if SOME_STRING_CONDITION in col], dtype=int)
df.rename(columns={col: '' for col in df.columns[idx_filter]}, inplace=True)

In your case, SOME_STRING_CONDITION would be 'NULL' or 'UNNAMED' (run the rename once for each string, since the check tests a single substring).

I figured this out while looking for help with a problem of my own on a thread about a more generic column renaming issue (Renaming columns in pandas). I didn't have enough reputation to add my solution there as an answer or comment (I'm new-ish on Stack Overflow), so I am posting it here!

This solution is also helpful if you need to keep part of the string that you were filtering for. For example, if you wanted to replace the "C" in the "C" columns with "col_":

idx_filter = np.asarray([i for i, col in enumerate(df.columns) if 'C' in col])
df.rename(columns={col: col.replace('C', 'col_') for col in df.columns[idx_filter]}, inplace=True)
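
Note that `'C' in col` matches the letter anywhere in a name, and `col.replace` rewrites every occurrence of it. If only a leading "C" should change, one possible variant anchors a regular expression at the start of the name. This is a sketch using the sample columns from the question; the variable `new_names` is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 11, 21, 31, 41, 51]],
    columns=["NULL1", "NULL2", "C1", "C2", "UNNAMED1", "UNNAMED2"],
)

# Replace only a leading 'C'; regex=True is required for the '^' anchor.
new_names = df.columns.str.replace(r"^C", "col_", regex=True)
df.columns = new_names
print(list(df.columns))
```

No index array or NumPy import is needed here, because the string accessor works on every column name at once and leaves non-matching names unchanged.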
