
Rename Multiple pandas DataFrame Column Names using a function

I am trying to rename pandas DataFrame columns using a regex. I know how to rename the list as shown below, but I could not get it to work with df.rename.

Input:

df.columns.values = ['Time', '101 <RoomTemperature> (C)', '102 <ChemberTemperature> (C)', '103 <U1100> (C)', '103 <U1200 (C)', '103 U1500> (C)']

I tried renaming the DataFrame columns with a regex as in the code below, but it does not work. I could not work out how to combine multiple instructions in the df.rename method.

df.rename(columns={c: c.strip() for c in df.columns.values.tolist()
                                if "<" and ">" in c: 
                  re.search(r"(?<=<).*(?=>)",c).group(0)}, inplace=True)

I want it to follow the regex and rename the columns as shown below:

df.columns.values = ["Time", "RoomTemperature", "ChemberTemperature", "U1100", "103 <U1200 (C)", "103 U1500> (C)"]

You could extract the functionality into a function and do the following:

import re
import pandas as pd

# sample data 
df = pd.DataFrame(
    columns=['Time', '101 <RoomTemperature> (C)', '102 <ChemberTemperature> (C)', '103 <U1100> (C)', '103 <U1200 (C)',
             '103 U1500> (C)'])


# replacement function 
def repl(name):
    match = re.search(r"<(.*?)>", name)
    return match.group(1) if match else name


df.rename(columns={c: repl(c.strip()) for c in df.columns}, inplace=True)

print(df.columns)

Output:

Index(['Time', 'RoomTemperature', 'ChemberTemperature', 'U1100',
       '103 <U1200 (C)', '103 U1500> (C)'],
      dtype='object')

That being said, you also need to fix your regular expression.
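
To make the failure in the original attempt concrete, here is a minimal sketch (the variable names are illustrative, not from the question). The expression `"<" and ">" in c` is parsed as `"<" and (">" in c)`; because the non-empty string `"<"` is always truthy, it only tests for `">"`. A label such as '103 U1500> (C)' therefore passes the check even though the pattern finds nothing, and calling `.group(0)` on the resulting `None` raises `AttributeError`. Separately, a comprehension's `if` filter cannot carry a colon and a body, so the original snippet is also a syntax error.

```python
import re

pattern = re.compile(r"(?<=<).*(?=>)")
name = "103 U1500> (C)"

# The non-empty string "<" is always truthy, so this only tests ">" in name.
loose_check = bool("<" and ">" in name)

# Testing both characters explicitly gives the intended result.
strict_check = "<" in name and ">" in name

# No "<" precedes any position, so the lookbehind never succeeds.
match = pattern.search(name)

print(loose_check, strict_check, match)
```

Checking the match object itself (`if match`), as the function above does, avoids the need for a separate character test.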

You can use regular expressions to extract the matching group as per your requirements, and then use DataFrame.rename to alter the column labels.

Try this:

import re

col_dict = {}
for col in df.columns:
    mobj = re.search(r"\<(.*?)\>", col)
    if mobj:
        col_dict[col] = mobj.group(1)

df.rename(columns=col_dict, inplace=True)

After renaming, df.columns will be:

['Time', 'RoomTemperature', 'ChemberTemperature', 'U1100', '103 <U1200 (C)', '103 U1500> (C)']
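
As a variation, `DataFrame.rename` also accepts a callable for `columns`, so the intermediate dictionary is optional. The sketch below assumes a shortened set of the sample labels, and `extract_label` is a hypothetical helper name chosen for illustration.

```python
import re
import pandas as pd

# A shortened version of the sample labels from the question
df = pd.DataFrame(columns=["Time", "101 <RoomTemperature> (C)", "103 <U1200 (C)"])


def extract_label(name):
    """Return the text between the first <...> pair, or the name unchanged."""
    match = re.search(r"<(.*?)>", name)
    return match.group(1) if match else name


# rename applies the function to every column label and returns a new
# DataFrame; df itself is left untouched because inplace is not set.
renamed = df.rename(columns=extract_label)

print(list(renamed.columns))
```

Labels without a complete `<...>` pair fall through unchanged, which matches the desired output in the question.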

Another solution. Regex can look unfriendly, despite its power:

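import re
import pandas as pd

columns = ['Time', '101 <RoomTemperature> (C)', '102 <ChemberTemperature> (C)', '103 <U1100> (C)', '103 <U1200 (C)', '103 U1500> (C)']
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=columns)

# lookbehind/lookahead match the text between "<" and ">" without the brackets
p = re.compile(r'((?<=<).*?(?=>))')

# create a dict for the replacement; only labels that match are included
replace_dict = {w: p.search(w).group() for w in df.columns if p.search(w)}

# pass dictionary into rename method; rename returns a new DataFrame, so assign it back
df = df.rename(columns=replace_dict)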
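
For comparison, the same result can be sketched without a Python-level loop by using the vectorised string methods on the column index. This is an alternative technique, not one taken from the answers above: the pattern is anchored so that a label is replaced as a whole by its captured group only when a complete `<...>` pair is present.

```python
import pandas as pd

columns = ["Time", "101 <RoomTemperature> (C)", "103 <U1200 (C)", "103 U1500> (C)"]
df = pd.DataFrame([[1, 2, 3, 4]], columns=columns)

# If the whole label matches, it is replaced by the text inside <...>;
# labels that do not match are left as they are.
df.columns = df.columns.str.replace(r"^.*<(.*?)>.*$", r"\1", regex=True)

print(list(df.columns))
```

Whether this reads better than an explicit function is a matter of taste; the function-based versions are easier to extend if the renaming rules grow.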
