How to rename columns based on the first three characters of the column name

Question

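I want to use a dictionary to rename columns based on the first three characters of each column name.

This is the code I have so far: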

new_names = {"aud":"alc_aud","whe":"clu_whe", "per":"pre_per",
                "pol":"cou_pol","spec":"coc_spec","dark":"daw_dark"}

for x,y in new_names.items():
    if df.columns.str.startswith(x):
       df.columns = df.columns.str.replace(x,y)

I get the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Answer 1
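
Some background on the error first: df.columns.str.startswith(x) returns one boolean per column, and an if statement cannot reduce an array of several booleans to a single True or False. A minimal sketch of this, using an empty frame with the same column names as the sample data below (the variable names mask and has_aud_prefix are only illustrative):

```python
import pandas as pd

# Empty frame; only the column labels matter here
df = pd.DataFrame(columns=["aud1", "spe2", "C", "F"])

# startswith is evaluated per column, so the result has four entries
mask = df.columns.str.startswith("aud")
print(mask.tolist())  # [True, False, False, False]

# Reducing the array explicitly gives a single, unambiguous truth value
has_aud_prefix = bool(mask.any())
print(has_aud_prefix)  # True
```

Calling .any() would silence the error, but the loop in the question would still rewrite every column on each pass, so the solutions below rename per column instead.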

Sample data and the dictionary from the question:

import pandas as pd

df = pd.DataFrame({'aud1':list('abcdef'),
                   'spe2':[4,5,4,5,5,4],
                   'C':[7,8,9,4,2,3],
                   'F':list('aaabbb')})

print (df)
  aud1   spe2  C  F
0    a      4  7  a
1    b      5  8  a
2    c      4  9  a
3    d      5  4  b
4    e      5  2  b
5    f      4  3  b

new_names = {"aud":"alc_aud","whe":"clu_whe", "per":"pre_per",
                "pol":"cou_pol","spec":"coc_spec","dark":"daw_dark"}

First, truncate the dictionary keys to their first 3 characters:

new_names = {k[:3] :v for k, v in new_names.items()}

print (new_names)
{'aud': 'alc_aud', 'whe': 'clu_whe', 'per': 'pre_per', 
     'pol': 'cou_pol', 'spe': 'coc_spec', 'dar': 'daw_dark'}

Then select the first 3 letters of each column name with str[:3] and pass the dictionary to replace:

df.columns = df.columns.to_series().str[:3].replace(new_names)
print (df)
  alc_aud  coc_spec  C  F
0       a         4  7  a
1       b         5  8  a
2       c         4  9  a
3       d         5  4  b
4       e         5  2  b
5       f         4  3  b
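
One thing worth noting about this approach: str[:3] shortens every column name before replace runs, so a name that matches no key comes back truncated. The sample columns C and F are too short to show this. A small sketch, assuming a hypothetical column called country that is not part of the original data:

```python
import pandas as pd

new_names = {"aud": "alc_aud", "spe": "coc_spec"}
# "country" is a made-up column that matches no key
df = pd.DataFrame(columns=["aud1", "spe2", "country"])

# Every name is cut to 3 characters first, then looked up in the dict
renamed = df.columns.to_series().str[:3].replace(new_names)
print(renamed.tolist())  # ['alc_aud', 'coc_spec', 'cou']
```

If unmatched names have to stay intact, the dict.get solution that follows is probably the safer choice.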

Another solution uses a list comprehension with dict.get, which returns the original name when no key matches:

df.columns = [new_names.get(x[:3], x) for x in df.columns]
print (df)
  alc_aud  coc_spec  C  F
0       a         4  7  a
1       b         5  8  a
2       c         4  9  a
3       d         5  4  b
4       e         5  2  b
5       f         4  3  b
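
The same lookup can also be passed to DataFrame.rename as a function, which returns a renamed copy and leaves the original frame untouched. This is a variation on the list comprehension, not part of the original answer, and the one-row data is only for illustration:

```python
import pandas as pd

new_names = {"aud": "alc_aud", "spe": "coc_spec"}
df = pd.DataFrame({"aud1": [1], "spe2": [2], "C": [3], "F": [4]})

# rename calls the function once per column label
renamed = df.rename(columns=lambda name: new_names.get(name[:3], name))
print(list(renamed.columns))  # ['alc_aud', 'coc_spec', 'C', 'F']
print(list(df.columns))       # ['aud1', 'spe2', 'C', 'F']
```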

Edit: a solution that handles prefixes of any length:

df = pd.DataFrame({'aud1':list('abcdef'),
                   'specd2':[4,5,4,5,5,4],
                   'podfds':[7,8,9,4,2,3],
                   'aaper':list('aaabbb')})

print (df)
  aud1  specd2  podfds aaper
0    a       4       7     a
1    b       5       8     a
2    c       4       9     a
3    d       5       4     b
4    e       5       2     b
5    f       4       3     b

new_names = {"aud":"alc_aud","whe":"clu_whe", "per":"pre_per",
                "po":"cou_pol","spec":"coc_spec","dark":"daw_dark"}

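First use str.extract to pull out the prefix that matches any of the dict keys, then map it through the dictionary, and finally use fillna to restore the names that did not match: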

pat = '|'.join([r'^{}'.format(x) for x in new_names])
s  = df.columns.to_series()
df.columns = s.str.extract('('+ pat + ')', expand=False).map(new_names).fillna(s)
print (df)
  alc_aud  coc_spec  cou_pol aaper
0       a         4        7     a
1       b         5        8     a
2       c         4        9     a
3       d         5        4     b
4       e         5        2     b
5       f         4        3     b
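
Two edge cases may be worth guarding against with this pattern: keys that contain regex metacharacters, and keys where one is a prefix of another (the alternation takes the first alternative that matches, in dict order). A hedged sketch using re.escape and a longest-first sort; the keys and column names here are hypothetical and chosen only to trigger both cases:

```python
import re
import pandas as pd

# Hypothetical keys: "po" and "pol" overlap, and "a.b" contains a metacharacter
new_names = {"po": "cou_po", "pol": "cou_pol", "a.b": "dot_ab"}
df = pd.DataFrame(columns=["polx", "pony", "a.b1", "aXb1"])

# Longest keys first so "pol" is tried before "po"; re.escape keeps "." literal
keys = sorted(new_names, key=len, reverse=True)
pat = "^(" + "|".join(re.escape(k) for k in keys) + ")"

s = df.columns.to_series()
renamed = s.str.extract(pat, expand=False).map(new_names).fillna(s)
print(renamed.tolist())  # ['cou_pol', 'cou_po', 'dot_ab', 'aXb1']
```

Without re.escape, the key a.b would also match aXb1, because an unescaped dot matches any character.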

Solution 1: accepted, 2018-09-02 17:59:54
