
How to change column names based on the first three characters of the column name

I want to use a dictionary to rename columns based on the first three characters of each column name.

This is the code I have so far:

new_names = {"aud":"alc_aud","whe":"clu_whe", "per":"pre_per",
                "pol":"cou_pol","spec":"coc_spec","dark":"daw_dark"}

for x,y in new_names.items():
    if df.columns.str.startswith(x):
       df.columns = df.columns.str.replace(x,y)

I get the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
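
The error comes from `df.columns.str.startswith(x)` returning one boolean per column rather than a single True/False, so `if` has no unambiguous value to test. A minimal sketch of the problem and one possible fix, assuming a small made-up frame (the column names below are illustrative only) and assuming the goal is to replace the whole column name, as in the expected output further down:

```python
import pandas as pd

# Illustrative frame; the column names are assumptions for this sketch.
df = pd.DataFrame(columns=["aud1", "whe_x", "C"])
new_names = {"aud": "alc_aud", "whe": "clu_whe"}

# startswith gives one boolean per column, not a single truth value,
# which is why using it directly in `if` raises ValueError.
mask = df.columns.str.startswith("aud")
print([bool(m) for m in mask])

# Reduce the mask explicitly with .any() before branching.
for prefix, new_name in new_names.items():
    mask = df.columns.str.startswith(prefix)
    if mask.any():
        # Rename only the matching columns; leave the others untouched.
        df.columns = [new_name if hit else col
                      for col, hit in zip(df.columns, mask)]

print(list(df.columns))
```

Looping over the dict works for a handful of prefixes, but a single lookup per column name is usually simpler.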

Using this sample data:

import pandas as pd

df = pd.DataFrame({'aud1':list('abcdef'),
                   'spe2':[4,5,4,5,5,4],
                   'C':[7,8,9,4,2,3],
                   'F':list('aaabbb')})

print (df)
  aud1   spe2  C  F
0    a      4  7  a
1    b      5  8  a
2    c      4  9  a
3    d      5  4  b
4    e      5  2  b
5    f      4  3  b

new_names = {"aud":"alc_aud","whe":"clu_whe", "per":"pre_per",
                "pol":"cou_pol","spec":"coc_spec","dark":"daw_dark"}

First, truncate the dictionary keys to their first 3 characters:

new_names = {k[:3] :v for k, v in new_names.items()}

print (new_names)
{'aud': 'alc_aud', 'whe': 'clu_whe', 'per': 'pre_per', 
     'pol': 'cou_pol', 'spe': 'coc_spec', 'dar': 'daw_dark'}

Then select the first 3 letters of each column name by indexing with str[:3], and replace them using the dict:

df.columns = df.columns.to_series().str[:3].replace(new_names)
print (df)
  alc_aud  coc_spec  C  F
0       a         4  7  a
1       b         5  8  a
2       c         4  9  a
3       d         5  4  b
4       e         5  2  b
5       f         4  3  b
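
One caveat with this approach, as far as I can tell: `str[:3]` shortens every column name, so a name that has more than three characters and no entry in the dict ends up truncated. A small sketch, using a hypothetical column `other` that is not part of the sample data, together with a `map` plus `fillna` variant that keeps unmatched names intact:

```python
import pandas as pd

new_names = {"aud": "alc_aud", "spe": "coc_spec"}

# "other" is a hypothetical unmatched column, added for illustration.
s = pd.Index(["aud1", "spe2", "other"]).to_series()

# Slice-then-replace: the unmatched name is cut down to "oth".
truncated = s.str[:3].replace(new_names).tolist()
print(truncated)

# Map the 3-letter prefix through the dict, then fall back to the full name.
kept = s.str[:3].map(new_names).fillna(s).tolist()
print(kept)
```

With the sample data this makes no difference, because the unmatched columns `C` and `F` are shorter than three characters.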

Another solution uses a list comprehension with dict.get, which returns the original name if nothing matches:

df.columns = [new_names.get(x[:3], x) for x in df.columns]
print (df)
  alc_aud  coc_spec  C  F
0       a         4  7  a
1       b         5  8  a
2       c         4  9  a
3       d         5  4  b
4       e         5  2  b
5       f         4  3  b
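
The same lookup can also be handed to `DataFrame.rename`, which accepts a function for `columns` and returns a new frame instead of assigning to `df.columns`. A minimal sketch, assuming the sample data and the 3-character dict from above; the helper name `by_prefix` is made up for this example:

```python
import pandas as pd

df = pd.DataFrame({'aud1': list('abcdef'),
                   'spe2': [4, 5, 4, 5, 5, 4],
                   'C': [7, 8, 9, 4, 2, 3],
                   'F': list('aaabbb')})

new_names = {'aud': 'alc_aud', 'whe': 'clu_whe', 'per': 'pre_per',
             'pol': 'cou_pol', 'spe': 'coc_spec', 'dar': 'daw_dark'}

def by_prefix(col):
    # Look up the first 3 characters; fall back to the unchanged name.
    return new_names.get(col[:3], col)

# rename returns a new DataFrame; the original df keeps its column names.
renamed = df.rename(columns=by_prefix)
print(list(renamed.columns))
```

This can be handy in a method chain, since the original frame is left untouched.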

EDIT: A solution that handles prefixes of any length:

df = pd.DataFrame({'aud1':list('abcdef'),
                   'specd2':[4,5,4,5,5,4],
                   'podfds':[7,8,9,4,2,3],
                   'aaper':list('aaabbb')})

print (df)
  aud1  specd2  podfds aaper
0    a       4       7     a
1    b       5       8     a
2    c       4       9     a
3    d       5       4     b
4    e       5       2     b
5    f       4       3     b

new_names = {"aud":"alc_aud","whe":"clu_whe", "per":"pre_per",
                "po":"cou_pol","spec":"coc_spec","dark":"daw_dark"}

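First extract the dict keys that match at the start of each column name, then map them through the dict, and finally use fillna to fill the unmatched values with the original names: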

pat = '|'.join([r'^{}'.format(x) for x in new_names])
s  = df.columns.to_series()
df.columns = s.str.extract('('+ pat + ')', expand=False).map(new_names).fillna(s)
print (df)
  alc_aud  coc_spec  cou_pol aaper
0       a         4        7     a
1       b         5        8     a
2       c         4        9     a
3       d         5        4     b
4       e         5        2     b
5       f         4        3     b
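
Two details of the pattern may matter with other dicts: keys containing regex metacharacters would need escaping, and when one key is a prefix of another (say `po` and `pol`), the alternation picks whichever comes first. A hedged sketch in plain Python, assuming longest-key-first is the desired rule; the function name `rename_by_prefix` and the extra `po`/`pol` entries are illustrative and not taken from the answer above:

```python
import re

def rename_by_prefix(names, mapping):
    # Nothing to match against: return the names unchanged.
    if not mapping:
        return list(names)
    # Longest keys first so "pol" is tried before "po"; escape each key
    # so characters like "." or "+" are matched literally.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    out = []
    for name in names:
        # re.match only matches at the start of the string.
        m = pattern.match(name)
        out.append(mapping[m.group(0)] if m else name)
    return out

mapping = {"aud": "alc_aud", "po": "cou_po", "pol": "cou_pol", "per": "pre_per"}
result = rename_by_prefix(["aud1", "podfds", "police", "aaper"], mapping)
print(result)
```

It could then be applied with something like `df.columns = rename_by_prefix(df.columns, new_names)`.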

