
Can I give the 2 subsequent columns the same name as the previous one? (Python)

I have a dataset in Excel in which some columns are unnamed: A_ColumnWithName, Unnamed1, Unnamed2, B_ColumnWithName, Unnamed3, Unnamed4

I need to give each currently unnamed column the name of the nearest named column to its left, so my columns should look like this:

A_ColumnWithName, A_ColumnWithName, A_ColumnWithName, B_ColumnWithName, B_ColumnWithName, B_ColumnWithName

Any hints on how I can do this in Python? Importantly, there are a great many such columns, so it needs to be done as automatically as possible.

You can convert the column index to a Series, use it to mask the names that contain "Unnamed", and ffill the previous valid name:

cols = df.columns.to_series()
df.columns = cols.mask(cols.str.contains('Unnamed')).ffill()

Note, however, that having duplicate column names is discouraged.
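To illustrate why duplicate names can be awkward, here is a minimal sketch. The small frame and the counter-suffix scheme (A_0, A_1, ...) are assumptions made for illustration and are not part of the answer above; whether unique suffixed names suit your data is a judgment call.

```python
import pandas as pd

# Hypothetical frame with a duplicated label.
df = pd.DataFrame([[1, 2, 3]], columns=["A", "A", "B"])

# With duplicate labels, df["A"] is a two-column DataFrame, not a Series.
selected = df["A"]
print(type(selected).__name__, selected.shape)

# Possible alternative: keep the group name but append a per-group counter
# so that every label stays unique.
names = pd.Series(list(df.columns))
counter = names.groupby(names).cumcount()
unique_names = [f"{name}_{i}" for name, i in zip(names, counter)]
df.columns = unique_names
print(list(df.columns))
```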

Example input:

   A Unnamed1 Unnamed2  B Unnamed3  C
0  x        x        x  x        x  x

Output:

   A  A  A  B  B  C
0  x  x  x  x  x  x
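The example above can be reproduced end to end with a short, self-contained sketch. The one-row frame is constructed by hand here to mirror the example input; in practice the frame would come from reading the Excel file.

```python
import pandas as pd

# Hand-built frame mirroring the example input (an assumption for the demo).
df = pd.DataFrame(
    [["x"] * 6],
    columns=["A", "Unnamed1", "Unnamed2", "B", "Unnamed3", "C"],
)

cols = df.columns.to_series()
# mask() blanks out every name containing "Unnamed";
# ffill() then copies the last remaining name forward into the gaps.
df.columns = cols.mask(cols.str.contains("Unnamed")).ffill()

print(list(df.columns))
```

Note that a leading unnamed column has nothing to its left to fill from, so it would end up with a missing name.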

The following code would work.

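import pandas as pd

df = pd.DataFrame(columns=["A_ColumnWithName", "unnamed", "unnamed", "B_ColumnWithName", "unnamed", "unnamed"])

# Build a new list of names rather than writing into df.columns.values,
# since in-place edits of the underlying array are not guaranteed to stick.
new_columns = []
replace_with = df.columns[0]
for name in df.columns:
    if name == "unnamed":
        # Reuse the most recent named column to the left.
        new_columns.append(replace_with)
    else:
        replace_with = name
        new_columns.append(name)
df.columns = new_columns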
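The loop above compares against the literal string 'unnamed'. When pandas reads a sheet with blank header cells, it typically labels them "Unnamed: 1", "Unnamed: 2", and so on, so an exact match may never fire. Below is a minimal sketch of the same loop using a prefix test instead; the helper name forward_fill_names and its prefix parameter are hypothetical, introduced only for illustration.

```python
import pandas as pd


def forward_fill_names(columns, prefix="Unnamed"):
    """Replace each name starting with `prefix` by the last real name seen."""
    result = []
    last = None
    for name in columns:
        is_placeholder = str(name).startswith(prefix)
        if is_placeholder and last is not None:
            result.append(last)
        else:
            # Real names are kept; leading placeholders have nothing to
            # inherit from, so they are left unchanged.
            result.append(name)
            if not is_placeholder:
                last = name
    return result


# Hypothetical header as pandas would typically produce it from Excel.
df = pd.DataFrame(
    columns=["A_ColumnWithName", "Unnamed: 1", "Unnamed: 2",
             "B_ColumnWithName", "Unnamed: 4", "Unnamed: 5"]
)
df.columns = forward_fill_names(df.columns)
print(list(df.columns))
```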
