简体   繁体   中英

pandas add index to column name

I have data frame with about 100 columns that repeat itself because the data is organized by weeks, it looks something like that:

hours hours clicks clicks days days minutes minutes
week 1 week 2 week 1 week 2 week 1 week 2 week 1 week 2
2 2 2 3 6 2 2 3
1 7 6 3 8 2 9 3

I would like the output to look like this:

hours_w1 hours_w2 clicks_w1 clicks_w2 days_w1 days_w2 minutes_w1 minutes_w2
2 2 2 3 6 2 2 3
1 7 6 3 8 2 9 3

I know I can just rename the columns but because I have over 100 columns I'm looking for a more efficient way.

I tried to use add_suffix but had only managed to add the same suffix to all columns, when what I need is a different index for each week.

any idea how to do this?

Thanks!!

Extract the suffixes from the first row then add them to the column names and finally remove the first row.

# To fix mangle_dup_cols
df.columns = df.columns.str.split('.').str[0]
suffixes = '_' + df.iloc[0].str[0] + df.iloc[0].str[-1]
df.columns += suffixes
df = df.iloc[1:]

Output:

>>> df
  hours_w1 hours_w2 clicks_w1 clicks_w2 days_w1 days_w2 minutes_w1 minutes_w2
1        2        2         2         3       6       2          2          3
2        1        7         6         3       8       2          9          3

first you should change the first row:

df.iloc[0] = df.iloc[0].apply(lambda x:'w1' if x == 'week 1' else 'w2')

Then you can merge it with the column name like this:

df.columns = [i + '_' + j for i, j in zip(df.columns, df.iloc[0])]

And then you can remove the first row:

df = df.iloc[1:]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM