How can I use an alias/look-up table to refer to dataframe columns?

Question

I wish to use a look-up or alias table to rename columns for presentation in graphs etc.

I have a dataframe that I'm loading in from a large CSV file.
I drop the first few rows then correct the column names. (I know this is messy - suggestions appreciated).
However, I don't want the (somewhat programmatic and ugly) column names to appear in plots etc.
So I have a text file which gives the column names, alias names (and units if I need them).
I then want to replace the column names with the corresponding alias names.
Alternatively, the original column names could be preserved; as long as I can present the alias names when plotting.

My original CSV looks like this:

"TOA5","HE101_RV50_GAF","CR","7225","CR.Std.07","CPU:aa.CR6"
"TIMESTAMP","RECORD","SensorRelEventMin(1)","SensorRelEventMin(2)","SensorRelEventMin(3)","SensorRelEventMin(4)"
"TS","RN","","","",""
"","","Smp","Smp","Smp","Smp"
"2019-08-30 00:22:22.9",14546,-0.4051819,-0.2565842,-9.702911,-0.5374413
"2019-08-30 00:27:34.24",14547,-1.118546,-0.9480438,-5.356552,-1.204945
"2019-08-30 00:29:47.86",14548,-0.765564,-0.5029907,-7.062241,-0.8703575
"2019-08-30 00:35:36.76",14549,-0.7200012,-0.6559029,-6.257889,-0.6656723
"2019-08-30 00:42:28.56",14550,-0.6325226,-0.4022942,-4.179138,-0.4609756
"2019-08-30 00:48:55.32",14551,-0.4613953,-0.2666397,-4.391235,-0.4144287
"2019-08-30 00:52:15.74",14552,-0.4507446,-0.3086662,-1.869171,-0.5024986
"2019-08-30 01:02:15.04",14553,-0.5307922,-0.3815041,-5.40918,-0.3242683
"2019-08-30 01:09:18.38",14554,-0.6351166,-0.5765362,-2.261734,-0.4456367
"2019-08-30 01:11:07.38",14555,-0.2823181,-0.2864227,-0.2417603,-0.3462906
"2019-08-30 01:13:07.6",14556,-0.3824463,-0.3220673,-7.051376,-0.4786491

And I've written an alias table like this:

"EntryName","AliasedName","Units"
"TIMESTAMP","time","s"
"RECORD","record number",""
"SensorRelEventMin(1)","1st Sensor Name","uS"
"SensorRelEventMin(2)","2nd Sensor Name","uS"
"SensorRelEventMin(3)","3rd Sensor Name","uS"
"SensorRelEventMin(4)","4th Sensor Name","uS"

And I would like the df to look like this:

"time","record number","1st Sensor Name","2nd Sensor Name","3rd Sensor Name","4th Sensor Name"
"2019-08-30 00:22:22.9",14546,-0.4051819,-0.2565842,-9.702911,-0.5374413
"2019-08-30 00:27:34.24",14547,-1.118546,-0.9480438,-5.356552,-1.204945
"2019-08-30 00:29:47.86",14548,-0.765564,-0.5029907,-7.062241,-0.8703575
...

My code for loading is:

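import pandas as pd

# load data into df (skip the first three lines; the fourth is read as a header and then overwritten)
df = pd.read_csv(filename, skiprows=3, na_values='NAN')
df.columns = ["TIMESTAMP","RECORD","SensorRelEventMin(1)","SensorRelEventMin(2)","SensorRelEventMin(3)","SensorRelEventMin(4)"]
# recent pandas versions require an explicit unit on the datetime dtype
df = df.astype({'TIMESTAMP': 'datetime64[ns]'})
# read alias table (the file name must be passed as a string)
aliasTable = pd.read_csv("aliasTable.txt")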

I would like to do something that looks like this in pseudocode:

df.rename({aliasTable["EntryName"]:aliasTable["AliasedName"]})

Alternatively, if it makes more sense to preserve the column names, a simple way to replace any figure titles with the AliasedName would also work. I know this is a pretty vague request, but I'm at the limit of my Python capabilities!

Answer 1

If I have understood the question correctly, you want a simple way to rename the dataframe columns. If that is the case, the following approach might help:

1. Create a dictionary that maps the old column names to the new column names.
2. Use the rename method on the dataframe to change the column names.

Example:

import pandas as pd
import numpy as np

# define a dictionary to replace column names in keys with column names in values
col_dict = {
    0: "col1",
    1: "col2"   
}

# create a dataframe. 0 and 1 are the default column names
df = pd.DataFrame(np.random.rand(4,2))

# print df
df

    0           1
0   0.433529    0.812580
1   0.116504    0.801236
2   0.236852    0.336812
3   0.415137    0.708668

# apply rename function over df 
df.rename(columns=col_dict, inplace=True)

# print df
df

    col1        col2
0   0.433529    0.812580
1   0.116504    0.801236
2   0.236852    0.336812
3   0.415137    0.708668
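
Applied to the files in the question, the dictionary does not have to be typed by hand: it can be built from the two columns of the alias table. The following is only a sketch under some assumptions. The two `io.StringIO` objects stand in for the real files (only the first two data rows are reproduced), and the names `alias_map` and `renamed` are mine. It also shows one possible way to tidy the loading step, by skipping lines 0, 2 and 3 so that the second line of the file supplies the column names directly.

```python
import io
import pandas as pd

# Stand-ins for the logger file and the alias file from the question.
raw_csv = io.StringIO(
    '"TOA5","HE101_RV50_GAF","CR","7225","CR.Std.07","CPU:aa.CR6"\n'
    '"TIMESTAMP","RECORD","SensorRelEventMin(1)","SensorRelEventMin(2)",'
    '"SensorRelEventMin(3)","SensorRelEventMin(4)"\n'
    '"TS","RN","","","",""\n'
    '"","","Smp","Smp","Smp","Smp"\n'
    '"2019-08-30 00:22:22.9",14546,-0.4051819,-0.2565842,-9.702911,-0.5374413\n'
    '"2019-08-30 00:27:34.24",14547,-1.118546,-0.9480438,-5.356552,-1.204945\n'
)
alias_csv = io.StringIO(
    '"EntryName","AliasedName","Units"\n'
    '"TIMESTAMP","time","s"\n'
    '"RECORD","record number",""\n'
    '"SensorRelEventMin(1)","1st Sensor Name","uS"\n'
    '"SensorRelEventMin(2)","2nd Sensor Name","uS"\n'
    '"SensorRelEventMin(3)","3rd Sensor Name","uS"\n'
    '"SensorRelEventMin(4)","4th Sensor Name","uS"\n'
)

# Skip lines 0, 2 and 3 so that line 1 becomes the header row.
df = pd.read_csv(raw_csv, skiprows=[0, 2, 3], na_values="NAN",
                 parse_dates=["TIMESTAMP"])
alias_table = pd.read_csv(alias_csv)

# Build {old name: alias} from the two columns of the alias table.
alias_map = dict(zip(alias_table["EntryName"], alias_table["AliasedName"]))

# rename returns a new dataframe; names missing from alias_map are left as-is.
renamed = df.rename(columns=alias_map)
print(list(renamed.columns))
```

With the real files, the `io.StringIO` objects would be replaced by the file paths. Because `rename` ignores keys that do not match any column, the alias table can list more entries than the dataframe actually has.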

Hope this helps.
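
For the alternative mentioned in the question (keeping the original column names and only showing the alias when plotting), one option is to index the alias table by `EntryName` and look labels up on demand. This is a rough sketch, not a definitive implementation: `axis_label` is a hypothetical helper of mine, and `io.StringIO` again stands in for the alias file.

```python
import io
import pandas as pd

# Stand-in for the alias file from the question.
alias_csv = io.StringIO(
    '"EntryName","AliasedName","Units"\n'
    '"TIMESTAMP","time","s"\n'
    '"RECORD","record number",""\n'
    '"SensorRelEventMin(1)","1st Sensor Name","uS"\n'
    '"SensorRelEventMin(2)","2nd Sensor Name","uS"\n'
)

# keep_default_na=False keeps empty unit cells as "" instead of NaN.
alias_table = pd.read_csv(alias_csv, keep_default_na=False).set_index("EntryName")


def axis_label(column):
    """Return 'alias (units)' for a column, or the column name if no alias exists."""
    if column not in alias_table.index:
        return column
    row = alias_table.loc[column]
    name, units = row["AliasedName"], row["Units"]
    return f"{name} ({units})" if units else name


print(axis_label("SensorRelEventMin(1)"))
print(axis_label("RECORD"))
print(axis_label("NotInTable"))
```

The dataframe keeps its original column names, and the label is only looked up when drawing, for example by passing `axis_label(col)` to matplotlib's `set_ylabel` or `set_title`.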
