How can I use an alias/look-up table to refer to dataframe columns?

I wish to use a look-up or alias table to rename columns for presentation in graphs etc.

  1. I have a dataframe that I'm loading in from a large CSV file.
  2. I drop the first few rows then correct the column names. (I know this is messy - suggestions appreciated).
  3. However, I don't want the (somewhat programmatic and ugly) column names to appear in plots etc.
  4. So I have a text file which gives the column names, alias names (and units if I need them).
  5. I then want to replace the column names with the corresponding alias names.
  6. Alternatively, the original column names could be preserved; as long as I can present the alias names when plotting.

My original CSV looks like this:

"TOA5","HE101_RV50_GAF","CR","7225","CR.Std.07","CPU:aa.CR6"
"TIMESTAMP","RECORD","SensorRelEventMin(1)","SensorRelEventMin(2)","SensorRelEventMin(3)","SensorRelEventMin(4)"
"TS","RN","","","",""
"","","Smp","Smp","Smp","Smp"
"2019-08-30 00:22:22.9",14546,-0.4051819,-0.2565842,-9.702911,-0.5374413
"2019-08-30 00:27:34.24",14547,-1.118546,-0.9480438,-5.356552,-1.204945
"2019-08-30 00:29:47.86",14548,-0.765564,-0.5029907,-7.062241,-0.8703575
"2019-08-30 00:35:36.76",14549,-0.7200012,-0.6559029,-6.257889,-0.6656723
"2019-08-30 00:42:28.56",14550,-0.6325226,-0.4022942,-4.179138,-0.4609756
"2019-08-30 00:48:55.32",14551,-0.4613953,-0.2666397,-4.391235,-0.4144287
"2019-08-30 00:52:15.74",14552,-0.4507446,-0.3086662,-1.869171,-0.5024986
"2019-08-30 01:02:15.04",14553,-0.5307922,-0.3815041,-5.40918,-0.3242683
"2019-08-30 01:09:18.38",14554,-0.6351166,-0.5765362,-2.261734,-0.4456367
"2019-08-30 01:11:07.38",14555,-0.2823181,-0.2864227,-0.2417603,-0.3462906
"2019-08-30 01:13:07.6",14556,-0.3824463,-0.3220673,-7.051376,-0.4786491

And I've written an alias table like this:

"EntryName","AliasedName","Units"
"TIMESTAMP","time","s"
"RECORD","record number",""
"SensorRelEventMin(1)","1st Sensor Name","uS"
"SensorRelEventMin(2)","2nd Sensor Name","uS"
"SensorRelEventMin(3)","3rd Sensor Name","uS"
"SensorRelEventMin(4)","4th Sensor Name","uS"

And I would like the df to look like this:

"time","record number","1st Sensor Name","2nd Sensor Name","3rd Sensor Name","4th Sensor Name"
"2019-08-30 00:22:22.9",14546,-0.4051819,-0.2565842,-9.702911,-0.5374413
"2019-08-30 00:27:34.24",14547,-1.118546,-0.9480438,-5.356552,-1.204945
"2019-08-30 00:29:47.86",14548,-0.765564,-0.5029907,-7.062241,-0.8703575
...

My code for loading is:

import pandas as pd

# load data into df (the fourth row is read as the header, then overwritten below)
df = pd.read_csv(filename, skiprows=3, na_values='NAN')
df.columns = ["TIMESTAMP", "RECORD", "SensorRelEventMin(1)", "SensorRelEventMin(2)", "SensorRelEventMin(3)", "SensorRelEventMin(4)"]
df = df.astype({'TIMESTAMP': 'datetime64[ns]'})
# read alias table (the file name must be a quoted string)
aliasTable = pd.read_csv('aliasTable.txt')

I would like to do something that looks like this in pseudocode:

df.rename({aliasTable["EntryName"]:aliasTable["AliasedName"]})

Alternatively, if it makes more sense to preserve the column names, a simple way to replace any figure titles with the AliasedName would also work. I know this is a pretty vague request, but I'm at the end of my Python capabilities!

If I have understood the question correctly, you want a simple way to rename the dataframe columns. If so, the following approach might help:

  1. Create a dictionary that maps the old column names to the new column names.
  2. Use the rename method on the dataframe to change the column names.
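
Applied to the alias table from the question, the two steps might look like the sketch below. It assumes the alias file has the "EntryName" and "AliasedName" headers shown in the question; the inline strings are shortened stand-ins for the real files, and the names raw_csv, alias_csv and alias_map are only illustrative. Passing skiprows=[0, 2, 3] is one possible way to let the second line of the logger file supply the column names, so they do not have to be typed out by hand.

```python
import io
import pandas as pd

# Shortened stand-ins for the logger file and the alias table in the question
# (two sensor columns and two data rows only).
raw_csv = '''"TOA5","HE101_RV50_GAF","CR","7225"
"TIMESTAMP","RECORD","SensorRelEventMin(1)","SensorRelEventMin(2)"
"TS","RN","",""
"","","Smp","Smp"
"2019-08-30 00:22:22.9",14546,-0.4051819,-0.2565842
"2019-08-30 00:27:34.24",14547,-1.118546,-0.9480438
'''
alias_csv = '''"EntryName","AliasedName","Units"
"TIMESTAMP","time","s"
"RECORD","record number",""
"SensorRelEventMin(1)","1st Sensor Name","uS"
"SensorRelEventMin(2)","2nd Sensor Name","uS"
'''

# Skip lines 0, 2 and 3 so that line 1 is used as the header row.
df = pd.read_csv(io.StringIO(raw_csv), skiprows=[0, 2, 3], na_values='NAN')
alias_table = pd.read_csv(io.StringIO(alias_csv))

# Step 1: build the old-name -> new-name dictionary from the alias table.
alias_map = dict(zip(alias_table["EntryName"], alias_table["AliasedName"]))

# Step 2: rename; columns that have no entry in the dictionary keep their name.
df = df.rename(columns=alias_map)
print(list(df.columns))
```

With real files, the io.StringIO(...) arguments would be replaced by the file paths.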

Example:

import pandas as pd
import numpy as np

# define a dictionary that maps old column names (keys) to new column names (values)
col_dict = {
    0: "col1",
    1: "col2",
}

# create a dataframe; 0 and 1 are the default column names
df = pd.DataFrame(np.random.rand(4, 2))

# print df
df

    0           1
0   0.433529    0.812580
1   0.116504    0.801236
2   0.236852    0.336812
3   0.415137    0.708668

# apply rename function over df 
df.rename(columns=col_dict, inplace=True)

# print df
df

    col1        col2
0   0.433529    0.812580
1   0.116504    0.801236
2   0.236852    0.336812
3   0.415137    0.708668
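
For the alternative mentioned in the question (keeping the original column names and showing the aliases only in plots), a rough sketch is to build a label look-up once and consult it when labelling. The name labels and the "alias (units)" format are assumptions made for illustration; keep_default_na=False is used so that an empty Units field stays an empty string rather than becoming NaN.

```python
import io
import pandas as pd

# Shortened stand-in for the alias table in the question.
alias_csv = '''"EntryName","AliasedName","Units"
"TIMESTAMP","time","s"
"RECORD","record number",""
"SensorRelEventMin(1)","1st Sensor Name","uS"
'''
alias_table = pd.read_csv(io.StringIO(alias_csv), keep_default_na=False)

# Map each original column name to a display label, adding units when present.
labels = {}
for entry, alias, units in alias_table.itertuples(index=False):
    labels[entry] = f"{alias} ({units})" if units else alias

print(labels["SensorRelEventMin(1)"])
```

When plotting, the label for a column col could then be fetched with labels.get(col, col), for example as the argument to matplotlib's ax.set_ylabel, so the dataframe itself keeps its original names.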

Hope this helps.

