Renaming columns in a pandas DataFrame with the rename method

Question

I have been trying to change the column names of a pandas DataFrame.

In [1]: import pandas as pd
   ...: df=pd.read_csv('winequality-red.csv',sep=';')
   ...: df.head()
Out[1]: 
   fixed acidity  volatile acidity  citric acid  residual sugar  chlorides  \
0            7.4              0.70         0.00             1.9      0.076   
1            7.8              0.88         0.00             2.6      0.098   
2            7.8              0.76         0.04             2.3      0.092   
3           11.2              0.28         0.56             1.9      0.075   
4            7.4              0.70         0.00             1.9      0.076   

   free sulfur dioxide  total sulfur dioxide  density    pH  sulphates  \
0                 11.0                  34.0   0.9978  3.51       0.56   
1                 25.0                  67.0   0.9968  3.20       0.68   
2                 15.0                  54.0   0.9970  3.26       0.65   
3                 17.0                  60.0   0.9980  3.16       0.58   
4                 11.0                  34.0   0.9978  3.51       0.56   

   alcohol  quality  
0      9.4        5  
1      9.8        5  
2      9.8        5  
3      9.8        6  
4      9.4        5

When I use str.replace inside rename, I get the following error:

In [2]: df.rename(str.replace(' ', '_'),axis='columns',inplace=True)
   ...: df.head()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-3bf0bb1685ed> in <module>
----> 1 df.rename(str.replace(' ', '_'),axis='columns',inplace=True)
      2 df.head()

TypeError: replace() takes at least 2 arguments (1 given)

However, if I swap str.replace for str.title, str.upper or str.lower, it works.

In [3]: df.rename(str.title,axis='columns',inplace=True)
   ...: df.head()
Out[3]: 
   Fixed Acidity  Volatile Acidity  Citric Acid  Residual Sugar  Chlorides  \
0            7.4              0.70         0.00             1.9      0.076   
1            7.8              0.88         0.00             2.6      0.098   
2            7.8              0.76         0.04             2.3      0.092   
3           11.2              0.28         0.56             1.9      0.075   
4            7.4              0.70         0.00             1.9      0.076   

   Free Sulfur Dioxide  Total Sulfur Dioxide  Density    Ph  Sulphates  \
0                 11.0                  34.0   0.9978  3.51       0.56   
1                 25.0                  67.0   0.9968  3.20       0.68   
2                 15.0                  54.0   0.9970  3.26       0.65   
3                 17.0                  60.0   0.9980  3.16       0.58   
4                 11.0                  34.0   0.9978  3.51       0.56   

   Alcohol  Quality  
0      9.4        5  
1      9.8        5  
2      9.8        5  
3      9.8        6  
4      9.4        5  

In [4]: df.rename(str.upper,axis='columns',inplace=True)
   ...: df.head()
Out[4]: 
   FIXED ACIDITY  VOLATILE ACIDITY  CITRIC ACID  RESIDUAL SUGAR  CHLORIDES  \
0            7.4              0.70         0.00             1.9      0.076   
1            7.8              0.88         0.00             2.6      0.098   
2            7.8              0.76         0.04             2.3      0.092   
3           11.2              0.28         0.56             1.9      0.075   
4            7.4              0.70         0.00             1.9      0.076   

   FREE SULFUR DIOXIDE  TOTAL SULFUR DIOXIDE  DENSITY    PH  SULPHATES  \
0                 11.0                  34.0   0.9978  3.51       0.56   
1                 25.0                  67.0   0.9968  3.20       0.68   
2                 15.0                  54.0   0.9970  3.26       0.65   
3                 17.0                  60.0   0.9980  3.16       0.58   
4                 11.0                  34.0   0.9978  3.51       0.56   

   ALCOHOL  QUALITY  
0      9.4        5  
1      9.8        5  
2      9.8        5  
3      9.8        6  
4      9.4        5  

In [5]: df.rename(str.lower,axis='columns',inplace=True)
   ...: df.head()
Out[5]: 
   fixed acidity  volatile acidity  citric acid  residual sugar  chlorides  \
0            7.4              0.70         0.00             1.9      0.076   
1            7.8              0.88         0.00             2.6      0.098   
2            7.8              0.76         0.04             2.3      0.092   
3           11.2              0.28         0.56             1.9      0.075   
4            7.4              0.70         0.00             1.9      0.076   

   free sulfur dioxide  total sulfur dioxide  density    ph  sulphates  \
0                 11.0                  34.0   0.9978  3.51       0.56   
1                 25.0                  67.0   0.9968  3.20       0.68   
2                 15.0                  54.0   0.9970  3.26       0.65   
3                 17.0                  60.0   0.9980  3.16       0.58   
4                 11.0                  34.0   0.9978  3.51       0.56   

   alcohol  quality  
0      9.4        5  
1      9.8        5  
2      9.8        5  
3      9.8        6  
4      9.4        5

Any idea why I get "TypeError: replace() takes at least 2 arguments (1 given)" when using str.replace?

PS: I did manage to solve it with a list comprehension:

In [6]: df.columns=[col.replace(' ', '_') for col in df.columns]
   ...: df.head()
Out[6]: 
   fixed_acidity  volatile_acidity  citric_acid  residual_sugar  chlorides  \
0            7.4              0.70         0.00             1.9      0.076   
1            7.8              0.88         0.00             2.6      0.098   
2            7.8              0.76         0.04             2.3      0.092   
3           11.2              0.28         0.56             1.9      0.075   
4            7.4              0.70         0.00             1.9      0.076   

   free_sulfur_dioxide  total_sulfur_dioxide  density    ph  sulphates  \
0                 11.0                  34.0   0.9978  3.51       0.56   
1                 25.0                  67.0   0.9968  3.20       0.68   
2                 15.0                  54.0   0.9970  3.26       0.65   
3                 17.0                  60.0   0.9980  3.16       0.58   
4                 11.0                  34.0   0.9978  3.51       0.56   

   alcohol  quality  
0      9.4        5  
1      9.8        5  
2      9.8        5  
3      9.8        6  
4      9.4        5

But I'm still confused about why str.replace raises a TypeError when used inside df.rename.

Answer 1

Have you tried assigning the new names through the string accessor?

df.columns = df.columns.str.replace(' ', '_')

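As for the error: rename expects a function (or a mapping) that it applies to each label. str.title, str.upper and str.lower are passed as functions, uncalled. str.replace(' ', '_'), on the other hand, is called immediately, with ' ' taken as the string itself and '_' as the text to replace, so the replacement argument is missing and Python raises the TypeError before rename even runs. Inside rename you can wrap the call in a lambda instead: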

df.rename(columns=lambda x: x.replace(' ', '_'), inplace=True)
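
To make the difference concrete, here is a minimal sketch. It uses a small stand-in DataFrame with made-up rows rather than the actual winequality-red.csv file, and the variable names (raised, renamed, via_accessor) are only illustrative. It shows that the failing expression raises on its own, without pandas being involved, and that the lambda and the string accessor give the same column names.

```python
import pandas as pd

# Stand-in frame (assumption: a few columns named like the wine data).
df = pd.DataFrame({"fixed acidity": [7.4], "citric acid": [0.0], "pH": [3.51]})

# str.replace(' ', '_') is evaluated before rename is ever called.
# It is a call on the unbound method: ' ' becomes the string to operate on,
# '_' becomes `old`, and `new` is missing, hence the TypeError.
raised = False
try:
    str.replace(" ", "_")
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# Passing a callable defers the call until rename applies it to each label.
renamed = df.rename(columns=lambda name: name.replace(" ", "_"))
print(list(renamed.columns))

# Same result through the Index string accessor; regex=False asks for a
# literal (non-regex) replacement.
via_accessor = df.columns.str.replace(" ", "_", regex=False)
print(list(via_accessor))
```

Here rename returns a new DataFrame instead of using inplace=True, which keeps the original df untouched; that is a stylistic choice for the sketch, not a requirement.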
