I have been trying to change the column names in a pandas DataFrame.
In [1]: import pandas as pd
...: df=pd.read_csv('winequality-red.csv',sep=';')
...: df.head()
Out[1]:
fixed acidity volatile acidity citric acid residual sugar chlorides \
0 7.4 0.70 0.00 1.9 0.076
1 7.8 0.88 0.00 2.6 0.098
2 7.8 0.76 0.04 2.3 0.092
3 11.2 0.28 0.56 1.9 0.075
4 7.4 0.70 0.00 1.9 0.076
free sulfur dioxide total sulfur dioxide density pH sulphates \
0 11.0 34.0 0.9978 3.51 0.56
1 25.0 67.0 0.9968 3.20 0.68
2 15.0 54.0 0.9970 3.26 0.65
3 17.0 60.0 0.9980 3.16 0.58
4 11.0 34.0 0.9978 3.51 0.56
alcohol quality
0 9.4 5
1 9.8 5
2 9.8 5
3 9.8 6
4 9.4 5
When I use str.replace inside rename, I get the following error message:
In [2]: df.rename(str.replace(' ', '_'),axis='columns',inplace=True)
...: df.head()
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-3bf0bb1685ed> in <module>
----> 1 df.rename(str.replace(' ', '_'),axis='columns',inplace=True)
2 df.head()
TypeError: replace() takes at least 2 arguments (1 given)
However, if I swap str.replace for str.title, str.upper or str.lower, it works.
In [3]: df.rename(str.title,axis='columns',inplace=True)
...: df.head()
Out[3]:
Fixed Acidity Volatile Acidity Citric Acid Residual Sugar Chlorides \
0 7.4 0.70 0.00 1.9 0.076
1 7.8 0.88 0.00 2.6 0.098
2 7.8 0.76 0.04 2.3 0.092
3 11.2 0.28 0.56 1.9 0.075
4 7.4 0.70 0.00 1.9 0.076
Free Sulfur Dioxide Total Sulfur Dioxide Density Ph Sulphates \
0 11.0 34.0 0.9978 3.51 0.56
1 25.0 67.0 0.9968 3.20 0.68
2 15.0 54.0 0.9970 3.26 0.65
3 17.0 60.0 0.9980 3.16 0.58
4 11.0 34.0 0.9978 3.51 0.56
Alcohol Quality
0 9.4 5
1 9.8 5
2 9.8 5
3 9.8 6
4 9.4 5
In [4]: df.rename(str.upper,axis='columns',inplace=True)
...: df.head()
Out[4]:
FIXED ACIDITY VOLATILE ACIDITY CITRIC ACID RESIDUAL SUGAR CHLORIDES \
0 7.4 0.70 0.00 1.9 0.076
1 7.8 0.88 0.00 2.6 0.098
2 7.8 0.76 0.04 2.3 0.092
3 11.2 0.28 0.56 1.9 0.075
4 7.4 0.70 0.00 1.9 0.076
FREE SULFUR DIOXIDE TOTAL SULFUR DIOXIDE DENSITY PH SULPHATES \
0 11.0 34.0 0.9978 3.51 0.56
1 25.0 67.0 0.9968 3.20 0.68
2 15.0 54.0 0.9970 3.26 0.65
3 17.0 60.0 0.9980 3.16 0.58
4 11.0 34.0 0.9978 3.51 0.56
ALCOHOL QUALITY
0 9.4 5
1 9.8 5
2 9.8 5
3 9.8 6
4 9.4 5
In [5]: df.rename(str.lower,axis='columns',inplace=True)
...: df.head()
Out[5]:
fixed acidity volatile acidity citric acid residual sugar chlorides \
0 7.4 0.70 0.00 1.9 0.076
1 7.8 0.88 0.00 2.6 0.098
2 7.8 0.76 0.04 2.3 0.092
3 11.2 0.28 0.56 1.9 0.075
4 7.4 0.70 0.00 1.9 0.076
free sulfur dioxide total sulfur dioxide density ph sulphates \
0 11.0 34.0 0.9978 3.51 0.56
1 25.0 67.0 0.9968 3.20 0.68
2 15.0 54.0 0.9970 3.26 0.65
3 17.0 60.0 0.9980 3.16 0.58
4 11.0 34.0 0.9978 3.51 0.56
alcohol quality
0 9.4 5
1 9.8 5
2 9.8 5
3 9.8 6
4 9.4 5
Any clues why I get "TypeError: replace() takes at least 2 arguments (1 given)" when using str.replace?
PS: By the way, I managed to solve it using a list comprehension:
In [6]: df.columns=[col.replace(' ', '_') for col in df.columns]
...: df.head()
Out[6]:
fixed_acidity volatile_acidity citric_acid residual_sugar chlorides \
0 7.4 0.70 0.00 1.9 0.076
1 7.8 0.88 0.00 2.6 0.098
2 7.8 0.76 0.04 2.3 0.092
3 11.2 0.28 0.56 1.9 0.075
4 7.4 0.70 0.00 1.9 0.076
free_sulfur_dioxide total_sulfur_dioxide density ph sulphates \
0 11.0 34.0 0.9978 3.51 0.56
1 25.0 67.0 0.9968 3.20 0.68
2 15.0 54.0 0.9970 3.26 0.65
3 17.0 60.0 0.9980 3.16 0.58
4 11.0 34.0 0.9978 3.51 0.56
alcohol quality
0 9.4 5
1 9.8 5
2 9.8 5
3 9.8 6
4 9.4 5
But I'm still confused about why str.replace raises a TypeError when used inside df.rename.
Did you try this:
df.columns = df.columns.str.replace(' ', '_')
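As a minimal sketch of that one-liner, here it is on a small made-up frame (the two columns below are a hypothetical stand-in for the wine data, not the real file). Passing regex=False is an addition of mine so that the space is treated as a literal string rather than a pattern:

```python
import pandas as pd

# Hypothetical two-column frame standing in for winequality-red.csv.
df = pd.DataFrame({"fixed acidity": [7.4, 7.8], "citric acid": [0.0, 0.04]})

# The .str accessor on the column Index applies the replacement to
# every label at once and returns a new Index.
# regex=False makes ' ' a literal string instead of a regular expression.
df.columns = df.columns.str.replace(" ", "_", regex=False)

print(list(df.columns))
```

This assigns the new labels directly, so no inplace flag is needed.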
rename expects a function it can call on each column name, so inside rename you can wrap the replace call in a lambda:
df.rename(columns=lambda x: x.replace(' ', '_'), inplace=True)
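As for why the error appears: str.title, str.upper and str.lower are passed to rename un-called, so rename receives a function and applies it to each label. str.replace(' ', '_') is different because the parentheses call it right away, before rename ever runs. Called on the class like that, ' ' is taken as the string to operate on and '_' as the old substring, so the required new argument is missing. A rough sketch in plain Python (the names labels and to_snake are mine, and the list comprehension only approximates what rename does with a callable):

```python
from operator import methodcaller

labels = ["fixed acidity", "citric acid"]  # hypothetical column labels

# str.title is a function object; it is called later, once per label.
titled = [str.title(label) for label in labels]

# This call runs immediately: ' ' plays the role of self, '_' is `old`,
# and `new` is missing, so Python raises TypeError on its own.
raised = False
try:
    str.replace(" ", "_")
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# Deferring the call gives rename something it can apply per label.
def to_snake(name):
    return name.replace(" ", "_")

# methodcaller builds an equivalent callable without a lambda.
to_snake_alt = methodcaller("replace", " ", "_")

print([to_snake(label) for label in labels])
print([to_snake_alt(label) for label in labels])
```

Either to_snake or to_snake_alt could be passed as the columns argument of rename in place of the lambda.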