I've imported an HTML table from Basketball Reference using pandas, but I'm running into an annoyance trying to rename a couple of columns that have empty strings for their name.
Here's the code to pull the table:
tables = pd.read_html('http://www.basketball-reference.com/leagues/NBA_2016_games.html')
games = tables[0]
The columns look like this:
Out[138]:
Index([u'Date', u'Start (ET)', u'Visitor/Neutral', u'PTS', u'Home/Neutral',
u'PTS.1', u' ', u' .1', u'Notes'],
dtype='object')
Renaming everything except for the u' ' and u' .1' columns is no issue, but I cannot find the right way to rename the empty ones using a label approach.
I tried this by default (limited to renaming only a few columns here):
column_names = {'Date': 'date', ' ': 'box', ' .1': 'overtime'}
games.rename(columns = column_names)
but this leaves the ' ' and ' .1' columns unchanged.
This method works:
column_names = {games.columns[6]: 'box', games.columns[7]: 'overtime'}
But is there any way to change these names without explicitly referencing the position?
Maybe this could be a quick fix: set the column names explicitly.
games.columns = [u'Date', u'Start (ET)', u'Visitor/Neutral', u'PTS', u'Home/Neutral', u'PTS.1', u'Rename1', u'Rename2', u'Notes']
What works for me is adding str.strip to remove the surrounding whitespace from the column names. The dict keys also need to change to match (whitespace removed):
column_names = {'Date': 'date', '': 'box', '.1': 'overtime'}
games.columns = games.columns.str.strip()
games = games.rename(columns = column_names)
print (games.columns)
Index(['date', 'Start (ET)', 'Visitor/Neutral', 'PTS', 'Home/Neutral', 'PTS.1',
'box', 'overtime', 'Notes'],
dtype='object')
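To try the strip approach without fetching the page, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for the scraped table, assuming its two blank headers are a no-break space ('\xa0') rather than an ordinary space.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: two headers contain '\xa0'.
games = pd.DataFrame(
    [["Tue, Oct 27, 2015", 94, "Box Score", ""]],
    columns=["Date", "PTS", "\xa0", "\xa0.1"],
)

# A plain-space key does not equal '\xa0', so rename silently skips it.
unchanged = games.rename(columns={" ": "box", " .1": "overtime"})
print(unchanged.columns.tolist())

# Stripping turns the labels into '' and '.1', which the mapping can match.
games.columns = games.columns.str.strip()
games = games.rename(columns={"Date": "date", "": "box", ".1": "overtime"})
print(games.columns.tolist())
```

Note that rename ignores keys it cannot find by default, which is why the original attempt failed without raising an error.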
Another solution is to export the column names to a list, which shows that the "empty" names actually contain \xa0 (NO-BREAK SPACE):
print (games.columns.tolist())
['Date', 'Start (ET)', 'Visitor/Neutral', 'PTS', 'Home/Neutral',
'PTS.1', '\xa0', '\xa0.1', 'Notes']
column_names = {'Date': 'date', '\xa0': 'box', '\xa0.1': 'overtime'}
games = games.rename(columns = column_names)
print (games.columns)
Index(['date', 'Start (ET)', 'Visitor/Neutral', 'PTS', 'Home/Neutral', 'PTS.1',
'box', 'overtime', 'Notes'],
dtype='object')
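If it is unclear which invisible character a label contains, one way to check is to look up each character's Unicode name. This is a small sketch; describe_label is a hypothetical helper name, and the DataFrame is a stand-in for the scraped table.

```python
import unicodedata

import pandas as pd

# Hypothetical stand-in for the scraped table's column labels.
games = pd.DataFrame(columns=["Date", "\xa0", "\xa0.1"])


def describe_label(label):
    """Return the Unicode name of each character in a column label."""
    return [unicodedata.name(ch, "UNKNOWN") for ch in label]


for col in games.columns:
    # repr() shows escapes such as '\xa0' that print() would render as blanks.
    print(repr(col), describe_label(col))
```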