
Cannot rename a table's column names in a pandas DataFrame

I am new to Jupyter Notebook and Python. I have been working on this code recently, but I can't figure out the problem. I want to rename "Tesla Quarterly Revenue(Millions of US $)" and "Tesla Quarterly Revenue(Millions of US $).1" to "Date" and "Revenue", but the column names do not change. Here is my code:

!pip install pandas
!pip install requests
!pip install bs4
!pip install -U yfinance pandas 
!pip install plotly
!pip install html5lib
!pip install lxml
import yfinance as yf
import pandas as pd
import requests
from bs4 import BeautifulSoup
import plotly.graph_objects as go
from plotly.subplots import make_subplots
url = "https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkPY0220ENSkillsNetwork23455606-2022-01-01"
html_data  = requests.get(url).text
soup = BeautifulSoup(html_data, 'html5lib')
tesla_revenue = pd.read_html(url, match = "Tesla Quarterly Revenue")[0]
tesla_revenue = tesla_revenue.rename(columns={"Tesla Quarterly Revenue(Millions of US $)":"Date","Tesla Quarterly Revenue(Millions of US $).1":"Revenue"})
tesla_revenue.head()

Here is the output:

[image: screenshot of the output]

I could not reproduce the issue; it works as expected for me. You could print the original .columns and compare the values with the keys in your dict. I am not sure whether the source is interpreted differently by different module versions:

print(tesla_revenue.columns)
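
Printing the columns as a list shows each name in quotes, which makes a stray or missing space easier to spot. Below is a minimal sketch of that comparison. It uses a small hand-built DataFrame as a stand-in for the scraped table, so the sample row and the `expected` list are assumptions for illustration only:

```python
import pandas as pd

# Stand-in for the scraped table (assumed sample data, not fetched from the site).
df = pd.DataFrame({
    "Tesla Quarterly Revenue (Millions of US $)": ["2022-09-30"],
    "Tesla Quarterly Revenue (Millions of US $).1": ["$21,454"],
})

# The keys as they were typed in the rename() call from the question.
expected = [
    "Tesla Quarterly Revenue(Millions of US $)",
    "Tesla Quarterly Revenue(Millions of US $).1",
]

# list() prints every name with quotes, so whitespace differences are visible.
print(list(df.columns))

# Collect the keys that do not match any actual column name exactly.
missing = [name for name in expected if name not in df.columns]
print(missing)
```

Any name that ends up in `missing` is a key that rename() would not find in the table.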

Just in case, here is an alternative:

tesla_revenue.columns = ['Date','Revenue']

Example

import pandas as pd
url = "https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkPY0220ENSkillsNetwork23455606-2022-01-01"
tesla_revenue = pd.read_html(url, match = "Tesla Quarterly Revenue")[0]
#tesla_revenue = tesla_revenue.rename(columns={"Tesla Quarterly Revenue(Millions of US $)":"Date","Tesla Quarterly Revenue(Millions of US $).1":"Revenue"})
tesla_revenue.columns = ['Date','Revenue']
tesla_revenue.head()

Output

Date Revenue
0 2022-09-30 $21,454
1 2022-06-30 $16,934
2 2022-03-31 $18,756
3 2021-12-31 $17,719
4 2021-09-30 $13,757

I could reproduce the error. Your column names contain a mistake: the header is not "Tesla Quarterly Revenue(Millions of US $)" but "Tesla Quarterly Revenue (Millions of US $)", with a space between "Revenue" and the opening parenthesis. The same applies to the second column header.
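
By default, DataFrame.rename ignores dictionary keys that match no column (errors="ignore"), so a call with mistyped keys runs without complaint and changes nothing. The sketch below illustrates this with a hand-built one-row frame (the sample data is an assumption). Passing errors="raise" turns the silent mismatch into a KeyError:

```python
import pandas as pd

# Stand-in for the scraped table (assumed sample data).
df = pd.DataFrame({
    "Tesla Quarterly Revenue (Millions of US $)": ["2022-09-30"],
    "Tesla Quarterly Revenue (Millions of US $).1": ["$21,454"],
})

# Keys without the space before the parenthesis, as in the question.
wrong = {
    "Tesla Quarterly Revenue(Millions of US $)": "Date",
    "Tesla Quarterly Revenue(Millions of US $).1": "Revenue",
}

# Default behaviour: unmatched keys are skipped, so the names stay the same.
unchanged = df.rename(columns=wrong)
print(list(unchanged.columns))

# With errors="raise", pandas reports the keys it could not find.
try:
    df.rename(columns=wrong, errors="raise")
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)
```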

To avoid this, you could also save the column names in variables, like this:

soup = BeautifulSoup(html_data, 'html5lib')
tesla_revenue = pd.read_html(url, match="Tesla Quarterly Revenue")[0]
col1_name = tesla_revenue.columns[0]
col2_name = tesla_revenue.columns[1]
tesla_revenue = tesla_revenue.rename(columns={col1_name:"Date",col2_name:"Revenue"})
tesla_revenue.head()

This also makes the code a bit more readable :)
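
The same idea can be extended to any number of columns by pairing the existing names with the new ones by position. This is only a sketch: the hand-built frame and the `new_names` list are assumptions for illustration, and it relies on the new names being listed in the same order as the columns.

```python
import pandas as pd

# Stand-in for the scraped table (assumed sample data).
df = pd.DataFrame({
    "Tesla Quarterly Revenue (Millions of US $)": ["2022-09-30"],
    "Tesla Quarterly Revenue (Millions of US $).1": ["$21,454"],
})

# New names, listed in the same order as the existing columns.
new_names = ["Date", "Revenue"]

# Build the mapping from the actual column names, so no header is typed by hand.
mapping = dict(zip(df.columns, new_names))

# errors="raise" guards against a mapping key that is not a column.
renamed = df.rename(columns=mapping, errors="raise")
print(list(renamed.columns))
```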


 