TypeError: '<' not supported between instances of 'str' and 'int' after converting string to float
Using: Python in Google Colab. Thanks in advance. I have run this code on other data I scraped from FBref, so I am unsure why it fails now. The only difference is the way I scraped it.
The first time I scraped it:
url_link = 'https://fbref.com/en/comps/Big5/gca/players/Big-5-European-Leagues-Stats'
The second time I scraped it:
url = 'https://fbref.com/en/comps/22/stats/Major-League-Soccer-Stats'
html_content = requests.get(url).text.replace('<!--', '').replace('-->', '')
df = pd.read_html(html_content)
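For reference, the comment-stripping step in the second scrape can be tried offline on a small string. This is a minimal sketch, assuming the page hides a table inside an HTML comment; the `html` fragment below is made up for illustration and is not taken from FBref.

```python
# Hypothetical page fragment: a table wrapped in an HTML comment,
# which an HTML parser would otherwise skip.
html = "<div><!--<table><tr><th>Gls</th></tr><tr><td>5</td></tr></table>--></div>"

# Remove only the comment delimiters so the table markup becomes visible.
uncommented = html.replace("<!--", "").replace("-->", "")

print(uncommented)
```

Note that `pd.read_html` returns a list of DataFrames, so `df` in the snippet above is a list and the wanted table still has to be picked out by position.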
After pulling the data into my dataframe, I convert the columns from object to float so I can do a calculation:
dfstandard['90s'] = dfstandard['90s'].astype(float)
dfstandard['Gls'] = dfstandard['Gls'].astype(float)
I checked, and both columns show as floats:
10 90s 743 non-null float64
11 Gls 743 non-null float64
But when I run the code that has worked previously:
dfstandard['Gls'] = dfstandard['Gls'] / dfstandard['90s']
I get the error message "TypeError: '<' not supported between instances of 'str' and 'int'"
I am fairly new to scraping, I'm stuck and don't know what to do next.
The full error message is below:
<ipython-input-152-e0ab76715b7d> in <module>()
1 #turn data into p 90
----> 2 dfstandard['Gls'] = dfstandard['Gls'] / dfstandard['90s']
3 dfstandard['Ast'] = dfstandard['Ast'] / dfstandard['90s']
4 dfstandard['G-PK'] = dfstandard['G-PK'] / dfstandard['90s']
5 dfstandard['PK'] = dfstandard['PK'] / dfstandard['90s']
8 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in _outer_indexer(self, left, right)
261
262 def _outer_indexer(self, left, right):
--> 263 return libjoin.outer_join_indexer(left, right)
264
265 _typ = "index"
pandas/_libs/join.pyx in pandas._libs.join.outer_join_indexer()
TypeError: '<' not supported between instances of 'str' and 'int'
There are two Gls columns in your dataframe. I think you converted only one "Gls" column to float, and when you do dfstandard['Gls'] = dfstandard['Gls'] / dfstandard['90s'], the other "Gls" column is also being considered.
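One way to check for this is sketched below. It assumes the scraped table carries the label "Gls" twice (for example a season total and a per-90 figure); the small frame and its values are invented for illustration. With a duplicated label, `df["Gls"]` returns a DataFrame rather than a Series, and dividing a DataFrame by a Series aligns the column labels against the Series index, which may be why pandas ends up comparing strings with integers.

```python
import pandas as pd

# Hypothetical miniature of the scraped table: "Gls" appears twice.
df = pd.DataFrame(
    [["10.0", "5", "0.50"], ["20.0", "8", "0.40"]],
    columns=["90s", "Gls", "Gls"],
)

# With a duplicated label, selecting it yields a DataFrame, not a Series.
selected = df["Gls"]
print(type(selected).__name__, selected.shape)

# List the labels that occur more than once.
dupes = df.columns[df.columns.duplicated()].unique().tolist()
print(dupes)

# One option (an assumption about which column is wanted):
# keep only the first occurrence of each label, then convert and divide.
deduped = df.loc[:, ~df.columns.duplicated()].copy()
deduped["90s"] = pd.to_numeric(deduped["90s"], errors="coerce")
deduped["Gls"] = pd.to_numeric(deduped["Gls"], errors="coerce")
deduped["Gls"] = deduped["Gls"] / deduped["90s"]
print(deduped)
```

Keeping the first occurrence is only one choice; renaming the second column instead would preserve both.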
Try stripping whitespace from the column names too:
df = df.rename(columns=lambda x: x.strip())
df['90s'] = pd.to_numeric(df['90s'], errors='coerce')
df['Gls'] = pd.to_numeric(df['Gls'], errors='coerce')
Thus the error.因此错误。
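To see what the whitespace stripping and `errors='coerce'` do, here is a small sketch. It assumes the scraped columns have padded names and contain a stray non-numeric entry, such as a header row repeated in the table body; the data and the `Gls_per_90` column name are hypothetical.

```python
import pandas as pd

# Hypothetical scraped data: padded column names and a repeated header row.
raw = pd.DataFrame({" 90s ": ["10.0", "90s", "20.0"], "Gls ": ["5", "Gls", "8"]})

# Strip surrounding whitespace from every column name.
clean = raw.rename(columns=lambda x: x.strip())

# Values that cannot be parsed as numbers become NaN instead of raising.
clean["90s"] = pd.to_numeric(clean["90s"], errors="coerce")
clean["Gls"] = pd.to_numeric(clean["Gls"], errors="coerce")

# Drop the rows that failed to parse, then compute the per-90 value.
clean = clean.dropna(subset=["90s", "Gls"])
clean["Gls_per_90"] = clean["Gls"] / clean["90s"]
print(clean)
```

Unlike `astype(float)`, which raises on the first unparseable value, `pd.to_numeric(..., errors='coerce')` turns such values into NaN so they can be inspected or dropped.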