Replace method not removing string from pandas dataframe column
Hi, I have a pandas dataframe column which I need to convert to numeric.
First I need to remove the 'M' (for millions) from the data. Then I can use the to_numeric function.
But the end result is just a series of NaNs.
Looking further into it, to_numeric isn't working because the column still contains an 'M', so the replace method isn't removing it.
Why is the replace method not removing the 'M'?
#!/usr/local/bin/python3
import requests
import pandas as pd
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0'}
url = 'https://www.sharesoutstandinghistory.com/ivv/'
r = requests.get(url, headers=headers)
df = pd.read_html(r.content, header =0)[1]
df.columns = ['Date', 'Value'] # set column names
print(df)
df['Value'].replace('M', '', inplace=True) # replace M
df['Value'] = pd.to_numeric(df['Value'], errors='coerce') # set to numeric
print(df)
Here is what I get:
Date Value
0 1/6/2010 194.70M
1 1/11/2010 194.45M
2 1/19/2010 193.85M
3 1/21/2010 193.70M
4 1/25/2010 192.90M
... ... ...
1049 3/9/2020 652.75M
1050 3/16/2020 654.45M
1051 3/23/2020 627.00M
1052 4/6/2020 631.45M
1053 4/13/2020 633.05M
[1054 rows x 2 columns]
Date Value
0 1/6/2010 NaN
1 1/11/2010 NaN
2 1/19/2010 NaN
3 1/21/2010 NaN
4 1/25/2010 NaN
... ... ...
1049 3/9/2020 NaN
1050 3/16/2020 NaN
1051 3/23/2020 NaN
1052 4/6/2020 NaN
1053 4/13/2020 NaN
Maybe you can try another way, using df.Value = df.Value.str[:-1] to remove the M.
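The slicing suggestion can be sketched as follows. The two-row frame is a made-up sample that mirrors the 'Value' column from the question, not the scraped data. Note that str[:-1] drops the last character whatever it is, so this assumes every value really ends in 'M'; str.rstrip('M') is a slightly safer variant when that is not guaranteed.

```python
import pandas as pd

# Hypothetical sample mirroring the 'Value' column from the question.
df = pd.DataFrame({
    "Date": ["1/6/2010", "1/11/2010"],
    "Value": ["194.70M", "194.45M"],
})

# Drop the last character of every string, then convert to float.
sliced = pd.to_numeric(df["Value"].str[:-1], errors="coerce")

# Variant that only strips trailing 'M' characters and leaves other endings alone.
stripped = pd.to_numeric(df["Value"].str.rstrip("M"), errors="coerce")

df["Value"] = sliced
print(df)
```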
It does not remove the M because the regex=True parameter, which is necessary for substring replacement, is missing:
df['Value'] = pd.to_numeric(df['Value'].replace('M', '', regex=True) , errors='coerce')
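To see why regex=True matters, here is a small sketch on a made-up Series (the values are illustrative, not taken from the page). Without regex=True, Series.replace only swaps cells whose entire value equals 'M'; with it, 'M' is treated as a pattern and removed inside each string. Series.str.replace is another option that always works on substrings.

```python
import pandas as pd

# Hypothetical values; the middle one is exactly 'M' to show whole-value matching.
s = pd.Series(["194.70M", "M", "633.05M"])

# Default behaviour: only cells equal to 'M' are replaced.
whole = s.replace("M", "")

# regex=True: 'M' is matched as a pattern anywhere inside each string.
sub = s.replace("M", "", regex=True)

# The .str accessor replaces substrings by design.
via_str = s.str.replace("M", "", regex=False)

print(whole.tolist())
print(sub.tolist())
print(via_str.tolist())
```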
I think inplace is not good practice, check this and this.
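As a rough illustration of the alternative to inplace, the result can simply be assigned back. The frame below is a made-up sample. As far as I know, recent pandas versions with Copy-on-Write do not propagate an inplace call made on a selected column back to the parent frame, which is one more reason to prefer assignment; treat that as an assumption and check it against your pandas version.

```python
import pandas as pd

# Hypothetical sample mirroring the 'Value' column from the question.
df = pd.DataFrame({"Value": ["627.00M", "631.45M"]})

# Build the cleaned column as a new object instead of mutating in place.
cleaned = df["Value"].replace("M", "", regex=True)

# Assign the converted result back to the frame.
df = df.assign(Value=pd.to_numeric(cleaned, errors="coerce"))
print(df.dtypes)
```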