If statement to remove and replace a string
I'm trying to remove the parts of a string that keep it a string, so that it can be converted to an integer. However, I also need to take into account the variations in the string.
I've tried to put this into a function; here's what I have done:
import numpy as np

def rem(x):
    data = []
    for i in x:
        if "m" in i:
            data.append(i.replace(".00m", '000000'))
        elif "Th" in i:
            data.append(i.replace("Th.", '000'))
    return data
data_array = np.array(['£67.50m', '£63.00m', '£49.50m','£90Th.', '£720Th.'], dtype=object)
rem(data_array)
>['£67.50m', '£63000000', '£49.50m', '£90000', '£720000']
How would I take into account that before the m I'll also have digits from 0-9?
I have tried this on my larger dataframe, but I get the following error:
TypeError: argument of type 'float' is not iterable
I'm assuming this is because the function does not take into account .50m, .20m, and so on?
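A note on this error: it more likely comes from the membership test itself than from the decimals. The expression "m" in i raises this TypeError whenever i is a float, and a missing value (NaN) in a pandas object column is a float. The sketch below reproduces it with a plain list, assuming the larger dataframe contains such missing values; the helper name is_text is hypothetical.

```python
# NaN stands in for a missing cell in the dataframe (assumption).
values = ['£67.50m', float('nan'), '£90Th.']

def is_text(v):
    """Return True only for real strings, so `in` tests are safe."""
    return isinstance(v, str)

error_message = None
try:
    for v in values:
        "m" in v  # raises TypeError on the float entry
except TypeError as exc:
    error_message = str(exc)

# Skipping non-strings avoids the error.
safe = [v for v in values if is_text(v)]
print(error_message)
print(safe)
```

If that is the cause, filtering or filling the missing values before calling rem should make the error go away.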
Using @Ptit Xav's suggestion:
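import re

def rem(x):
    data = []
    for i in x:
        if "m" in i:
            # Strip everything but digits: '£67.50m' -> '6750'.
            # The 10000 factor assumes exactly two decimal digits.
            xi = re.sub(r"[^\d]", "", i)
            data.append(int(xi) * 10000)
        elif "Th" in i:
            hi = re.sub(r"[^\d]", "", i)
            data.append(int(hi) * 1000)
    return data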
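The 10000 multiplier is only right when the value has exactly two decimal digits, because stripping the dot turns 67.50 into 6750. A small check, assuming inputs such as '£67.5m' could appear in the data (the function name digits_times_10000 is made up for illustration):

```python
import re

def digits_times_10000(s):
    # Same approach as above: drop every non-digit, then scale.
    return int(re.sub(r"[^\d]", "", s)) * 10000

print(digits_times_10000('£67.50m'))  # 6750 * 10000 = 67500000
print(digits_times_10000('£67.5m'))   # 675 * 10000 = 6750000, ten times too small
```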
You can use the substitution function sub from the re package:
import numpy as np
import re

def rem(x):
    data = []
    for i in x:
        if "m" in i:
            # Replaces the decimal part and the "m" suffix, e.g. ".50m" -> "000000"
            data.append(re.sub(r"(\.\d+m)", '000000', i))
        elif "Th" in i:
            data.append(i.replace("Th.", '000'))
    return data
I replaced this code:
data.append(i.replace(".00m", '000000'))
With:
data.append(i.split(".")[0] + "000000")
The output is:
>['£67000000', '£63000000', '£49000000', '£90000', '£720000']
With conversion:
if "m" in i:
    # Keep digits and the decimal point, then scale to units
    xi = re.sub(r"[^\d.]", "", i)
    data.append("{}{:.0f}".format(i[0], float(xi) * 1000000))
elif "Th" in i:
    hi = re.sub(r"[^\d.]", "", i)
    data.append("{}{:.0f}".format(i[0], float(hi) * 1000))
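The fragment above can be wrapped into a complete function. The sketch below is one possible variant, not the answer's exact code: it returns plain integers instead of strings with the currency sign, uses decimal.Decimal to sidestep float rounding, and returns None for anything it cannot parse (including non-string values). The names MULTIPLIERS, PATTERN and to_int are made up for illustration.

```python
import re
from decimal import Decimal

# Suffix -> scale factor; only the two suffixes seen in the question.
MULTIPLIERS = {"m": 1_000_000, "Th.": 1_000}

# Optional non-digit prefix (currency sign), a number with optional
# decimals, then one of the known suffixes.
PATTERN = re.compile(r"^\D*(\d+(?:\.\d+)?)(m|Th\.)$")

def to_int(value):
    """Convert e.g. '£67.50m' to 67500000; return None if it does not match."""
    if not isinstance(value, str):
        return None
    match = PATTERN.match(value)
    if match is None:
        return None
    number, suffix = match.groups()
    return int(Decimal(number) * MULTIPLIERS[suffix])

data = ['£67.50m', '£63.00m', '£49.50m', '£90Th.', '£720Th.', float('nan')]
print([to_int(v) for v in data])
# [67500000, 63000000, 49500000, 90000, 720000, None]
```

Because the number is parsed rather than patched with zeros, values with one decimal digit such as '£67.5m' are scaled correctly too.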
I think you can make it a little more robust by replacing the if "m" in i: and elif "Th" in i: tests with regular expressions.
import re
import warnings
import numpy as np

RE_ENDS_M = re.compile(r'\.(\d{2})m$')
RE_ENDS_TH = re.compile(r'Th\.$')

def rem(x):
    data = []
    for i in x:
        if RE_ENDS_M.search(i):
            # Keep the two decimal digits and pad with four zeros
            data.append(re.sub(RE_ENDS_M, r"\g<1>0000", i))
        elif RE_ENDS_TH.search(i):
            data.append(re.sub(RE_ENDS_TH, '000', i))
        else:
            warnings.warn("Ignoring data: %s" % i)
    return data

data_array = np.array(
    ['£67.50m', '£63.00m', '£49.50m', '£90Th.', '£720Th.', '1€50'],
    dtype=object
)
print(rem(data_array))
# Outputs:
# UserWarning: Ignoring data: 1€50
#   warnings.warn("Ignoring data: %s" % i)
# ['£67500000', '£63000000', '£49500000', '£90000', '£720000']