
How do I strip off "Results for " in string "Results for 27th July 2019" using bs4?

I need to strip off the "Results for " text so that I can later convert the rest to a specific date format.

The problem is:

When I run the code without .strip, I get:

'Results for 27th July 2019'

When I try to strip off the text, I get this error:

TypeError: a bytes-like object is required, not 'str'

Python 3:

date = res.parent.find("span", {"class": "standard-headline"}).text.encode('utf8').strip("Results for ")
TypeError: a bytes-like object is required, not 'str'

Is there a workaround? I've been looking into regex, but it doesn't seem to solve my problem when there is no separator present.

Best regards

The error occurs because encode('utf8') returns bytes. You need to call decode('utf-8') on the result, which returns a str that you can strip.
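A minimal sketch of what is going on, using a hard-coded string in place of the scraped `.text` value (the variable names here are illustrative, not taken from the question's code):

```python
# Stand-in for the value returned by .text in the question.
text = "Results for 27th July 2019"

# encode() turns the str into a bytes object.
encoded = text.encode("utf8")
print(type(encoded))

# Calling bytes.strip() with a str argument raises the TypeError from the question.
error_message = None
try:
    encoded.strip("Results for ")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# decode() turns the bytes back into a str, which accepts str arguments again.
decoded = encoded.decode("utf-8")
print(type(decoded))
```

Since the round trip through bytes gives back the same string, leaving out the encode call altogether should behave the same way here.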

After encode('utf-8') you get a bytes object, so strip also expects a bytes object as its argument (to be more exact, a set of byte values to remove, not a substring). You can use either

text.encode('utf-8').decode().strip("Results for ")

or

text.encode('utf-8').strip(b"Results for ")

Bear in mind that strip is not the best choice for removing a particular piece of text from the head of a string: it treats its argument as a set of characters, so it also strips any R's, e's, s's, whitespace and so on from the tail.
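To illustrate the pitfall, here is a small sketch. The second headline and the `remove_prefix` helper are made up for the example; on Python 3.9+ the built-in `str.removeprefix` does the same job as the helper.

```python
def remove_prefix(text, prefix):
    """Return text without the leading prefix, or unchanged if it is absent."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


headline = "Results for 27th July 2019"
# Happens to work: neither "2" nor "9" is in the set of stripped characters.
lucky = headline.strip("Results for ")
print(lucky)

other = "Results for 1st October"  # hypothetical headline
# The trailing "e" and "r" are in the character set, so they are removed too.
damaged = other.strip("Results for ")
print(damaged)

# Removing only the exact leading text leaves the tail intact.
intact = remove_prefix(other, "Results for ")
print(intact)
```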

I think the replace method is what you need. Just replace "Results for " with an empty string.
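A possible sketch of that approach, followed by the date conversion mentioned in the question. The ISO output format is an assumption, since the question does not say which format is wanted, and `%B` expects English month names only under an English or default locale:

```python
import re
from datetime import datetime

# Stand-in for the scraped .text value from the question.
headline = "Results for 27th July 2019"

# Replace the leading text with an empty string (at most one occurrence).
date_text = headline.replace("Results for ", "", 1)

# Drop the ordinal suffix (st, nd, rd, th) that follows the day number,
# because strptime has no directive for it.
cleaned = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", date_text)

# Parse day, full month name, and four-digit year, then reformat.
parsed = datetime.strptime(cleaned, "%d %B %Y")
formatted = parsed.strftime("%Y-%m-%d")  # assumed target format
print(formatted)
```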
