
Python: extract 3 integers from a string

import requests
from bs4 import BeautifulSoup
URL = "https://www.worldometers.info/coronavirus/"
r = requests.get(URL)
soup = BeautifulSoup(r.content, 'html5lib')
countHTML = soup.find('div', attrs = {'class':'content-inner'})

for countVar in countHTML.findAll('div', attrs = {'class':'maincounter-number'}):
    count = countVar.span

Right now the variable count returns:

<span style="color:#aaa">270,069</span>
<span>11,271</span>
<span>90,603</span>

I need help extracting 3 separate integers from this string. I have tried count[0], but this is not an array, so it does not work.

String1 = "270,069"
String2 = "11,271"
String3 = "90,603"

Then convert them into 3 integers by removing the commas:

Int1 = 270069
Int2 = 11271
Int3 = 90603

Perhaps a regex will help?

Edit:

I currently have the numbers stored as one value in a list, numbers, such as

numbers = """
270069
11271
90603"""

So if I do numbers[0], all 3 integers show up as 1 value. How do I strip the newlines and turn them into a list or array with 3 separate values?

Yep, a simple regex should work.

import re

s = '''<span style="color:#aaa">270,069</span>
<span>11,271</span>
<span>90,603</span>'''

num_strs = re.findall('[0-9,]+', s)

numbers = [int(ns.replace(',', '')) for ns in num_strs]

# Extract to variables
num1, num2, num3 = numbers
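
One caveat with the pattern above: '[0-9,]+' also matches a lone comma in surrounding text, and int('') then raises ValueError. Below is a minimal sketch of a stricter variant, assuming the numbers use commas only as thousands separators. The names NUMBER_RE and extract_ints are hypothetical, chosen for illustration.

```python
import re

# Hypothetical helper: one or more digits, optionally followed by
# comma-separated groups of three digits ("270,069" or "90603").
NUMBER_RE = re.compile(r'\d+(?:,\d{3})*')

def extract_ints(text):
    """Return every comma-grouped number found in text as an int."""
    return [int(m.replace(',', '')) for m in NUMBER_RE.findall(text)]

s = '''<span style="color:#aaa">270,069</span>
<span>11,271</span>
<span>90,603</span>'''

print(extract_ints(s))  # [270069, 11271, 90603]
```

Because the pattern must start with a digit, stray punctuation is skipped. Digits inside HTML attributes would still match, so extracting the text of each tag first is the safer route.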

You could use:

my_numbers = []
for countVar in countHTML.findAll('div', attrs = {'class':'maincounter-number'}):
    my_numbers.append(int(countVar.span.text.strip().replace(',', '')))

print(my_numbers)

Output:

[270104, 11272, 90603]

You could use the split method as follows:

intAsString = '123\n1234\n12345'
listOfInts = intAsString.split('\n')

Here, listOfInts would be ['123', '1234', '12345'], a list of strings; apply int to each item to get integers.

In Python, \n is the newline character, so splitting on it should give you the three numbers.
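
To finish the conversion asked about in the edit, here is a minimal sketch. It assumes the multi-line string from the question, which starts with a newline: split('\n') would produce an empty first item there, and int('') raises ValueError. Calling split() with no argument avoids that. The name values is an assumption for illustration.

```python
numbers = """
270069
11271
90603"""

# split() with no argument splits on runs of whitespace and ignores
# leading/trailing whitespace, so no empty strings are produced.
values = [int(item) for item in numbers.split()]

print(values)     # [270069, 11271, 90603]
print(values[0])  # 270069
```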
