How to get certain text from an HTML tag in Python?

Question

I'm building a Python MD5 decryptor on top of an API, but the API returns its result wrapped in HTML. How do I get the text inside the <font color=green> tag?

{"error":0,"msg":"<font color=blue><b>Live</b></font><font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3"}

Answer 1

I suggest using an HTML parser such as Beautiful Soup:

>>> from bs4 import BeautifulSoup
>>> d = {"error":0,"msg":"<font color=blue><b>Live</b></font><font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3"}
>>> soup = BeautifulSoup(d['msg'], 'html.parser')
>>> soup.font.attrs
{'color': 'blue'}

soup.font returns the first font tag, and its attrs is a dict mapping each attribute name to its value.

Update

To get the text "Jumpman#23", look up the font tag by its color attribute:

>>> soup.find_all("font", {"color": "green"})[0].contents[0]
'Jumpman#23'
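
If the API response arrives as a raw JSON string, the same lookup can be wrapped in a small helper. This is only a sketch: the function name extract_green_text and the choice to return None when no green tag is present are assumptions for illustration, not part of the original answer.

```python
import json

from bs4 import BeautifulSoup


def extract_green_text(raw_response):
    """Return the text of the first <font color=green> tag, or None.

    Hypothetical helper: assumes raw_response is a JSON string with a
    "msg" field holding the HTML fragment, as in the question.
    """
    payload = json.loads(raw_response)
    soup = BeautifulSoup(payload.get("msg", ""), "html.parser")
    # find() returns None when no matching tag exists
    tag = soup.find("font", attrs={"color": "green"})
    return tag.get_text(strip=True) if tag is not None else None


raw = '{"error":0,"msg":"<font color=blue><b>Live</b></font><font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3"}'
print(extract_green_text(raw))
```

Checking for None before reading the text avoids an IndexError or AttributeError when the API returns a message without a green font tag.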

Answer 2

If you know the opening tag will always be exactly <font color=green>, then you can use simple string operations:

msg = "<font color=blue><b>Live</b></font><font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3"
start_pattern = "<font color=green>"
stop_pattern = "<"
# Position just past the opening tag
start_index = msg.find(start_pattern) + len(start_pattern)
# Position of the next "<", searching from start_index onward
stop_index = msg.find(stop_pattern, start_index)
print(msg[start_index:stop_index])
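
Note that str.find returns -1 when the pattern is missing, so the string approach silently produces a wrong slice if the response does not contain the tag. A minimal sketch of a guarded version follows; the function name text_after_tag and its None return value are assumptions for illustration.

```python
def text_after_tag(msg, start_pattern="<font color=green>", stop_pattern="<"):
    """Return the text between start_pattern and the next stop_pattern.

    Hypothetical helper: returns None if start_pattern is not found.
    """
    start = msg.find(start_pattern)
    if start == -1:
        return None  # opening tag not present
    start += len(start_pattern)
    stop = msg.find(stop_pattern, start)
    if stop == -1:
        return msg[start:]  # no closing "<", take the rest of the string
    return msg[start:stop]


msg = "<font color=blue><b>Live</b></font><font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3"
print(text_after_tag(msg))
```

This still depends on the tag being written exactly as <font color=green>; any change in spacing or quoting would need the parser-based approach instead.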

Answer 3

You could use bs4 with the CSS adjacent sibling combinator (font + font) to select the font tag that immediately follows another font tag:

from bs4 import BeautifulSoup as bs
s = {"error":0,"msg":"<font color=blue><b>Live</b></font><font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3"}
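soup = bs(s['msg'], 'lxml')  # requires the lxml package to be installed
# 'font + font' matches a <font> that directly follows a sibling <font>
data = soup.select_one('font + font').text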
print(data)
