
How can I get Python to print the output of this program in an HTML file?

The program below scrapes the text of a web page. How could I get print(output) to display in an HTML file so it could be loaded in a web browser?

import requests
from bs4 import BeautifulSoup

url = ''
res = requests.get(url)
html_page = res.content
soup = BeautifulSoup(html_page, 'html.parser')
text = soup.find_all(text=True)

output = ''
blacklist = [
    '[document]',
    'noscript',
    'header',
    'html',
    'meta',
    'head',
]

for t in text:
    if t.parent.name not in blacklist:
        output += '{} '.format(t)

print(output)
  • Assuming you want to display whatever is stored in the output variable as preformatted text, use the HTML <pre> element.
  • You can write the text to an HTML file between <pre> and </pre>.
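with open('display.html', 'w', encoding='utf-8') as f:
    f.write("<pre>\n")
    # Copy the scraped text into the file line by line
    for line in output.split("\n"):
        f.write(line)
        f.write("\n")
    # The closing tag uses a forward slash, not a backslash
    f.write("</pre>")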
  • This generates a file called display.html that shows all the text in the output variable when opened in a web browser.
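
Scraped text can contain characters such as <, > and &, which a browser would treat as markup if they were written into the file unchanged. A minimal sketch of one way to handle that, assuming the text should be shown literally: escape it with the standard library's html.escape and wrap it in a small complete page. The function name write_text_as_html, the page title and the sample string are hypothetical and only serve as illustration; they are not part of the original answer.

```python
import html
from pathlib import Path


def write_text_as_html(text, path="display.html"):
    """Write plain text to an HTML file inside a <pre> element."""
    # Escape &, < and > (and quotes) so the text is displayed literally
    # instead of being interpreted as HTML by the browser.
    escaped = html.escape(text)
    page = (
        "<!DOCTYPE html>\n"
        "<html>\n"
        '<head><meta charset="utf-8"><title>Scraped text</title></head>\n'
        "<body>\n"
        f"<pre>{escaped}</pre>\n"
        "</body>\n"
        "</html>\n"
    )
    Path(path).write_text(page, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Stand-in for the `output` string built by the scraper.
    output = "Tom & Jerry <b>not bold</b>\nsecond line"
    write_text_as_html(output)
```

In the scraper itself, the call would be write_text_as_html(output) in place of print(output). If the page should also open automatically, the standard library's webbrowser.open can be given the file's URI, for example Path("display.html").resolve().as_uri().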
