Using Python to scrape a webpage and display the results to .html

Question

I have created a script to display all present vacancies from a website. This works well and will print the list vertically via SSH.

However, I now need to format this output as an unordered list and save it to a .html page.

The script I am using is:

from lxml import html
import requests

page = requests.get('https://www.fasthosts.co.uk/careers/current-vacancies')

tree = html.fromstring(page.content)

Vacancies = tree.xpath('//h1[@class="featuredvacancy__title featuredvacancy__title--invert grid-16 alpha"]/text()')

print(Vacancies)

This will print the output to screen.

However my other script:

import requests
from bs4 import BeautifulSoup

url = 'https://www.fasthosts.co.uk/careers/current-vacancies'
response = requests.get(url)
html = response.content

soup = BeautifulSoup(response.content, 'html.parser')

output = soup.find ('//h1[@class="featuredvacancy__title featuredvacancy__title--invert grid-16 alpha"]/text()')
text, link = output.text, output.get('vacancy.html')

Returns this error:

File "test2.py", line 11, in <module>
    text, link = output.text, output.get('vacancy.html')
AttributeError: 'NoneType' object has no attribute 'text'
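
The traceback is consistent with `soup.find` returning `None`: BeautifulSoup does not evaluate XPath, so the XPath string is treated as a tag name that matches nothing. Likewise, `output.get(...)` reads an attribute of a tag and does not write a file. A minimal sketch of the equivalent lookup with BeautifulSoup's own filters follows, assuming Python 3; `SAMPLE` and `vacancy_titles` are hypothetical names for illustration, and the sample markup only stands in for the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the live page.
SAMPLE = """
<h1 class="featuredvacancy__title featuredvacancy__title--invert grid-16 alpha">Support Engineer</h1>
<h1 class="featuredvacancy__title featuredvacancy__title--invert grid-16 alpha"> Web Developer </h1>
<h1 class="other">Not a vacancy</h1>
"""


def vacancy_titles(markup):
    """Return the text of every vacancy heading found in the markup."""
    soup = BeautifulSoup(markup, "html.parser")
    # class_ matches any one of the element's space-separated classes,
    # so a single class name is enough to select the headings.
    headings = soup.find_all("h1", class_="featuredvacancy__title")
    return [h.get_text(strip=True) for h in headings]


print(vacancy_titles(SAMPLE))
```

For the real page, the same function could be called with `requests.get(url).content` as the markup, provided the headings still carry that class.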

I have now resolved saving the output to a .html file using the following script:

from lxml import html
import requests

page = requests.get('https://www.fasthosts.co.uk/careers/current-vacancies')
content = html.fromstring(page.content)
Vacancies = content.xpath('//h1[@class="featuredvacancy__title featuredvacancy__title--invert grid-16 alpha"]/text()')

f = open('vacancy.html', 'w')
f.write(str(Vacancies))
f.close()

Answer 1

The problem was resolved by saving the output to a .html file, using the following script:

from lxml import html
import requests

page = requests.get('https://www.fasthosts.co.uk/careers/current-vacancies')
content = html.fromstring(page.content)
Vacancies = content.xpath('//h1[@class="featuredvacancy__title featuredvacancy__title--invert grid-16 alpha"]/text()')

f = open('vacancy.html', 'w')
f.write(str(Vacancies))
f.close()

Based on the OP's edit to their post (and likely influenced by the comment from @user3080953).
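
Note that `str(Vacancies)` writes the Python list representation (brackets and quotes included) rather than the unordered list the question asks for. A minimal sketch of one way to produce a `<ul>` is shown below, assuming Python 3; `render_ul` and `save_vacancies` are hypothetical helper names, not part of the original script.

```python
from html import escape


def render_ul(titles):
    """Return an HTML unordered list with one <li> per title."""
    # escape() keeps characters such as & and < from breaking the markup.
    items = "\n".join(
        "  <li>{}</li>".format(escape(title.strip())) for title in titles
    )
    return "<ul>\n{}\n</ul>\n".format(items)


def save_vacancies(titles, path="vacancy.html"):
    """Write the rendered list to path."""
    # The with-block closes the file even if the write fails.
    with open(path, "w", encoding="utf-8") as f:
        f.write(render_ul(titles))
```

With the script above, calling `save_vacancies(Vacancies)` in place of the `open` / `write` / `close` lines would write one list item per vacancy title.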

1 answers

solution1
0 2017-10-27 22:44:40
