How to scrape movie information from the IMDB website?

Question

I am new to Python and am trying to scrape IMDB. I am scraping the list of the top 250 IMDB movies and want to get information from each movie's own page, for example the length of each movie.

I already have a list of the unique URLs. I want to loop over this list and, for every URL in it, retrieve the 'length' of that movie. Is it possible to do this in a single piece of code?

for URL in urlofmovie:
    htmlsource = requests.get(URL)
    tree_url = html.fromstring(htmlsource)
    lengthofmovie = tree_url.xpath('//*[@class="subtext"]')

I expect lengthofmovie to become a list of the lengths of all the movies. However, it already goes wrong at line 2: the htmlsource.

Answer 1

To make it a list you should first create a list and then append each length to that list.

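import requests
from lxml import html  # assumption: html.fromstring here is lxml's

length_list = []
for URL in urlofmovie:
    htmlsource = requests.get(URL)
    # html.fromstring needs the page body, not the Response object
    tree_url = html.fromstring(htmlsource.content)
    length_list.append(tree_url.xpath('//*[@class="subtext"]'))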

Small tip: since you are new to Python, I would suggest going over the PEP 8 conventions. Good variable naming can make your (and other developers') life easier (urlofmovie -> urls_of_movies).

"However, it already goes wrong at line 2: the htmlsource."

Please provide the exception you are receiving.
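
Without the exception this is only a guess, but one likely cause of the failure in the question's loop is that requests.get returns a Response object, while html.fromstring expects the page source as a string or bytes. The sketch below assumes that html is lxml.html and that the pages still contain an element with class "subtext", as in the question's XPath. The names extract_subtext, collect_subtexts and sample_page are hypothetical and exist only for illustration.

```python
from lxml import html  # assumption: `html` in the snippets above is lxml.html


def extract_subtext(page_source):
    """Return the whitespace-normalised text of each class="subtext" element."""
    tree = html.fromstring(page_source)
    nodes = tree.xpath('//*[@class="subtext"]')
    return [" ".join(node.text_content().split()) for node in nodes]


def collect_subtexts(urls_of_movies):
    """Fetch each URL and collect the subtext strings (needs network access)."""
    import requests  # imported here so extract_subtext works without it

    subtexts = []
    for url in urls_of_movies:
        response = requests.get(url)
        response.raise_for_status()  # stop early on HTTP errors
        # Pass the page body, not the Response object, to the parser.
        subtexts.extend(extract_subtext(response.content))
    return subtexts


# Hypothetical page fragment, only to show the parsing step offline.
sample_page = '<html><body><div class="subtext"> R | 2h 22min | Drama </div></body></html>'
print(extract_subtext(sample_page))
```

Calling collect_subtexts(urlofmovie) would then return one flat list of strings rather than a list of element lists. Pulling only the running time out of each string is left open, since the exact markup of the live pages is not shown here.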

solution1
2 2019-05-13 11:14:40