Scraping reviews from multiple pages

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from lxml import html
from selenium.webdriver.chrome.options import Options
import pandas as pd
import time

options = Options()
Data = pd.DataFrame([],columns=['Name', 'Reviews'])
driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
url='https://www.daraz.pk/products/metro-70cc-motorcycle-mr70-tez-raftar-red-black-motorbike-lahore-gujranwala-sialkot-islamabad-karachi-faisalabad-only-i134510726-s1294898802.html?spm=a2a0e.searchlist.list.1.39e6276a19MyPa&search=1'
driver.get(url)
for j in range(5):
    count=1
    for i in range(5):
        product_reviews = driver.find_element_by_xpath('//*[@id="module_product_review"]/div/div[3]/div[1]/div['+str(count)+']/div[3]/div[1]').text.replace('\n','')
        name = driver.find_element_by_xpath('//*[@id="module_product_review"]/div/div[3]/div[1]/div['+str(count)+']/div[2]/span[1]').text[3:]
        Data.loc[count-1] = [name,product_reviews]
        count+=1
    click_next = driver.find_element_by_xpath('//*[@id="module_product_qna"]/div[2]/div[2]/div[2]/div/button[2]')
    driver.execute_script("arguments[0].click();", click_next)

This code did not scrape reviews from multiple pages.

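There is a better way to append the lists to the end of the columns, but I forgot the right way, so I just kept your for loop. Your XPath for the next button was wrong: it pointed into the module_product_qna block, while the button that pages through the reviews is located higher up on the page. Also, count should be initialized outside the outer for loop; otherwise it is reset to 1 on every page and each page overwrites the rows stored for the previous one.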
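To make the overwrite problem concrete, here is a minimal sketch that needs no browser. The helper name fill_rows, its flag, and the two made-up pages are assumptions for illustration only; the one thing taken from the post is the `Data.loc[count-1] = ...` pattern.

```python
import pandas as pd

def fill_rows(pages, reset_count_each_page):
    """Store rows with the Data.loc[count-1] pattern used in the post."""
    data = pd.DataFrame([], columns=['Name', 'Reviews'])
    count = 1
    for page in pages:
        if reset_count_each_page:
            count = 1  # the bug: every page starts writing at row 0 again
        for name, review in page:
            data.loc[count - 1] = [name, review]
            count += 1
    return data

# Two placeholder pages with two reviews each (not real data from the site)
pages = [[('A', 'r1'), ('B', 'r2')], [('C', 'r3'), ('D', 'r4')]]
buggy = fill_rows(pages, reset_count_each_page=True)
fixed = fill_rows(pages, reset_count_each_page=False)
print(len(buggy), len(fixed))  # 2 4
```

With the reset inside the outer loop, only the last page survives; with a single running counter, every page adds new rows.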

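from selenium.common.exceptions import NoSuchElementException

# Note: the find_element(s)_by_* helpers are Selenium 3 API; Selenium 4 uses find_element(By.XPATH, ...) instead.
Data = pd.DataFrame([],columns=['Name', 'Reviews'])
url='https://www.daraz.pk/products/metro-70cc-motorcycle-mr70-tez-raftar-red-black-motorbike-lahore-gujranwala-sialkot-islamabad-karachi-faisalabad-only-i134510726-s1294898802.html?spm=a2a0e.searchlist.list.1.39e6276a19MyPa&search=1'
driver.get(url)
count=1  # running row number, initialized once so later pages do not overwrite earlier rows
for j in range(5):
    # Reviewer names on the current page (the first three characters of the text are dropped)
    names = [x.text[3:] for x in driver.find_elements_by_xpath("//*[@class='item']/div[@class='middle']/span[1]")]
    # Review texts on the current page, with line breaks removed
    product_reviews = [x.text.replace('\n','') for x in driver.find_elements_by_xpath("//*[@class='item']/div[@class='item-content']/div[@class='content']")]
    # Store only the rows that were actually found, in case a page has fewer than five reviews
    for i in range(min(len(names), len(product_reviews))):
        Data.loc[count-1] = [names[i],product_reviews[i]]
        count+=1

    try:
        # Next-page button of the review pagination
        click_next = driver.find_element_by_css_selector('button.next-btn.next-btn-normal.next-btn-medium.next-pagination-item.next')
        driver.execute_script("arguments[0].click();", click_next)
        time.sleep(1)  # give the next page of reviews time to load
    except NoSuchElementException:
        break  # no next button found: stop paging
print(Data)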

Output:

                 Name                                            Reviews
0          Shabbir A.  with the positive approach of management of Me...
1               W***.  Very good motorcycle. Great experience of buyi...
2   sohailnazir607607  Bike condition was 100% okay. Delivery was goo...
3            Bilal S.  I think Metro company is giving us a chance to...
4           Farooq A.  Product quality is Good. Before time delivery....
5               W***.  دراز سے میٹرو موٹرسائیکل کا یہ سپیشل ماڈل خرید...
6               a***.  Dam Hai bike mein Metro motorcycle ki kya baat...
7               a***.  My new metro bike, i have purchased from daraz...
8            adnan B.  I like metro motorcycle beacuse i have already...
9            Imran S.  Satisfied from seller, Next day he deliver ho ...
10              3***5  I proud to have this bike.Thanks metro Thanks ...
11              3***6                    engine sound is good. surprised
12              3***6                       rush deliveryquality product
13              3***6                          Thanks Daraz Thanks metro
14            Mian S.  Thx metro and draz for providing good service ...
15            afaq A.                              good dilevery on time
16              3***6                                   dilevery on time
17              3***6                                   dilevery on time
18            afaq A.                                      Good service 
19           Hamza S.  i  am 100% satisfied with my metro motorcycle ...
20           Abdul M.                 Quality product, reasonable price 
21               azar                               excellence Services 
22            Mian S.  Very good service and very nice bike thanks me...
23           Imran A.  experience was good to purchase high rating br...
24         Murtaza M.  Mashallah! Delivered in 10/10 condition. Much ...
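As for the "better way" to append mentioned in the answer, one common option (a sketch, not necessarily what the answerer had in mind) is to collect the rows in a plain list and build the DataFrame once at the end. The function name reviews_to_frame and the sample pages are hypothetical; in the scraper, each page's names and product_reviews lists would take their place.

```python
import pandas as pd

def reviews_to_frame(pages):
    """Build the DataFrame once from per-page (names, reviews) lists."""
    rows = []
    for names, reviews in pages:
        # zip stops at the shorter list, so a short page cannot raise IndexError
        rows.extend(zip(names, reviews))
    return pd.DataFrame(rows, columns=['Name', 'Reviews'])

# Placeholder data standing in for two scraped pages
pages = [(['A', 'B'], ['r1', 'r2']), (['C'], ['r3'])]
Data = reviews_to_frame(pages)
print(Data)
```

This removes the need for a manual count variable, and it avoids growing the DataFrame one row at a time, which gets slow as the number of rows increases.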

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact yoyou2525@163.com.

 