
Web scraping with Selenium: my loop always gets only the first element

I'm trying to scrape this website: http://scrumquiz.org/#/scrum-master-practice-test

In the end I want to get all the questions, their answers, and the correct answers. Here is my code, which takes me to the end of the quiz, where all the questions, answers, and correct answers are available:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException, TimeoutException
from webdriver_manager.chrome import ChromeDriverManager
#import time
import json
import pandas as pd

driver = webdriver.Chrome('C:/Users/Ihnhn/Documents/WebScrap/Selenium/chromedriver.exe')
driver.get("http://scrumquiz.org/#/scrum-master-practice-test") # open the page
driver.maximize_window() # maximize the browser window
# start the quiz
start_quizz = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"button[class='btn btn-primary btn-quiz-start']"))).click()
driver.execute_script("window.scrollTo(0,400);") # scroll down to the bottom of the question

for i in range(40):
    next_button = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"button[class='btn btn-primary']"))).click()

complete_quizz = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH,"//button[contains(.,'Complete quiz')]"))).click()

Then, with the code below, I try to click on each question, collect the information I need, and go back to the quiz results page, but it always goes back to the first question, so I get the same question 40 times. all_questions is a list, so why do I always get the first element?

(For now I am only trying to get the question title.)

all_questions = driver.find_elements_by_xpath("//div[@class='quiz-answer wrong-answer']")
driver.execute_script("window.scrollTo(0,3000);") # scroll down to the bottom of the results page

for q in all_questions:
    click_question = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH,"//div[@class='quiz-answer wrong-answer']"))).click()
    #time.sleep(2)
    driver.execute_script("window.scrollTo(0,400);") # scroll down to the bottom of the question
    nom_question = driver.find_element_by_xpath("//div[contains(@class,'question-title text-center')]/h3").text
    print(nom_question)
    #time.sleep(2)
    back_question = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH,"//button[contains(.,'Back to the results')]"))).click()

I've tried replacing "driver" with "q" in the element-to-be-clickable waits, but that doesn't work.

Well, this might be a case of an XY problem.

I know it's not what you asked for, but anyway:

By looking at the network requests the browser sends after loading the page (F12 -> Network), you can see one for /scrum-master.json.


So, by requesting scrum-master.json directly you can download the file and parse it (for example, with Python's json module).

It seems the resolution field contains the indices of the correct answer(s): always the top option or, for a multiple-choice question, the top two.

The order of the answers is probably shuffled by the client-side JS.
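
As a rough sketch of the approach described above, the file could be fetched and read along these lines. Several things here are assumptions rather than facts taken from the site: the full URL is guessed from the /scrum-master.json request path, and the "question" and "answers" key names, as well as the top-level list layout, are hypothetical. Only the resolution field is mentioned above, so inspect the real file and adjust the names before relying on this.

```python
import json
from urllib.request import urlopen

# Assumption: full URL guessed from the "/scrum-master.json" request path.
QUIZ_URL = "http://scrumquiz.org/scrum-master.json"


def download_quiz(url=QUIZ_URL):
    """Fetch the raw JSON text of the quiz file."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def extract_correct_answers(raw_json):
    """Return a list of (question, answers, correct_answers) tuples.

    Assumed layout (hypothetical, apart from "resolution"): a JSON list of
    objects with "question", "answers" and "resolution" keys, where
    "resolution" holds the index or indices of the correct answers.
    """
    results = []
    for item in json.loads(raw_json):
        answers = item["answers"]
        resolution = item["resolution"]
        # Accept a single index as well as a list of indices.
        indices = resolution if isinstance(resolution, list) else [resolution]
        correct = [answers[i] for i in indices]
        results.append((item["question"], answers, correct))
    return results
```

If those assumptions hold, extract_correct_answers(download_quiz()) would give every question with its answers and the correct ones in a single request, with no need to click through the quiz in a browser.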
