How to get all possible combinations of checkboxes from a webpage with Selenium

I am trying to scrape this website with Selenium to build a dataset. What I want to achieve is to try every possible combination of the "pain is", "pain located in", etc. checkboxes and then save the results (the possible causes) in a dataframe (CSV file). I've figured out how to select the checkboxes, but I have no idea how to try all combinations of those checkboxes automatically.
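The set of combinations can be worked out before touching the browser. A minimal sketch, assuming two checkbox groups whose option counts are already known: build the non-empty subsets of each group with itertools.combinations and pair them with itertools.product. The helper names non_empty_subsets and all_pairs are illustrative only and do not come from the page or from Selenium.

```python
import itertools


def non_empty_subsets(n):
    """Return every non-empty subset of the 0-based indices 0..n-1 as tuples."""
    indices = range(n)
    return [
        subset
        for size in range(1, n + 1)
        for subset in itertools.combinations(indices, size)
    ]


def all_pairs(first_count, second_count):
    """Pair every non-empty subset of group one with every subset of group two."""
    return list(
        itertools.product(
            non_empty_subsets(first_count),
            non_empty_subsets(second_count),
        )
    )


# Example with made-up group sizes: 3 options in the first group, 2 in the second.
pairs = all_pairs(3, 2)
print(len(pairs))  # (2**3 - 1) * (2**2 - 1) = 21
```

The number of pairs grows as (2^m - 1) * (2^n - 1) for groups of m and n options, so a full sweep over groups with many options can take a long time.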

I found this answer, but I don't know how to write it in Python because I'm not familiar with Java.

Any kind of help will be much appreciated. Thank you.

from selenium import webdriver
from selenium.webdriver.common.by import By
import itertools
import time


def create_subset(t):
    """Return every non-empty combination of the option numbers 1..t."""
    r = []
    for L in range(1, t + 1):
        r.extend(itertools.combinations(range(1, t + 1), L))
    return r


def get_options(driver, group):
    """Return the <li> elements of the checkbox group at index `group`."""
    groups = driver.find_elements(By.CLASS_NAME, "frm_options")
    return groups[group].find_elements(By.TAG_NAME, "li")


def toggle(driver, group, numbers):
    """Click the given 1-based options; clicking a checked option unchecks it."""
    for number in numbers:
        # Look the elements up again on every click to avoid stale references.
        get_options(driver, group)[number - 1].click()
        time.sleep(0.05)


driver = webdriver.Chrome()
driver.get("https://www.mayoclinic.org/symptom-checker/abdominal-pain-in-adults-adult/related-factors/itt-20009075")
pain_count = len(get_options(driver, 0))
pain_located_count = len(get_options(driver, 1))

pain_checkbox = create_subset(pain_count)
location_checkbox = create_subset(pain_located_count)

for pain_option in pain_checkbox:
    # Check this combination of "pain is" options.
    toggle(driver, 0, pain_option)
    for pain_location in location_checkbox:
        # Check this combination of "pain located in" options.
        toggle(driver, 1, pain_location)

        # this is just for visual presentation:
        # uncheck the same options so the next combination starts clean
        toggle(driver, 1, pain_location)
    # this is just for visual presentation:
    # uncheck the "pain is" options before moving to the next combination
    toggle(driver, 0, pain_option)
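For the saving step, one option is to collect one row per combination and write the rows out with the standard csv module. This is only a sketch of the bookkeeping: the column names, the "|" separator, and the write_results helper are assumptions, and the values in the example are placeholders rather than data from the site. How the possible causes are read from the results page is not shown, since that depends on the page's markup.

```python
import csv
import io


def write_results(rows, out):
    """Write one CSV row per combination.

    `rows` yields (pain, location, causes) tuples, each a sequence of strings.
    Multiple values in one cell are joined with "|" (an arbitrary choice).
    """
    writer = csv.writer(out)
    writer.writerow(["pain_is", "pain_located_in", "possible_causes"])
    for pain, location, causes in rows:
        writer.writerow(["|".join(pain), "|".join(location), "|".join(causes)])


# Placeholder values only; real rows would come from the scraping loop.
example_rows = [
    (("option_a", "option_b"), ("area_1",), ("cause_x", "cause_y")),
]
buffer = io.StringIO()
write_results(example_rows, buffer)
print(buffer.getvalue())
```

Passing an open file object, for example one from open("results.csv", "w", newline=""), instead of the StringIO buffer writes the same rows to disk, and the file can then be loaded with pandas.read_csv if a dataframe is needed.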
