
Trying to make a loop with Selenium in Python

I have some code that searches this site --> https://osu.ppy.sh/beatmapsets?m=0 to pick out only maps of the difficulty I want, but I can't get the loop right.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from time import sleep

# Set link and path
driver = webdriver.Chrome(executable_path=r"C:\Users\Gabri\anaconda3\chromedriver.exe")
driver.get("https://osu.ppy.sh/beatmapsets?m=0")
wait = WebDriverWait(driver, 20)

# Variables, lists and counters
lista = {}
links, difficulty, maps2, final = [], [], [], []
line, column, = 1, 1
link_test = ''

n = int(input('insert how many maps do you want: '))
c = 1

# Open link in Chrome and search map by map
while True:
    if c > n:
        break
    sleep(1)
    wait.until(EC.element_to_be_clickable(
        (By.CSS_SELECTOR, f".beatmapsets__items-row:nth-of-type(1)>.beatmapsets__item:nth-of-type(1)")))
    games = driver.find_element_by_css_selector(
        f".beatmapsets__items-row:nth-of-type({line}) .beatmapsets__item:nth-of-type({column}) .beatmapset-panel__info-row--extra")
    actions = ActionChains(driver)
    actions.move_to_element(games).perform()
    wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".beatmaps-popup__group")))
    scores = driver.find_elements_by_css_selector(
        ".beatmaps-popup__group .beatmaps-popup-item__col.beatmaps-popup-item__col--difficulty")

    # This is the part I can't automate: for example, to show 6 maps I would have to add 2 more ifs,
    # changing the variables (line) and (column) accordingly

    # I'd like a loop with 'while' or 'for ... in', but I don't know how to write it
    # I tried asking 'how many maps do you want?' before the code starts, and using that number as the number of times the code runs
    # But it didn't work =(

    if c % 2 != 0:
        column = 2
        if c % 2 == 0:
            line += 1
    else:
        line += 1
        column = 1

    # Convert string to float (difficulty numbers)
    for score in scores:
        a = score.text
        b = a.replace(',', '.')
        difficulty.append(float(b))

    # Save in list 'links' the link of the map currently being printed
    games.click()
    sleep(3)
    link_test = driver.current_url
    links.append(link_test)
    link_test = ''
    driver.back()

    # Dict with map, link and difficulty
    lista = {
        'map': f"{c}",
        'link': f"{links}",
        'difficulty': f"{difficulty}"}
    c += 1
    # Print each map in dict 'lista'
    print(f"Map: {lista['map']}\nLink: {links}\nDifficulty: {lista['difficulty']}\n")

    # This part is my filter: if a map has difficulty 6.00 or more, it is added to the list 'final' for download
    for b in difficulty:
        if b >= 6.00:
            # Here, printing the link raised 'TypeError: unhashable type: 'list''; this is the workaround I found
            # I know it's not the best way to solve this error, but at least I tried =,)
            xam = str(links[0])
            xam1 = xam.replace("'", '')
            xam2 = xam1.replace("[", '')
            xam3 = xam2.replace("]", '')
            final.append(xam3)

    # Clear all lists so there are no duplicate items in dict 'lista' when the next map is selected
    difficulty.clear()
    lista.clear()
    links.clear()

# Print how many maps with difficulty 6.00 or more have been found
print(f'There are {len(sorted(set(final)))} maps to download')

# This question is for a future download feature; I'm still coding that part, so you can ignore this =3
pergunta = input('Do you want to download them? \n[ Y ]\n[ N ]\n>>> ').lower().strip()

# Clean duplicate links and show all links already filtered
if pergunta == 'y':
    for x in final:
        maps2.append(x)
    print(sorted(set(maps2)))

In the 'if' part I need help making it automatic, in a way that doesn't need a pile of 'if's like I wrote. Maybe using a variable with 'v += n'? Idk ;-;

PS - If you find any logic errors or a way to optimize my code, I'll be happy to learn and fix it.

You're doing more work than you have to. When you visit the page in a browser and log your network traffic, every time you scroll down to load more beatmaps you'll see XHR (XMLHttpRequest) HTTP GET requests being made to a REST API, the response of which is JSON and contains all the beatmap information you could want. All you need to do is imitate that HTTP GET request - no Selenium required:

def get_beatmaps():
    import requests

    url = "https://osu.ppy.sh/beatmapsets/search"

    params = {
        "m": "0",
        "cursor[approved_date]": "0",
        "cursor[_id]": "0"
    }

    while True:
        response = requests.get(url, params=params)
        response.raise_for_status()

        data = response.json()

        cursor_id = data["cursor"]["_id"]
        if cursor_id == params["cursor[_id]"]:
            break
        
        yield from data["beatmapsets"]
        params["cursor[approved_date]"] = data["cursor"]["approved_date"]
        params["cursor[_id]"] = cursor_id


def main():
    from itertools import islice

    num_beatmaps = 10 # Get info for first ten beatmaps

    beatmaps = list(islice(get_beatmaps(), num_beatmaps))

    for beatmap in beatmaps:
        print("{} - {}".format(beatmap["artist"], beatmap["title"]))
        for version in beatmap["beatmaps"]:
            print("    [{}]: {}".format(version["version"], version["difficulty_rating"]))
        print()

    return 0


if __name__ == "__main__":
    import sys
    sys.exit(main())

Output:

Aitsuki Nakuru - Monochrome Butterfly
    [Gibune's Insane]: 4.55
    [Excitement]: 5.89
    [Collab Extra]: 5.5
    [Hard]: 3.54
    [Normal]: 2.38

Sweet Trip - Chocolate Matter
    [drops an end to all this disorder]: 4.15
    [spoken & serafeim's hard]: 3.12

Aso Natsuko - More-more LOVERS!!
    [SS!]: 5.75
    [Sonnyc's Expert]: 5.56
    [milr_'s Hard]: 3.56
    [Dailycare's Insane]: 4.82

Takayan - Jinrui Mina Menhera
    [Affection]: 4.43
    [Normal]: 2.22
    [Narrative's Hard]: 3.28

Asaka - Seize The Day (TV Size)
    [Beautiful Scenery]: 3.7
    [Kantan]: 1.44
    [Seren's Oni]: 3.16
    [XK's Futsuu]: 2.01
    [ILOVEMARISA's Muzukashii]: 2.71
    [Xavy's Seize The Moment]: 4.06

Swimy - Acchi Muite (TV Size)
    [Look That Way]: 4.91
    [Azu's Cup]: 1.72
    [Platter]: 2.88
    [Salad]: 2.16
    [Sya's Rain]: 4.03

Nakazawa Minori (CV: Hanazawa Kana) - Minori no Zokkon Mirai Yohou (TV Size)
    [Expert]: 5.49
    [Normal]: 2.34
    [Suou's Hard]: 3.23
    [Suou's Insane]: 4.38
    [Another]: 4.56

JIN - Children Record (Re:boot)
    [Collab Hard]: 3.89
    [Maki's Normal]: 2.6
    [hypercyte & Seto's Insane]: 5.01
    [Kagerou]: 6.16

Coalamode. - Nemophila (TV Size)
    [The Hidden Dungeon Only I Can Enter]: 3.85
    [Silent's Hard]: 3
    [Normal]: 2.29

MISATO - Necro Fantasia
    [Lunatic]: 6.06

>>>

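As currently written, this example fetches the first ten beatmaps from the API and prints the artist and title, plus the name and difficulty of each version of each beatmap. You can change it as needed and filter the output by difficulty.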
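As a rough sketch of that filtering step, assuming each beatmapset is a dict shaped like the ones printed above (with "artist", "title" and a "beatmaps" list whose entries carry "difficulty_rating"). The helper names hardest_rating and filter_by_difficulty are made up for illustration, and the sample data is hand-written from the output above rather than fetched from the API:

```python
def hardest_rating(beatmapset):
    """Return the highest difficulty_rating among a set's versions (0.0 if none)."""
    return max(
        (version["difficulty_rating"] for version in beatmapset["beatmaps"]),
        default=0.0,
    )


def filter_by_difficulty(beatmapsets, minimum=6.0):
    """Keep only the sets that have at least one version rated >= minimum."""
    return [s for s in beatmapsets if hardest_rating(s) >= minimum]


# Hand-written sample (assumption: same shape as the API entries used above).
sample = [
    {"artist": "Sweet Trip", "title": "Chocolate Matter",
     "beatmaps": [{"version": "drops an end to all this disorder", "difficulty_rating": 4.15},
                  {"version": "spoken & serafeim's hard", "difficulty_rating": 3.12}]},
    {"artist": "JIN", "title": "Children Record (Re:boot)",
     "beatmaps": [{"version": "Collab Hard", "difficulty_rating": 3.89},
                  {"version": "Kagerou", "difficulty_rating": 6.16}]},
    {"artist": "MISATO", "title": "Necro Fantasia",
     "beatmaps": [{"version": "Lunatic", "difficulty_rating": 6.06}]},
]

for beatmapset in filter_by_difficulty(sample):
    print("{} - {}".format(beatmapset["artist"], beatmapset["title"]))
```

In the real script you would pass the beatmapsets produced by get_beatmaps() instead of the sample list; since that function is a generator, the list comprehension would consume it, so wrap it in islice first if you only want a bounded number of sets.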

That being said, I don't know anything about OSU or beatmaps. If you could describe what the final output should look like, I can tailor my solution.

After a lot of testing, I solved all my problems (for now, hehe). Just add

    if c % 2 != 0:
        column = 2
        if c % 2 == 0:
            line += 1
    else:
        line += 1
        column = 1
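For what it's worth, the same stepping can probably be computed straight from the counter instead of being updated with ifs. A minimal sketch, assuming the page lays items out two per row as the selectors in the question imply (grid_position is a made-up helper name, and columns=2 is an assumption about the layout):

```python
def grid_position(c, columns=2):
    """Map a 1-based item counter to 1-based (line, column) grid coordinates."""
    row, col = divmod(c - 1, columns)
    return row + 1, col + 1


# Walk the first six maps: (1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)
for c in range(1, 7):
    line, column = grid_position(c)
    print(c, line, column)
```

Inside the Selenium loop, line and column would be set from grid_position(c) at the top of each iteration, so no extra ifs are needed however many maps are requested.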

I'm very grateful to everyone who helped me =)))
