
How to save playing audio with Selenium Python

I'm developing a captcha solver using IBM Watson, and all is well. I just need to save the playing audio to a file, which can then be transcribed using Watson. I don't know how to go about that, and I didn't find anything here. If possible, I don't want complicated requests etc.; I just want to save the playing audio to a file. Alternatively, I could download the audio, but I tried using chrome_options to set the download location and it just didn't work. Any help will be really appreciated.

My code:

import os
import time
import random
import requests
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from multiprocessing import Process
import ibm_watson
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson.websocket import RecognizeCallback, AudioSource

chrome_options = Options()
chrome_options.add_argument("--mute-audio")
chrome_options.add_argument ("download.default_directory=/home/valentino/")
driver = webdriver.Chrome(options=chrome_options)

apikeywatson = 'C2f79A8ENbeUmWw-1DwTMd_v4IgCdCjqKpx21PsRaKan'
urlwatson = 'https://api.eu-de.speech-to-text.watson.cloud.ibm.com/instances/9a22253e-7fc5-4c67-b85b-5ad54db8282d'
authibm = IAMAuthenticator(apikeywatson)
stt = SpeechToTextV1(authenticator=authibm)
stt.set_service_url(urlwatson)

driver.get('https://client-demo.arkoselabs.com/github')
time.sleep(4)
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[src^='https://client-api.arkoselabs.com/fc/gc/']")))
time.sleep(2)
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "span[class='fc_meta_audio_btn']"))).click()
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, "audio_play"))).click()

I believe I have been through a similar situation. If your file downloads successfully but lands in the default directory rather than your desired directory, here is how you can get around the issue.

  1. Don't use a relative path; try using an absolute path:

    # base_dir is a placeholder for the absolute prefix left empty in the original snippet
    chrome_options.add_argument(f"download.default_directory={base_dir}/home/valentino/")

  2. That will probably not work, so try replacing the forward slashes with backslashes:

    # base_dir is a placeholder for the absolute prefix left empty in the original snippet
    chrome_options.add_argument(f"download.default_directory={base_dir}\\home\\valentino\\")

  3. If that works for you, you are good to go. It didn't work for me, though. I had to adopt an ugly workaround: manually moving the file from the download folder to my desired folder. You can use something like this:

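    import os
    from shutil import move

    # Verify this path, as it varies from OS to OS
    default_file_download_path = 'C:\\Users\\UserName\\Downloads\\'
    destination_path = 'home\\valentino\\'

    # Pick the first file whose name contains the captcha audio prefix
    # (raises IndexError if no matching file has been downloaded yet)
    downloaded_file_name = [x for x in os.listdir(default_file_download_path)
                            if "audio_verification_challenge" in x][0]

    # os.path.join avoids depending on trailing separators in the paths
    move(os.path.join(default_file_download_path, downloaded_file_name),
         os.path.join(destination_path, downloaded_file_name))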

Yes, the last option probably looks very ugly, but it was the only way I could make it work for my use case.
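As a side note on steps 1 and 2: Chrome normally reads the download folder from the `download.default_directory` preference, which Selenium passes through `add_experimental_option("prefs", ...)`, rather than from a command-line argument. Below is a minimal sketch under that assumption. `build_download_prefs` and `make_driver` are hypothetical helper names, and `download.prompt_for_download` is an extra preference not mentioned above. I have not checked whether the captcha page triggers a real browser download at all, so treat this as something to try rather than a guaranteed fix.

```python
import os


def build_download_prefs(download_dir):
    """Return Chrome preferences that send downloads to an absolute directory."""
    return {
        # Chrome expects an absolute path here, so normalise whatever was passed in
        "download.default_directory": os.path.abspath(download_dir),
        # Assumption: suppress the "Save as" dialog so the download starts unattended
        "download.prompt_for_download": False,
    }


def make_driver(download_dir):
    """Start Chrome with the download preferences applied (hypothetical helper)."""
    # Imported lazily so build_download_prefs can be used without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--mute-audio")
    options.add_experimental_option("prefs", build_download_prefs(download_dir))
    return webdriver.Chrome(options=options)
```

You would then call something like `driver = make_driver("/home/valentino/")` in place of building `chrome_options` by hand.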


UPDATE

If you closely inspect the HTML, it provides an SRC link for every audio file. You can retrieve the file from that SRC with a simple requests call and then save it locally. I believe this is the easiest and fastest way.

driver.get('https://client-demo.arkoselabs.com/github')
time.sleep(4)
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[src^='https://client-api.arkoselabs.com/fc/gc/']")))

# find_element(By.XPATH, ...) replaces find_element_by_xpath, which was removed in Selenium 4
audio_src = driver.find_element(By.XPATH, '//audio[@preload="auto"]').get_attribute('src')
content = requests.get(audio_src).content
# Save the content to a file wherever you want it
with open('your_desired_location\\captcha_file.wav', 'wb') as f:
    f.write(content)
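The download step above can be made a little more defensive. Here is a sketch, assuming the `src` URL can be fetched directly without extra cookies or headers (I have not verified that for this site). `save_audio` and its `session` parameter are hypothetical names. The parameter only exists so that a `requests.Session`, or a stand-in object for testing, can be passed in.

```python
import os


def save_audio(audio_src, dest_path, session=None):
    """Download the audio at audio_src and write it to dest_path."""
    if session is None:
        # Imported lazily; any object with a requests-style get() also works
        import requests
        session = requests

    response = session.get(audio_src, timeout=30)
    # Fail loudly on 4xx/5xx instead of saving an error page as a .wav file
    response.raise_for_status()

    # Create the target folder if it does not exist yet
    os.makedirs(os.path.dirname(os.path.abspath(dest_path)), exist_ok=True)
    with open(dest_path, "wb") as fh:
        fh.write(response.content)
    return dest_path
```

Usage would look like `save_audio(audio_src, '/home/valentino/captcha_file.wav')`, with `audio_src` taken from the `<audio>` element as shown above.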


 