Problem deploying a script to AWS Lambda

Question

I am trying to run a script that uses Selenium, specifically webdriver, and I have run into a problem.

driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')

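The problem is that the function needs geckodriver in order to run. Geckodriver is included in the zip file I uploaded to AWS, but I don't know how to make the function access it on AWS. Locally this is not a problem, because the file is in my directory and everything works.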

When I run the function through Serverless, I get the following error message:

{"errorMessage": "Message: 'geckodriver' executable needs to be in PATH. \n", "errorType": "WebDriverException", "stackTrace": [["/var/task/handler.py", 66, "main", "print(TatamiClearanceScrape())"], ["/var/task/handler.py", 28, "TatamiClearanceScrape", "driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')"], ["/var/task/selenium/webdriver/firefox/webdriver.py", 164, "__init__", "self.service.start()"], ["/var/task/selenium/webdriver/common/service.py", 83, "start", "os.path.basename(self.path), self.start_error_message)"]]}

Error --------------------------------------------------

Invoked function failed

Any help would be appreciated.

Edit:

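import datetime
import time

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.firefox.options import Options


def TatamiClearanceScrape():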
    options = Options()
    options.add_argument('--headless')

    page_link = 'https://www.tatamifightwear.com/collections/clearance'
    # this is the url that we've already determined is safe and legal to scrape from.
    page_response = requests.get(page_link, timeout=5)
    # here, we fetch the content from the url, using the requests library
    page_content = BeautifulSoup(page_response.content, "html.parser")

    driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')
    driver.get('https://www.tatamifightwear.com/collections/clearance')

    labtnx = driver.find_element_by_css_selector('a.btn.close')
    labtnx.click()
    time.sleep(10)
    labtn = driver.find_element_by_css_selector('div.padding')
    labtn.click()
    time.sleep(5)
    # wait(driver, 50).until(lambda x: len(driver.find_elements_by_css_selector("div.detailscontainer")) > 30)
    html = driver.page_source
    page_content = BeautifulSoup(html, "html.parser")
    # we use the html parser to parse the url content and store it in a variable.
    textContent = []

    tags = page_content.findAll("a", class_="product-title")

    product_title = page_content.findAll(attrs={'class': "product-title"})  # allocates all product titles from site

    old_price = page_content.findAll(attrs={'class': "old-price"})

    new_price = page_content.findAll(attrs={'class': "special-price"})

    products = []
    for i in range(len(product_title) - 2):
        # group each product into a dictionary with its name, old price, and new price
        product = {"Product Name": product_title[i].get_text(strip=True),
                   "Old Price:": old_price[i].get_text(strip=True),
                   "New Price": new_price[i].get_text(), 'date': str(datetime.datetime.now())
                   }
        products.append(product)



    return products

Answer 1

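You may want to look into AWS Lambda Layers for this. With layers, your function can use libraries without bundling them into its deployment package. Layers also save you from re-uploading the dependencies on every code change: you create one additional layer that holds all the required packages.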

Read more details about AWS Lambda Layers here.
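
Whichever way the binary is shipped, the relative path 'numpy-test/geckodriver' depends on the current working directory, which may differ on Lambda. The following is a minimal sketch, not a definitive fix, of resolving an absolute path instead. It relies on two things Lambda documents: the deployment package is unpacked under the directory named by the LAMBDA_TASK_ROOT environment variable (/var/task in the stack trace above), and layer contents are unpacked under /opt. The helper name `find_geckodriver` and the two /opt locations are assumptions for illustration; the actual layout depends on how the layer zip is built.

```python
import os


def find_geckodriver(candidates=None):
    """Return the first candidate that is an existing, executable file, else None."""
    if candidates is None:
        # Deployment package root on Lambda; fall back to the current
        # directory so the same code also works locally.
        task_root = os.environ.get("LAMBDA_TASK_ROOT", os.getcwd())
        candidates = [
            # Location used in the question, made absolute.
            os.path.join(task_root, "numpy-test", "geckodriver"),
            # Hypothetical locations inside a layer (layers unpack under /opt).
            "/opt/geckodriver",
            "/opt/bin/geckodriver",
        ]
    for path in candidates:
        # The file must exist and carry the executable bit.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None
```

The result could then be passed as `executable_path=find_geckodriver()` in the `webdriver.Firefox(...)` call from the question. A `None` result means the binary is either missing from every candidate location or not marked executable, and both cases are worth checking in the uploaded zip.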

Solution 1
0 2019-01-21 10:19:04