Web scraping question: how to display data from 2 different sites in one HTML file

Question

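I am having trouble displaying data from 2 different sites in the same HTML file (in a single table).

I have tried many things and searched for a solution, but nothing has helped. Feel free to also link any Python/BeautifulSoup/web-scraping tutorials for my future "problems" :D

Thanks in advance.

Code: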

import pandas as pd
import requests
from bs4 import BeautifulSoup

odpowiedz = requests.get(
    "https://www.nike.com/pl/w?q=react%20270&vst=react%20270")
soup = BeautifulSoup(odpowiedz.text, 'html.parser')

items = soup.find_all(
    class_='product-card css-1pclthi ncss-col-sm-6 ncss-col-lg-4 va-sm-t product-grid__card')

title = [item.find(class_='product-card__title').get_text()
         for item in items]
price = [item.find(class_='product-card__price').get_text()
         for item in items]
linki = [item.find(class_='product-card__link-overlay').attrs['href']
         for item in items]

odpowiedz = requests.get(
    "https://www.nike.com/pl/w/air-max-97-buty-77f38zy7ok")
soup = BeautifulSoup(odpowiedz.text, 'html.parser')

items = soup.find_all(
    class_='product-card css-1pclthi ncss-col-sm-6 ncss-col-lg-4 va-sm-t product-grid__card')

title = [item.find(class_='product-card__title').get_text()
         for item in items]
price = [item.find(class_='product-card__price').get_text()
         for item in items]
linki = [item.find(class_='product-card__link-overlay').attrs['href']
         for item in items]

wynik = pd.DataFrame(
    {
        'Model': title,
        'Cena': price,
        'Link': linki,
    })

print(wynik)
wynik.to_html('official.html')

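The program outputs the id, product name, price and link (shoes, in this case) from the first website (Nike React). I want to add the data from the second site (Nike Air Max 97) to the table, below the first results (Nike React).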

Answer 1

There is surely a better way to do this, but here is a quick band-aid solution:

import pandas as pd
import requests
from bs4 import BeautifulSoup


title = []
price = []
linki = []

odpowiedz = requests.get(
    "https://www.nike.com/pl/w?q=react%20270&vst=react%20270")
soup = BeautifulSoup(odpowiedz.text, 'html.parser')

items = soup.find_all(
    class_='product-card css-1pclthi ncss-col-sm-6 ncss-col-lg-4 va-sm-t product-grid__card')

title.append([item.find(class_='product-card__title').get_text()
         for item in items])
price.append([item.find(class_='product-card__price').get_text()
         for item in items])
linki.append([item.find(class_='product-card__link-overlay').attrs['href']
         for item in items])

odpowiedz = requests.get(
    "https://www.nike.com/pl/w/air-max-97-buty-77f38zy7ok")
soup = BeautifulSoup(odpowiedz.text, 'html.parser')

items = soup.find_all(
    class_='product-card css-1pclthi ncss-col-sm-6 ncss-col-lg-4 va-sm-t product-grid__card')

title.append([item.find(class_='product-card__title').get_text()
         for item in items])
price.append([item.find(class_='product-card__price').get_text()
         for item in items])
linki.append([item.find(class_='product-card__link-overlay').attrs['href']
         for item in items])

flat_titles = [titles for sublisttitle in title for titles in sublisttitle]
flat_prices = [prices for sublistprice in price for prices in sublistprice]
flat_links = [links for sublistlinks in linki for links in sublistlinks]

wynik = pd.DataFrame(
    {
        'Model': flat_titles,
        'Cena': flat_prices,
        'Link': flat_links,
    })

print(wynik)
wynik.to_html('official.html')
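
One possible tidier variant, offered only as a sketch: put the per-page parsing in a function, loop over the URLs, and collect rows with `list.extend` so no flattening step is needed. The names `URLS`, `parse_products`, `scrape` and the `fetch` parameter are hypothetical and not part of the original code. Matching cards on the single `.product-card` class instead of the full class string is also an assumption about the page markup, and may need adjusting if the site changes or renders its products with JavaScript.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

# The two pages from the question.
URLS = [
    "https://www.nike.com/pl/w?q=react%20270&vst=react%20270",
    "https://www.nike.com/pl/w/air-max-97-buty-77f38zy7ok",
]


def parse_products(html):
    """Return one dict per complete product card found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # Assumption: every product card carries the class "product-card".
    for card in soup.select(".product-card"):
        title = card.select_one(".product-card__title")
        price = card.select_one(".product-card__price")
        link = card.select_one(".product-card__link-overlay")
        if not (title and price and link):
            continue  # skip incomplete cards instead of raising
        rows.append({
            "Model": title.get_text(strip=True),
            "Cena": price.get_text(strip=True),
            "Link": link.get("href"),
        })
    return rows


def scrape(urls, fetch=lambda url: requests.get(url, timeout=10).text):
    """Fetch every URL and combine all product rows into one DataFrame."""
    rows = []
    for url in urls:
        # extend() adds the rows themselves, so the list stays flat.
        rows.extend(parse_products(fetch(url)))
    return pd.DataFrame(rows, columns=["Model", "Cena", "Link"])
```

With this in place, `scrape(URLS).to_html('official.html')` would write a single table containing the rows from both pages, in the order the URLs are listed. Passing a different `fetch` function makes it possible to try the parsing on saved HTML without touching the network.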

Answer 1 · 0 · accepted 2020-03-15 07:17:19