Extracting the right elements by text and span / Beautiful Soup / Python

Question

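I am trying to scrape the following data:

Cuisine: 4.5
Service: 4.0
Quality: 4.5

However, I am having trouble selecting the right elements. I tried the following two checks: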

for bewertungen in soup.find_all('div', {'class' : 'histogramCommon bubbleHistogram wrap'}):

    if bewertungen.find(text='Cuisine'):
        cuisine = bewertungen.find(text='Cuisine')
        cuisine = cuisine.next_element
        print("test " + str(cuisine))

    if bewertungen.find_all(text='Service'):
        for s_bewertung in bewertungen.find_all('span', {'class':'ui_bubble_rating'}):
            s_speicher = s_bewertung['alt']

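With the first if I get no result. With the second if I do get the right elements, but I get all three ratings and cannot tell which one belongs to which label (Cuisine, Service, Quality).

Can someone advise me on how to get the right data? The HTML is below.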

<div class="histogramCommon bubbleHistogram wrap">
   <div class="colTitle">\nGesamtwertung\n</div>
   <ul class="barChart">
      <li>
         <div class="ratingRow wrap">
            <div class="label part ">
               <span class="text">Cuisine</span>
            </div>
            <div class="wrap row part ">
               <span alt="4.5 of five" class="ui_bubble_rating bubble_45"></span>
            </div>
         </div>
         <div class="ratingRow wrap">
            <div class="label part ">
               <span class="text">Service</span>
            </div>
            <div class="wrap row part ">
               <span alt="4.0 of five" class="ui_bubble_rating bubble_40"></span>
            </div>
         </div>
      </li>
      <li>
         <div class="ratingRow wrap">
            <div class="label part ">
               <span class="text">Quality</span>
            </div>
            <div class="wrap row part "><span alt="4.5 of five" class="ui_bubble_rating bubble_45"></span></div>
         </div>
      </li>
   </ul>
</div>

Answer 1

Try this. Based on the snippet you pasted above, the following code should work:

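from bs4 import BeautifulSoup

# `content` is the HTML of the page (for example, the snippet from the question)
soup = BeautifulSoup(content, "lxml")
for item in soup.select(".ratingRow"):
    # The label ("Cuisine", "Service", ...) sits in the span with class "text"
    category = item.select_one(".text").text
    # The rating is the first token of the alt attribute, e.g. "4.5 of five" -> "4.5"
    rating = item.select_one(".row span")['alt'].split(" ")[0]
    print("{} : {}".format(category, rating))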
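To try the selector-based approach without fetching the page, here is a minimal sketch that inlines a trimmed copy of the HTML from the question and collects the ratings into a dictionary. The helper name extract_ratings, the use of the built-in html.parser instead of lxml, and the conversion to float are assumptions for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the HTML from the question, inlined so the sketch runs offline.
content = """
<div class="histogramCommon bubbleHistogram wrap">
  <ul class="barChart">
    <li>
      <div class="ratingRow wrap">
        <div class="label part "><span class="text">Cuisine</span></div>
        <div class="wrap row part "><span alt="4.5 of five" class="ui_bubble_rating bubble_45"></span></div>
      </div>
      <div class="ratingRow wrap">
        <div class="label part "><span class="text">Service</span></div>
        <div class="wrap row part "><span alt="4.0 of five" class="ui_bubble_rating bubble_40"></span></div>
      </div>
    </li>
    <li>
      <div class="ratingRow wrap">
        <div class="label part "><span class="text">Quality</span></div>
        <div class="wrap row part "><span alt="4.5 of five" class="ui_bubble_rating bubble_45"></span></div>
      </div>
    </li>
  </ul>
</div>
"""


def extract_ratings(html):
    """Hypothetical helper: map each label to its numeric rating."""
    soup = BeautifulSoup(html, "html.parser")
    ratings = {}
    # Each .ratingRow holds exactly one label and one rating bubble,
    # so pairing them inside the row keeps label and value together.
    for row in soup.select(".ratingRow"):
        label = row.select_one(".text").get_text(strip=True)
        bubble = row.select_one("span.ui_bubble_rating")
        # alt looks like "4.5 of five"; the first token is the number.
        ratings[label] = float(bubble["alt"].split()[0])
    return ratings


ratings = extract_ratings(content)
print(ratings)
```

Iterating per row is what solves the original problem: the question's second loop searched the whole histogram block, so the labels and the bubbles were never paired.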

Another approach:

for item in soup.select(".ratingRow"):
    category = item.select_one(".text").text
    # Walk from the label span to its parent div, then to the sibling div that holds the rating span
    rating = item.select_one(".text").find_parent().find_next_sibling().select_one("span")['alt'].split(" ")[0]
    print("{} : {}".format(category,rating))

Output:

Cuisine : 4.5
Service : 4.0
Quality : 4.5
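If you would rather start from the label text, as in the question's first attempt, one possible sketch is to find the span whose string equals the label and then jump to the next rating bubble with find_next. The helper name rating_for and the shortened HTML are assumptions for illustration only.

```python
from bs4 import BeautifulSoup

# Shortened version of the question's markup (two rows only), for illustration.
html = """
<div class="ratingRow wrap">
  <div class="label part "><span class="text">Cuisine</span></div>
  <div class="wrap row part "><span alt="4.5 of five" class="ui_bubble_rating bubble_45"></span></div>
</div>
<div class="ratingRow wrap">
  <div class="label part "><span class="text">Service</span></div>
  <div class="wrap row part "><span alt="4.0 of five" class="ui_bubble_rating bubble_40"></span></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def rating_for(soup, label):
    """Hypothetical helper: return the rating string for one label, or None."""
    # Match the <span class="text"> tag whose only text is exactly `label`.
    label_span = soup.find("span", class_="text", string=label)
    if label_span is None:
        return None
    # find_next walks forward in document order to the first rating bubble.
    bubble = label_span.find_next("span", class_="ui_bubble_rating")
    if bubble is None:
        return None
    return bubble["alt"].split()[0]


print(rating_for(soup, "Service"))
print(rating_for(soup, "Ambience"))
```

One caveat with this variant: find_next does not stop at the end of the row, so if a row had a label but no bubble, it would pick up the bubble of the following row. The per-row loop above avoids that.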

Solution 1
0 2018-02-06 20:04:17
