
How to print a particular value from HTML content?

Below is the HTML content. I want only the values that appear in it:

    <div class="list-group-item">
     <div class="row">
      <div class="col" style="min-width: 0;">
       <h2 class="h5 mt-0 text-truncate">
        <a class="text-warning" href="www.example.com">
         Ram
        </a>
       </h2>
       <p class="mob-9 text-truncate">
        <small>
         <i class="fa fa-fw fa-mobile-alt">
         </i>
         Contact:
        </small>
        010101010
       </p>
       <p class="mb-2 text-truncate">
        <small>
         <i class="fa fa-fw fa-map-marker-alt">
         </i>
         Location:
        </small>
        5th lane, kamathipura, Kamathipura
       </p>
        </a>
       </p>
      </div>
     </div>
    </div>

My code is:

import pandas as pd
import requests
from bs4 import BeautifulSoup as soup
url = requests.get("www.example.com")
page_soup = soup(url.content, 'html.parser')
name = shop.findAll("div", {"class": "list-group-item"})
print(name.h2.text)
number = shop.findAll("p", {"class": "fa fa-fw fa-map-marker-alt"})
print(?)
location = shop.findAll("p", {"class": "fa fa-fw fa-map-marker-alt"})
print(?)

I need the following output, using Python:

'Ram', '010101010', '5th lane, kamathipura, Kamathipura'

Have you tried location.get_text()?

You can read more about it here.
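A minimal sketch of that suggestion, with one caveat: findAll (find_all) returns a list of tags, so get_text() has to be called on each element rather than on the result as a whole. The inline html string is a trimmed copy of the markup from the question and is only an assumption about what the real page looks like.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the HTML from the question (assumed to match the live page).
html = """
<div class="list-group-item">
  <h2 class="h5 mt-0 text-truncate">
    <a class="text-warning" href="www.example.com">Ram</a>
  </h2>
  <p class="mob-9 text-truncate">
    <small><i class="fa fa-fw fa-mobile-alt"></i> Contact:</small>
    010101010
  </p>
  <p class="mb-2 text-truncate">
    <small><i class="fa fa-fw fa-map-marker-alt"></i> Location:</small>
    5th lane, kamathipura, Kamathipura
  </p>
</div>
"""

page_soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet, so get_text() is called per element.
paragraphs = page_soup.find_all("p", class_="text-truncate")

# strip=True trims each text fragment; " " joins the fragments with one space.
texts = [p.get_text(" ", strip=True) for p in paragraphs]
print(texts)
```

Note that get_text() returns all of the text inside each paragraph, so the "Contact:" and "Location:" labels are included alongside the values.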

Using the tags and class identifiers, you can grab all contents within the regions you want. Then, with content indices, you should be able to select the exact content you want, like this:

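from bs4 import BeautifulSoup

# Path to a local file that contains the HTML shown in the question
path = 'myhtml.html'
with open(path) as fp:
    soup = BeautifulSoup(fp, 'html.parser')

# The name is the first text node inside the <a> tag
name = [soup.find('a').contents[0].strip()]
# In each <p>: contents[0] is whitespace, contents[1] is <small>, contents[2] is the value.
# strip() trims the surrounding whitespace but keeps the spaces inside the address.
details = [p.contents[2].strip() for p in soup.find_all('p', 'text-truncate')]
print(', '.join(repr(value) for value in name + details))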
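Indexing into contents depends on the exact whitespace in the markup. As a hedged alternative, the sketch below locates each value through its icon class. In the question's HTML the classes fa-mobile-alt and fa-map-marker-alt sit on the i tags, not on the p tags, which is why searching for a p with those classes finds nothing. The names extract_fields, value_after_icon, and SAMPLE_HTML are hypothetical, and the sample markup is a trimmed copy of the question's HTML.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the HTML from the question (assumed to match the live page).
SAMPLE_HTML = """
<div class="list-group-item">
  <h2 class="h5 mt-0 text-truncate">
    <a class="text-warning" href="www.example.com">
      Ram
    </a>
  </h2>
  <p class="mob-9 text-truncate">
    <small><i class="fa fa-fw fa-mobile-alt"></i> Contact:</small>
    010101010
  </p>
  <p class="mb-2 text-truncate">
    <small><i class="fa fa-fw fa-map-marker-alt"></i> Location:</small>
    5th lane, kamathipura, Kamathipura
  </p>
</div>
"""


def extract_fields(html):
    """Return (name, number, location) from one list-group-item block."""
    soup = BeautifulSoup(html, "html.parser")
    item = soup.find("div", class_="list-group-item")

    # The name is the link text inside the <h2>.
    name = item.h2.a.get_text(strip=True)

    def value_after_icon(icon_class):
        # Find the icon, step up to its <small> label, then read the
        # text node that directly follows that label.
        icon = item.find("i", class_=icon_class)
        label = icon.find_parent("small")
        return label.next_sibling.strip()

    number = value_after_icon("fa-mobile-alt")
    location = value_after_icon("fa-map-marker-alt")
    return name, number, location


fields = extract_fields(SAMPLE_HTML)
print(", ".join(repr(field) for field in fields))
```

To run this against a live page, the response body from requests.get(...) can be passed to extract_fields in place of SAMPLE_HTML. The URL given to requests needs a scheme such as https://, which the question's "www.example.com" lacks.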


 