
I can't get the value of an HTML tag using BeautifulSoup in Python

There is a website I'm trying to scrape, and the values in its input fields don't come through as text; they only appear inside the HTML, like this:

<input class="aspNetDisabled" disabled="disabled" id="ContentPlaceHolder1_EmpName" name="ctl00$ContentPlaceHolder1$EmpName" style="color:#003366;background-color:#CCCCCC;font-weight:bold;height:27px;width:150px;" type="text" value="John Doe"/>

All I want is the value (John Doe). I tried using .text, but it doesn't return it. This is the code:

soup = BeautifulSoup(r.content, 'lxml')
for name in soup.findAll('input', {'name': 'ctl00$ContentPlaceHolder1$EmpName'}):
    with io.open('x.txt', 'w', encoding="utf-8") as f:
        f.write(name.prettify())

You get no result from .text because "John Doe" is not part of the tag's text; it is the value of an HTML attribute: value="John Doe".

You can access a tag's attributes the way you would access a Python dictionary (dict), using tag[<attribute>]. (See the BeautifulSoup documentation on attributes.)

from bs4 import BeautifulSoup

html = """<input class="aspNetDisabled" disabled="disabled" id="ContentPlaceHolder1_EmpName" name="ctl00$ContentPlaceHolder1$EmpName" style="color:#003366;background-color:#CCCCCC;font-weight:bold;height:27px;width:150px;" type="text" value="John Doe"/>"""

soup = BeautifulSoup(html, "lxml")
# find_all is the current name for findAll; both return every matching tag
for name in soup.find_all("input", {"name": "ctl00$ContentPlaceHolder1$EmpName"}):
    # Index the tag like a dict to read the attribute
    print(name["value"])

Output:

John Doe
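As a small sketch of the difference between the tag's text and its attributes: the snippet below uses a shortened version of the input tag and the built-in "html.parser" (both are assumptions made to keep the example self-contained). It also shows tag.get(), which returns None instead of raising KeyError when an attribute is absent; the "placeholder" attribute is only an illustrative name that this tag does not have.

```python
from bs4 import BeautifulSoup

# Shortened version of the tag from the question (assumption: extra attributes dropped)
html = '<input id="ContentPlaceHolder1_EmpName" type="text" value="John Doe"/>'
soup = BeautifulSoup(html, "html.parser")
tag = soup.find("input")

text = tag.get_text()                  # empty string: <input> has no text content
value = tag.get("value")               # same result as tag["value"] when the attribute exists
missing = tag.get("placeholder")       # None rather than a KeyError
fallback = tag.get("placeholder", "")  # a default can be supplied, as with dict.get

print(repr(text))
print(value)
print(missing)
```

tag["placeholder"] would raise KeyError here, so .get() may be the safer choice when scraping pages whose markup is not guaranteed.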

While MendelG's answer works great, it can be a bit cleaner without the for loop (if you only want to extract one element):

>>> soup.find('input')['value']
John Doe

Code:

from bs4 import BeautifulSoup

string = '''
<input class="aspNetDisabled" disabled="disabled" id="ContentPlaceHolder1_EmpName" name="ctl00$ContentPlaceHolder1$EmpName" style="color:#003366;background-color:#CCCCCC;font-weight:bold;height:27px;width:150px;" type="text" value="John Doe"/>
'''

soup = BeautifulSoup(string, 'html.parser')

john_come_here = soup.find('input')['value']
print(john_come_here)

Output:

John Doe
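One caveat, sketched below under assumptions: soup.find('input') returns the first input on the page, and it returns None when nothing matches, so on a real form it may be safer to filter by the name attribute and guard against a missing tag. The helper input_value and the second field ("ctl00$ContentPlaceHolder1$Other") are hypothetical and added only for illustration.

```python
from bs4 import BeautifulSoup


def input_value(soup, field_name):
    """Return the value attribute of the <input> named field_name, or None."""
    # attrs= is needed because find()'s own first parameter is called "name"
    tag = soup.find("input", attrs={"name": field_name})
    if tag is None:
        return None
    return tag.get("value")


# Two inputs: the first one is a made-up field, the second is the one from the question
html = '''
<input name="ctl00$ContentPlaceHolder1$Other" type="text" value="something else"/>
<input name="ctl00$ContentPlaceHolder1$EmpName" type="text" value="John Doe"/>
'''
soup = BeautifulSoup(html, "html.parser")

first = soup.find("input")["value"]  # value of whichever input comes first
by_name = input_value(soup, "ctl00$ContentPlaceHolder1$EmpName")
absent = input_value(soup, "no_such_field")

print(first)
print(by_name)
print(absent)
```

Here the unfiltered lookup picks up the other field, while the lookup by name returns John Doe and the lookup for a field that does not exist returns None instead of raising an error.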

