Beautiful Soup 4 parse line

How would I parse the following line with Beautiful Soup 4 (BS4) and Python 3? I would like to extract "BH" and "Bahrain".

<li><input class="checkboxSelect2" name="countries[]" type="checkbox" value="BH"/> Bahrain</li>

I can get "Bahrain", but I can't get "BH":

for l in allCountries.findAll("li"):
  print(l.value)
  print(l.text)

l.text returns Bahrain, but l.value does not return "BH": Beautiful Soup treats l.value as a lookup for a child <value> tag, so it yields None instead of the attribute.

I figured out how to access the attributes. I first needed to get the <input> child tag:

l.input

Then I needed to read the attribute from that tag:

l.input.attrs['value']
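Here is a minimal sketch of that access pattern on the single line from the question. It assumes the bs4 package is installed and uses Python's built-in html.parser; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

html = '<li><input class="checkboxSelect2" name="countries[]" type="checkbox" value="BH"/> Bahrain</li>'
soup = BeautifulSoup(html, "html.parser")
li = soup.li

# Attribute-style access searches for a child tag by name,
# and there is no <value> tag inside the <li>.
missing = li.value

# The value attribute lives on the <input> child tag.
code = li.input["value"]  # shorthand for li.input.attrs["value"]

# get_text(strip=True) drops the leading space before the country name.
name = li.get_text(strip=True)

print(missing, code, name)
```

If the attribute might be absent, li.input.get("value") returns None instead of raising KeyError.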

My final code for extracting all country names and country abbreviations:

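for l in allCountries.findAll("li"):
  try:
    # strip() removes the leading space that precedes the country name
    print("Country: {0} ABR: {1}".format(l.text.strip(), l.input.attrs['value']))
  except (AttributeError, KeyError):
    # Skip <li> items with no <input> child or no value attribute
    pass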
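For anyone who wants to try this end to end, here is a self-contained sketch that checks for the <input> tag explicitly instead of catching exceptions. Only the Bahrain line comes from the question; the rest of the sample markup and the helper name extract_countries are made up for illustration, and html.parser is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Illustrative markup: only the Bahrain entry is from the question.
html = """
<ul>
  <li><input name="countries[]" type="checkbox" value="BH"/> Bahrain</li>
  <li><input name="countries[]" type="checkbox" value="QA"/> Qatar</li>
  <li>Select all</li>
</ul>
"""

def extract_countries(markup):
    """Hypothetical helper: map abbreviation -> name for each <li> with a checkbox value."""
    soup = BeautifulSoup(markup, "html.parser")
    countries = {}
    for li in soup.find_all("li"):
        box = li.find("input")
        if box is None or not box.has_attr("value"):
            continue  # no checkbox value here, so nothing to extract
        countries[box["value"]] = li.get_text(strip=True)
    return countries

countries = extract_countries(html)
print(countries)
```

Returning a dict rather than printing keeps the result reusable, and the explicit None check means unrelated errors are not silently swallowed.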
