Is there any way to access <option> text when parsing forms using lxml.html?

Question

I'm trying to parse an HTML form that looks like this:

<select name="country">
<option value="1">Afghanistan</option>
<option value="2">Albania</option>
<option value="3">Algeria</option>
<option value="4">Andorra</option>
....
</select>

After I parse the document using lxml.html.parse, I can access the list of values using:

doc.forms[0].elements["country"].value_options

However, this returns a list of raw values (['1', '2', '3', '4' ...]) without the corresponding country names. Is there an easy way to get the contents of the option tag including both text and values?

Answer 1

I use XPath to navigate the HTML. Try:

options = doc.xpath("//select[@name='country']/option")
option_text = [option.text for option in options]
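
If you want the values and the text together, as the question asks, the same XPath query can feed a list of pairs. The following is only a minimal sketch: the sample markup is modeled on the question's snippet (the surrounding <form> tag is an assumption), and `option_pairs` is a hypothetical helper name, not part of lxml.

```python
import lxml.html

# Sample markup modeled on the form in the question.
# Wrapping it in a <form> is an assumption made for this sketch.
HTML = """
<form>
<select name="country">
<option value="1">Afghanistan</option>
<option value="2">Albania</option>
<option value="3">Algeria</option>
<option value="4">Andorra</option>
</select>
</form>
"""


def option_pairs(doc, select_name):
    """Return (value, text) tuples for each <option> of the named <select>.

    Hypothetical helper for illustration; not an lxml API.
    """
    # $name is an XPath variable, filled in from the keyword argument.
    options = doc.xpath("//select[@name=$name]/option", name=select_name)
    # option.text can be None for an empty <option>, so fall back to "".
    return [(opt.get("value"), (opt.text or "").strip()) for opt in options]


doc = lxml.html.fromstring(HTML)
pairs = option_pairs(doc, "country")
print(pairs)
```

If a lookup table is more convenient, `dict(pairs)` turns the result into a mapping from value to country name.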

solution1
1 ACCEPTED 2012-08-20 10:50:27
