Scrape dependent filled Drop-down list
Using Excel VBA, I wish to scrape values from two drop-down lists. One is filled with state names and the other with cities.
I can scrape the state names, but when I try to scrape the city names I get nothing. The city list is filled according to the state selected.
How can I list every city in the second drop-down list for each state in the first list?
This code gives me only the state names and the default value of the second list:
Sub ScrapDropDown()
    Const URL As String = "http://idebescola.inep.gov.br/ideb/consulta-publica"
    Dim XMLPage As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument
    XMLPage.Open "GET", URL, False
    XMLPage.send
    HTMLDoc.body.innerHTML = XMLPage.responseText
    Set HTMLDocment = HTMLDoc.getElementById("pkCodEstado")
    For i = 1 To HTMLDocment.Length - 1
        Set HTMLpkCodMunicipio = HTMLDoc.getElementById("pkCodMunicipio")
        For Each HTMLMun In HTMLpkCodMunicipio.getElementsByTagName("option")
            Debug.Print i & "-" & HTMLDocment(i).Value & "-" & HTMLDocment(i).innerText & "-" & HTMLMun.Value & "-" & HTMLMun.innerText
        Next HTMLMun
    Next i
End Sub
Below is the part of the HTML with the drop-down lists I want to scrape (the three dots mark other, unwanted lists that I removed). It shows the page after a state has been selected from the first list on the site; without a selection, id="pkCodMunicipio" has only one option.
<form method="post" name="frm" class="classForm" id="frm">
<label for="pkCodEntidade">Por Código</label>
<div class="divRequired">
</div>
<input name="pkCodEntidade" id="pkCodEntidade" placeholder="Código da Escola" title="Por Código" class="onlynumbers" maxlength="8" tabindex="15" type="text" value="">
<hr>
<label id="lbl">Por área de interesse</label>
<div id="lblDivRequired" class="divRequired" style="display: ;">
</div>
<select name="pkCodEstado" id="pkCodEstado" tabindex="16">
<option value="">UF</option>
<option value="12">ACRE</option>
<option value="27">ALAGOAS</option>
<option value="16">AMAPÁ</option>
<option value="13">AMAZONAS</option>
<option value="29">BAHIA</option>
<option value="23">CEARÁ</option>
<option value="53">DISTRITO FEDERAL</option>
<option value="32">ESPÍRITO SANTO</option>
<option value="52">GOIÁS</option>
<option value="21">MARANHÃO</option>
<option value="51">MATO GROSSO</option>
<option value="50">MATO GROSSO DO SUL</option>
<option value="31">MINAS GERAIS</option>
<option value="15">PARÁ</option>
<option value="25">PARAÍBA</option>
<option value="41">PARANÁ</option>
<option value="26">PERNAMBUCO</option>
<option value="22">PIAUÍ</option>
<option value="33">RIO DE JANEIRO</option>
<option value="24">RIO GRANDE DO NORTE</option>
<option value="43">RIO GRANDE DO SUL</option>
<option value="11">RONDÔNIA</option>
<option value="14">RORAIMA</option>
<option value="42">SANTA CATARINA</option>
<option value="35">SÃO PAULO</option>
<option value="28">SERGIPE</option>
<option value="17">TOCANTINS</option>
</select>
<select name="pkCodMunicipio" id="pkCodMunicipio" tabindex="17">
<option value="">Municípios</option>
<option value="1400050">ALTO ALEGRE</option>
<option value="1400027">AMAJARI</option>
<option value="1400100">BOA VISTA</option>
<option value="1400159">BONFIM</option>
<option value="1400175">CANTA</option>
<option value="1400209">CARACARAI</option>
<option value="1400233">CAROEBE</option>
<option value="1400282">IRACEMA</option>
<option value="1400308">MUCAJAI</option>
<option value="1400407">NORMANDIA</option>
<option value="1400456">PACARAIMA</option>
<option value="1400472">RORAINOPOLIS</option>
<option value="1400506">SAO JOAO DA BALIZA</option>
<option value="1400605">SAO LUIZ</option>
<option value="1400704">UIRAMUTA</option>
</select>
...
<button name="btnSearch" class="btnDefault btn btn-warning" title="Buscar" type="submit" id="btnSearch" onclick="void(0);">Buscar</button>
</div>
<input type="hidden" name="undefined" value="undefined">
</form>
You can use a CSS selector combination. The code below uses an id (#) selector to target the parent select tag element, in descendant combination with an option element selector, to get all the child option tag elements.
Dim nodeList As Object, i As Long
Set nodeList = HTMLDoc.querySelectorAll("#pkCodEstado option")
For i = 0 To nodeList.Length - 1
    Debug.Print nodeList.item(i).innerText
Next
i is already declared at the top, so you don't actually need to declare it again. You should use Option Explicit at the top of all modules and thus declare all your variables. You have a number of undeclared variables in your code.
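As a rough illustration of the advice above, here is a minimal sketch of how the procedure might look with Option Explicit and every variable declared, using the same selector pattern for both lists. The procedure name ListDropDownOptions and the variable names are made up for this example, and it assumes references to Microsoft XML, v6.0 and Microsoft HTML Object Library are set, as the early-bound types in the question imply. Note that it only reads the HTML the server returns for the GET request; no page scripts run, so the second list will contain only the options present in that static response.

```vba
Option Explicit

' Assumes references (Tools > References) to:
'   Microsoft XML, v6.0 and Microsoft HTML Object Library
' ListDropDownOptions is a hypothetical name used for this sketch.
Public Sub ListDropDownOptions()
    Const URL As String = "http://idebescola.inep.gov.br/ideb/consulta-publica"
    Dim xmlPage As MSXML2.XMLHTTP60
    Dim htmlDoc As MSHTML.HTMLDocument
    Dim states As Object, cities As Object
    Dim i As Long

    Set xmlPage = New MSXML2.XMLHTTP60
    Set htmlDoc = New MSHTML.HTMLDocument

    ' Synchronous GET; only the static HTML comes back, scripts are not executed
    xmlPage.Open "GET", URL, False
    xmlPage.send
    htmlDoc.body.innerHTML = xmlPage.responseText

    ' id selector + descendant combinator + element selector, one per list
    Set states = htmlDoc.querySelectorAll("#pkCodEstado option")
    Set cities = htmlDoc.querySelectorAll("#pkCodMunicipio option")

    ' Print value and text of every option in the states list
    For i = 0 To states.Length - 1
        Debug.Print states.Item(i).Value & "-" & states.Item(i).innerText
    Next i

    ' Print whatever options the static response holds for the cities list
    For i = 0 To cities.Length - 1
        Debug.Print cities.Item(i).Value & "-" & cities.Item(i).innerText
    Next i
End Sub
```

With Option Explicit in place, a misspelled or undeclared name such as HTMLDocment is reported at compile time instead of silently becoming a new Variant.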