Scrape dependent filled Drop-down list

Using Excel VBA, I want to scrape the values from two drop-down lists. One is filled with state names and the other with cities.

I can scrape the state names, but when I try to scrape the city names I get nothing. The city list is filled according to the state selected.

How can I list every city in the second drop-down list for each state in the first list?

This code gives me only the state names and the default value of the second list:

Sub ScrapDropDown()
    Const URL As String = "http://idebescola.inep.gov.br/ideb/consulta-publica"
    Dim XMLPage As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument
    XMLPage.Open "GET", URL, False
    XMLPage.send
    HTMLDoc.body.innerHTML = XMLPage.responseText
    Set HTMLDocment = HTMLDoc.getElementById("pkCodEstado")
    For i = 1 To HTMLDocment.Length - 1
        Set HTMLpkCodMunicipio = HTMLDoc.getElementById("pkCodMunicipio")
        For Each HTMLMun In HTMLpkCodMunicipio.getElementsByTagName("option")
            Debug.Print i & "-" & HTMLDocment(i).Value & "-" & HTMLDocment(i).innerText & "-" & HTMLMun.Value & "-" & HTMLMun.innerText
        Next HTMLMun
    Next i
End Sub

Below is the part of the HTML containing the drop-down lists I want to scrape (the three dots mark where I removed other, unwanted lists). It was captured on the site after selecting a state from the first list; without a selection, id="pkCodMunicipio" has only one option.

<form method="post" name="frm" class="classForm" id="frm">
<label for="pkCodEntidade">Por Código</label>
<div class="divRequired">
</div>
<input name="pkCodEntidade" id="pkCodEntidade" placeholder="Código da Escola" title="Por Código" class="onlynumbers" maxlength="8" tabindex="15" type="text" value="">
<hr>
<label id="lbl">Por área de interesse</label>
<div id="lblDivRequired" class="divRequired" style="display: ;">
</div>
<select name="pkCodEstado" id="pkCodEstado" tabindex="16">
<option value="">UF</option>
<option value="12">ACRE</option>
<option value="27">ALAGOAS</option>
<option value="16">AMAPÁ</option>
<option value="13">AMAZONAS</option>
<option value="29">BAHIA</option>
<option value="23">CEARÁ</option>
<option value="53">DISTRITO FEDERAL</option>
<option value="32">ESPÍRITO SANTO</option>
<option value="52">GOIÁS</option>
<option value="21">MARANHÃO</option>
<option value="51">MATO GROSSO</option>
<option value="50">MATO GROSSO DO SUL</option>
<option value="31">MINAS GERAIS</option>
<option value="15">PARÁ</option>
<option value="25">PARAÍBA</option>
<option value="41">PARANÁ</option>
<option value="26">PERNAMBUCO</option>
<option value="22">PIAUÍ</option>
<option value="33">RIO DE JANEIRO</option>
<option value="24">RIO GRANDE DO NORTE</option>
<option value="43">RIO GRANDE DO SUL</option>
<option value="11">RONDÔNIA</option>
<option value="14">RORAIMA</option>
<option value="42">SANTA CATARINA</option>
<option value="35">SÃO PAULO</option>
<option value="28">SERGIPE</option>
<option value="17">TOCANTINS</option>
</select>
<select name="pkCodMunicipio" id="pkCodMunicipio" tabindex="17">
<option value="">Municípios</option>
<option value="1400050">ALTO ALEGRE</option>
<option value="1400027">AMAJARI</option>
<option value="1400100">BOA VISTA</option>
<option value="1400159">BONFIM</option>
<option value="1400175">CANTA</option>
<option value="1400209">CARACARAI</option>
<option value="1400233">CAROEBE</option>
<option value="1400282">IRACEMA</option>
<option value="1400308">MUCAJAI</option>
<option value="1400407">NORMANDIA</option>
<option value="1400456">PACARAIMA</option>
<option value="1400472">RORAINOPOLIS</option>
<option value="1400506">SAO JOAO DA BALIZA</option>
<option value="1400605">SAO LUIZ</option>
<option value="1400704">UIRAMUTA</option>
</select>
...
<button name="btnSearch" class="btnDefault btn btn-warning" title="Buscar" type="submit" id="btnSearch" onclick="void(0);">Buscar</button>
</div>
<input type="hidden" name="undefined" value="undefined">
</form>

You can use a CSS selector combination. The code below uses an id (#) selector to target the parent select element, joined by a descendant combinator to an option type selector, to get all of its child option elements.

Dim nodeList As Object, i As Long
' All option elements inside the select with id "pkCodEstado"
Set nodeList = HTMLDoc.querySelectorAll("#pkCodEstado option")
For i = 0 To nodeList.Length - 1
    Debug.Print nodeList.item(i).innerText
Next i

If i is already declared at the top of your procedure, you don't need to declare it again. You should use Option Explicit at the top of all modules and thus declare all your variables. You have a number of undeclared variables in your code.
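
Putting these points together, a minimal sketch might look like the following. It assumes references to Microsoft XML, v6.0 and Microsoft HTML Object Library are set, as the early-bound types in the question imply. The procedure names ListDropDownOptions and PrintOptions are made up for illustration. Because the page is fetched without running its scripts, the second selector can only return the options the server included in the response, which according to the question is just the default entry.

```vba
Option Explicit

' Sketch only. Assumes VBE > Tools > References has:
'   Microsoft XML, v6.0 and Microsoft HTML Object Library
' ListDropDownOptions and PrintOptions are hypothetical names.
Public Sub ListDropDownOptions()
    Const URL As String = "http://idebescola.inep.gov.br/ideb/consulta-publica"
    Dim XMLPage As MSXML2.XMLHTTP60
    Dim HTMLDoc As MSHTML.HTMLDocument

    Set XMLPage = New MSXML2.XMLHTTP60
    Set HTMLDoc = New MSHTML.HTMLDocument

    ' Synchronous GET; the response is the static HTML only
    XMLPage.Open "GET", URL, False
    XMLPage.send
    HTMLDoc.body.innerHTML = XMLPage.responseText

    ' Same selector pattern for both lists: #id followed by option
    PrintOptions HTMLDoc.querySelectorAll("#pkCodEstado option"), "state"
    PrintOptions HTMLDoc.querySelectorAll("#pkCodMunicipio option"), "city"
End Sub

' Prints the value and text of every node in the list.
Private Sub PrintOptions(ByVal nodeList As Object, ByVal listName As String)
    Dim i As Long
    ' Index-based loop over the node list, as in the snippet above
    For i = 0 To nodeList.Length - 1
        Debug.Print listName & "-" & nodeList.item(i).Value & "-" & nodeList.item(i).innerText
    Next i
End Sub
```

With Option Explicit in place, the compiler flags any variable that has not been declared, so every name used here has an explicit Dim or parameter declaration.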
