PHP DOMDocument: scraping nested classes
I have an HTML document like the one shown below, and I want to scrape all categories together with their subcategories. I tried loading the HTML into DOMDocument and filtering with DOMXPath on the class "css-1qaqbbz", but that only returns the categories. My expected array looks like this:
[
    'Arsitektur & Desain' => [
        'Buku Bangunan',
        'Buku Codes & Standars',
        // etc...
    ],
]
<div>
<a class="css-1qaqbbz">Arsitektur & Desain
</a>
</div>
<div class="css-1wode1h">
<a data-testid="categoryNavigation#1" class="css-1nykm5o">Buku Bangunan</a>
<a data-testid="categoryNavigation#2" class="css-1nykm5o">Buku Codes & Standars</a>
<a data-testid="categoryNavigation#3" class="css-1nykm5o">Buku Dekorasi & Ornamen</a>
<a data-testid="categoryNavigation#4" class="css-1nykm5o">Buku Desain Dapur</a>
<a data-testid="categoryNavigation#5" class="css-1nykm5o">Buku Desain Kamar</a>
<a data-testid="categoryNavigation#6" class="css-1nykm5o">Buku Desain Ruang Keluarga</a>
<a data-testid="categoryNavigation#7" class="css-1nykm5o">Buku Desain Ruang Tamu</a>
<a data-testid="categoryNavigation#8" class="css-1nykm5o">Buku Desain Rumah</a>
<a data-testid="categoryNavigation#9" class="css-1nykm5o">Buku Interior & Eksterior</a>
<a data-testid="categoryNavigation#10" class="css-1nykm5o">Buku Metode & Material Bangunan</a>
<a data-testid="categoryNavigation#11" class="css-1nykm5o">Buku Taman</a>
</div>
<div class="css-1owj1eu" data-testid="catNavigation#2">
<div>
<a class="css-1qaqbbz">Buku Hukum</a>
</div>
<div class="css-1wode1h">
<a data-testid="categoryNavigation#1" class="css-1nykm5o">Buku Gender & Hukum</a>
<a data-testid="categoryNavigation#2" class="css-1nykm5o">Buku Hukum Dagang</a>
<a data-testid="categoryNavigation#3" class="css-1nykm5o">Buku Hukum Internasional</a>
<a data-testid="categoryNavigation#4" class="css-1nykm5o">Buku Hukum Perdata</a>
<a data-testid="categoryNavigation#5" class="css-1nykm5o">Buku Hukum Pidana</a>
<a data-testid="categoryNavigation#6" class="css-1nykm5o">Buku Kemanusiaan</a>
<a data-testid="categoryNavigation#7" class="css-1nykm5o">Buku Politik & Hukum</a>
<a data-testid="categoryNavigation#8" class="css-1nykm5o">Kumpulan Peraturan Perundang-Undangan</a>
<a data-testid="categoryNavigation#9" class="css-1nykm5o">UUD 1945</a></div>
</div>
This is the code I use for scraping:
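// $f holds the HTML shown above
$dom = new \DOMDocument;
$dom->loadHTML($f);
$xpath = new DOMXPath($dom);
// Select every element whose class is exactly "css-1qaqbbz" (the category links)
$results = $xpath->query("//*[@class='css-1qaqbbz']");
if ($results->length > 0) {
    echo "<pre>";
    $arrCats = [];
    foreach ($results as $key => $value) {
        // Only the category name is collected here; subcategories are not reached
        $arrCats[] = $value->nodeValue;
    }
    // die;
}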
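
One possible way to get the nested array, sketched under assumptions: for each category link, run a second XPath query relative to that node, going up to its parent `<div>` and then to the next sibling `<div class="css-1wode1h">`. The sample markup in `$f` below is a trimmed, hypothetical stand-in for the real page (with `&` escaped as `&amp;`), and the class names are assumed to stay stable. `libxml_use_internal_errors(true)` is there only to silence parser warnings on real-world HTML.

```php
<?php
// Hypothetical sample input, trimmed from the markup in the question.
$f = <<<'HTML'
<div class="css-1owj1eu" data-testid="catNavigation#1">
  <div><a class="css-1qaqbbz">Arsitektur &amp; Desain</a></div>
  <div class="css-1wode1h">
    <a class="css-1nykm5o">Buku Bangunan</a>
    <a class="css-1nykm5o">Buku Codes &amp; Standars</a>
  </div>
</div>
<div class="css-1owj1eu" data-testid="catNavigation#2">
  <div><a class="css-1qaqbbz">Buku Hukum</a></div>
  <div class="css-1wode1h">
    <a class="css-1nykm5o">Buku Gender &amp; Hukum</a>
    <a class="css-1nykm5o">UUD 1945</a>
  </div>
</div>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($f);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$arrCats = [];

// Outer loop: one iteration per category link.
foreach ($xpath->query("//a[@class='css-1qaqbbz']") as $category) {
    $name = trim($category->textContent);
    $arrCats[$name] = [];

    // Relative query: parent <div> of the link, then the first following
    // sibling <div class="css-1wode1h">, then its <a> children.
    $subs = $xpath->query(
        "../following-sibling::div[@class='css-1wode1h'][1]/a",
        $category
    );
    foreach ($subs as $sub) {
        $arrCats[$name][] = trim($sub->textContent);
    }
}

print_r($arrCats);
```

Passing the category node as the second argument of `query()` makes the expression evaluate relative to that node, which is what keeps each subcategory list attached to its own category.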