JSoup not able to get links from HTML
I'm trying to get links from a site's HTML, but I can't manage it with Jsoup.
Here is the HTML:
<div class="anime_muti_link">
<ul>
<li><div class="doamin">Domain</div><div class="link">Link</div></li>
<li class="anime">
<a href="#" class="active" rel="1" data-video="example.com" ><div class="server m1">Server m1</div><span>Watch This Link</span></a>
</li>
<li class="anime">
<a href="#" rel="1" data-video="example.com" ><div class="server m1">Server m2</div><span>Watch This Link</span></a>
</li>
<li class="xstreamcdn">
<a href="#" rel="29" data-video="example.com">Xstreamcdn</div><span>Watch This Link</span></a>
</li>
<li class="mixdrop">
<a href="#" rel="7" data-video="example.com"><div class="server mixdrop">Mixdrop</div><span>Watch This Link</span></a>
</li>
<li class="streamsb">
<a href="#" rel="13" data-video="example.com">StreamSB</div><span>Watch This Link</span></a>
</li>
<li class="doodstream">
<a href="#" rel="14" data-video="example.com">Doodstream</div><span>Watch This Link</span></a>
</li>
</ul>
</div>
Here is the Android code I wrote, which doesn't seem to work:
try {
    Document doc = Jsoup.connect(URL).get();
    Elements content = doc.getElementsByClass("anime_muti_link");
    Elements links = content.select("a");
    String[] urls = new String[links.size()];
    for (int i = 0; i < links.size(); i++) {
        urls[i] = links.get(i).attr("data-video");
        if (!urls[i].startsWith("https://")) {
            urls[i] = "https:" + urls[i];
        }
    }
    arrayList.addAll(Arrays.asList(urls));
    Log.d("CALLING_URL", "Links: " + Arrays.toString(urls));
} catch (IOException e) {
    // Log the failure instead of silently discarding it.
    Log.e("CALLING_URL", "Failed to load page", e);
}
Can anyone help me? Thanks.
Edit: Basically, I'm trying to get these 6 links and add them to my list so I can use them in the app.
As you can see, in this li definition the anchor contains a closing </div> that has no matching opening <div>:
<li class="xstreamcdn">
<a href="#" rel="29" data-video="example.com">Xstreamcdn</div><span>Watch This Link</span></a>
</li>
As a result, the HTML fragment held in the content variable (the element with class anime_muti_link) looks like this:
<div class="anime_muti_link">
<ul>
<li>
<div class="doamin">
Domain
</div>
<div class="link">
Link
</div></li>
<li class="anime"> <a href="#" class="active" rel="1" data-video="example.com">
<div class="server m1">
Server m1
</div><span>Watch This Link</span></a> </li>
<li class="anime"> <a href="#" rel="1" data-video="example.com">
<div class="server m1">
Server m2
</div><span>Watch This Link</span></a> </li>
<li class="xstreamcdn"> <a href="#" rel="29" data-video="example.com">Xstreamcdn</a></li>
</ul>
</div>
That is why you only find three anchors.
Either correct your HTML, or select the anchor tags at the document level:
Document document = Jsoup.parse(html);
// Select at the document level: the stray </div> tags end the
// anime_muti_link container early, so searching inside it misses anchors.
// The attribute selector skips anchors that carry no data-video value.
Elements links = document.select("a[data-video]");
String[] urls = new String[links.size()];
for (int i = 0; i < links.size(); i++) {
    urls[i] = links.get(i).attr("data-video");
    // Prepend https:// when the value does not already start with it.
    if (!urls[i].startsWith("https://")) {
        urls[i] = "https://" + urls[i];
    }
}
System.out.println(Arrays.asList(urls));
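The question's code prepends "https:" while the snippet above prepends "https://", which suggests the real data-video values may be protocol-relative (starting with "//") or may have no scheme at all. That is an assumption, since the sample HTML only shows example.com. Under that assumption, a small helper that handles each case separately could look like the sketch below. The class name VideoUrls and the methods normalize and normalizeAll are hypothetical names chosen for illustration, not part of Jsoup or Android.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns raw data-video values into absolute URLs.
final class VideoUrls {
    private VideoUrls() {}

    // Returns the value unchanged if it already has an http(s) scheme,
    // adds "https:" to protocol-relative values, and "https://" otherwise.
    static String normalize(String raw) {
        String value = raw.trim();
        if (value.startsWith("https://") || value.startsWith("http://")) {
            return value;
        }
        if (value.startsWith("//")) {
            return "https:" + value;
        }
        return "https://" + value;
    }

    // Normalizes every non-blank entry and skips null or empty ones.
    static List<String> normalizeAll(List<String> rawValues) {
        List<String> result = new ArrayList<>();
        for (String raw : rawValues) {
            if (raw != null && !raw.trim().isEmpty()) {
                result.add(normalize(raw));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("example.com", "//example.com/v", "https://example.com");
        // prints [https://example.com, https://example.com/v, https://example.com]
        System.out.println(normalizeAll(raw));
    }
}
```

In the loop above, the body could then become urls[i] = VideoUrls.normalize(links.get(i).attr("data-video")); so that values which already start with "http://" or "//" are not mangled.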