
Convert String to ArrayList using split

Is it possible to convert the String content below to an ArrayList using split, so that you get something like Point A?

<a class="postlink" href="http://test.site/i7xt1.htm">http://test.site/i7xt1.htm<br/>
</a>
<br/>Mirror:<br/>
<a class="postlink" href="http://information.com/qokp076wulpw">http://information.com/qokp076wulpw<br/>
</a>
<br/>Additional:<br/>
<a class="postlink" href="http://additional.com/qokdsfsdwulpw">http://additional.com/qokdsfsdwulpw<br/>
</a>

Point A (desired ArrayList content):

http://test.site/i7xt1.htm
Mirror:
http://information.com/qokp076wulpw
Additional:
http://additional.com/qokdsfsdwulpw

I am currently using the code below, but it doesn't produce the desired output (for instance, the mirror label is added multiple times).

Document doc = Jsoup.parse(string);
Elements links = doc.select("a[href]");
for (Element link : links) {
    Node previousSibling = link.previousSibling();

    while (!(previousSibling.nodeName().equals("u") || previousSibling.nodeName().equals("#text"))) {
        previousSibling = previousSibling.previousSibling();
    }

    String identifier = previousSibling.toString();

    if (identifier.contains("Mirror")) {
        totalUrls.add("MIRROR(s):");
    }
    totalUrls.add(link.attr("href"));
}

Fix your links first. As cricket_007 mentioned, having proper HTML would make this a lot easier.

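// needs java.util.ArrayList, java.util.List and org.jsoup.Jsoup
String html = yourHtml.replaceAll("<br/>\\s*</a>", "</a>"); // drop the stray <br/> inside each link
String[] lines = html.split("<br/>");

List<String> result = new ArrayList<>();
for (String str : lines) {
    String text = Jsoup.parse(str).text(); // strip the remaining tags
    if (!text.isEmpty()) {
        result.add(text); // you can go further here, e.g. check whether the line is a link or a label
    }
}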

Now that the errant <br> tags are out of the links, you can split the string on the remaining <br> tags and print out the result. It's not pretty, but it should work.
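
For comparison, here is a minimal sketch of the same split idea using only the standard library. It swaps the Jsoup text extraction for a regular expression, which is only reasonable under the assumption that the input stays as simple as the sample above (plain <a> and <br/> tags, no ">" inside attribute values). The class name SplitLinks and the method toLines are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SplitLinks {
    // Splits the HTML on every <br>, <br/> or <br /> and keeps the non-empty text of each piece.
    static List<String> toLines(String html) {
        List<String> result = new ArrayList<>();
        for (String piece : html.split("(?i)<br\\s*/?>")) {
            // Remove any tags left in the piece (<a ...>, </a>) and trim whitespace.
            String text = piece.replaceAll("<[^>]*>", "").trim();
            if (!text.isEmpty()) {
                result.add(text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html =
            "<a class=\"postlink\" href=\"http://test.site/i7xt1.htm\">http://test.site/i7xt1.htm<br/>\n"
            + "</a>\n"
            + "<br/>Mirror:<br/>\n"
            + "<a class=\"postlink\" href=\"http://information.com/qokp076wulpw\">http://information.com/qokp076wulpw<br/>\n"
            + "</a>\n"
            + "<br/>Additional:<br/>\n"
            + "<a class=\"postlink\" href=\"http://additional.com/qokdsfsdwulpw\">http://additional.com/qokdsfsdwulpw<br/>\n"
            + "</a>";
        // Print one list entry per line.
        for (String line : toLines(html)) {
            System.out.println(line);
        }
    }
}
```

Because the stray <br/> inside each link is treated as just another separator, the links do not need to be repaired first; the leftover "</a>" pieces become empty after tag stripping and are skipped. For anything beyond this simple shape, a real HTML parser such as Jsoup is the safer choice.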
