
In Swift, how can I generate an array of substrings from a larger string?

I have an HTML string, and I am trying to generate an array containing every substring that occurs between two sets of characters.

My string looks like this:

<h2>The Phantom Menace</h2>
<p>Two Jedi escape a hostile blockade to find allies and come across a young boy who may bring balance to the Force, but the long dormant Sith resurface to claim their original glory.</p>
<h2>Attack of the Clones</h2>
<p>Ten years after initially meeting, Anakin Skywalker shares a forbidden romance with Padmé Amidala, while Obi-Wan Kenobi investigates an assassination attempt on the senator and discovers a secret clone army crafted for the Jedi.</p>
<h2>Revenge of the Sith</h2>
<p>Three years into the Clone Wars, the Jedi rescue Palpatine from Count Dooku. As Obi-Wan pursues a new threat, Anakin acts as a double agent between the Jedi Council and Palpatine and is lured into a sinister plan to rule the galaxy.</p>
<h2>A New Hope</h2>
<p>Luke Skywalker joins forces with a Jedi Knight, a cocky pilot, a Wookiee and two droids to save the galaxy from the Empire's world-destroying battle station, while also attempting to rescue Princess Leia from the mysterious Darth Vader.</p>
<h2>The Empire Strikes Back</h2>
<p>After the Rebels are brutally overpowered by the Empire on the ice planet Hoth, Luke Skywalker begins Jedi training with Yoda, while his friends are pursued by Darth Vader and a bounty hunter named Boba Fett all over the galaxy.</p>
<h2>Return of the Jedi</h2>
<p>After a daring mission to rescue Han Solo from Jabba the Hutt, the Rebels dispatch to Endor to destroy the second Death Star. Meanwhile, Luke struggles to help Darth Vader back from the dark side without falling into the Emperor's trap.</p>
<h2>The Force Awakens</h2>
<p>As a new threat to the galaxy rises, Rey, a desert scavenger, and Finn, an ex-stormtrooper, must join Han Solo and Chewbacca to search for the one hope of restoring peace.</p>
<h2>The Last Jedi</h2>
<p>Rey develops her newly discovered abilities with the guidance of Luke Skywalker, who is unsettled by the strength of her powers. Meanwhile, the Resistance prepares for battle with the First Order.</p>
<h2>The Rise of Skywalker</h2>
<p>The surviving members of the resistance face the First Order once again, and the legendary conflict between the Jedi and the Sith reaches its peak bringing the Skywalker saga to its end.</p>

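I would like to create an array of the substrings between <h2> and </h2> to get the following result:

["The Phantom Menace", "Attack of the Clones", "Revenge of the Sith", "A New Hope", "The Empire Strikes Back", "Return of the Jedi", "The Force Awakens", "The Last Jedi", "The Rise of Skywalker"]

Is there a variant of this code that lets me specify the range between the two tags?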

let titles = htmlInput.components(separatedBy:"<h2>")

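This returns an array with elements like this:

"The Phantom Menace</h2>
<p>Two Jedi escape a hostile blockade to find allies and come across a young boy who may bring balance to the Force, but the long dormant Sith resurface to claim their original glory.</p>
"

Any help is welcome.

Thanks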

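As mentioned in the comments, using XMLParser would be a good idea here. Create your XMLParser and set its delegate to an instance of a class you define that conforms to XMLParserDelegate (in practice, an NSObject subclass). You need two functions: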

public func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String]) {
    lastTag = elementName
}

/// When there is text found between tags, add it to the array.
public func parser(_ parser: XMLParser, foundCharacters string: String) {
    let text = string.trimmingCharacters(in: CharacterSet.whitespacesAndNewlines)
    if !text.isEmpty && lastTag == "h2" {
        h2Array.append(text)
    }
}

Finally, you need a getter so that h2Array can be used wherever you need it. You also need two private variables (var lastTag: String and var h2Array: [String]).

Here is how you use the parser:

let parser = XMLParser(data: htmlString.data(using: .utf8) ?? Data())
let parserDelegate = MyParserDelegate()
parser.delegate = parserDelegate
parser.parse()
let h2Array = parserDelegate.getterForTheArray() // this one needs to be defined

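For func parser(_ parser: XMLParser, foundCharacters string: String) you should also take this note from its documentation into account:

The parser object may send the delegate several parser(_:foundCharacters:) messages to report the characters of an element. Because string may be only part of the total character content for the current element, you should append it to the current accumulation of characters until the element changes.

This means you may need to change my solution so that your strings are not cut off, which would leave the two halves of a title in your array instead of the whole string.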
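
One way to handle that accumulation is sketched below. This is not the answerer's exact code: the class name H2Collector, the helper h2Titles(in:), and the synthetic <root> wrapper are assumptions made for illustration. The wrapper is there because the input is an HTML fragment with several top-level elements, while an XML document must have a single root element. Text is buffered while inside an h2 element and only appended to the array when the closing tag arrives, so a title delivered in several foundCharacters calls stays in one piece.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on Linux
#endif

/// Hypothetical delegate that collects the text content of every <h2> element.
final class H2Collector: NSObject, XMLParserDelegate {
    private(set) var titles: [String] = []
    private var buffer = ""
    private var insideH2 = false

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String]) {
        if elementName == "h2" {
            insideH2 = true
            buffer = ""  // start a fresh accumulation for this element
        }
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // May be called several times per element, so append instead of storing.
        if insideH2 { buffer += string }
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        if elementName == "h2" {
            let title = buffer.trimmingCharacters(in: .whitespacesAndNewlines)
            if !title.isEmpty { titles.append(title) }
            insideH2 = false
        }
    }
}

/// Hypothetical helper: wraps the fragment in a single root element
/// (an assumption of this sketch) and returns the collected titles.
func h2Titles(in html: String) -> [String] {
    let wrapped = "<root>\(html)</root>"
    let parser = XMLParser(data: Data(wrapped.utf8))
    let collector = H2Collector()
    parser.delegate = collector
    parser.parse()
    return collector.titles
}

let sample = """
<h2>The Phantom Menace</h2>
<p>Two Jedi escape a hostile blockade to find allies and come across a young boy who may bring balance to the Force, but the long dormant Sith resurface to claim their original glory.</p>
<h2>Attack of the Clones</h2>
<p>Ten years after initially meeting, Anakin Skywalker shares a forbidden romance with Padmé Amidala, while Obi-Wan Kenobi investigates an assassination attempt on the senator and discovers a secret clone army crafted for the Jedi.</p>
"""
print(h2Titles(in: sample))  // ["The Phantom Menace", "Attack of the Clones"]
```

Note that XMLParser expects well-formed XML, so real-world HTML with unclosed tags or undeclared entities such as &nbsp; would likely make parse() fail partway through.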

You can use the regular expression (?<=<h2>)(.*?)(?=</h2>)

Example:

let input: String = ...
let expr = "(?<=<h2>)(.*?)(?=</h2>)"

do {
    let regex = try NSRegularExpression(pattern: expr)
    let nsString = input as NSString
    let results = regex.matches(in: input, range: NSRange(location: 0, length: nsString.length))
    print(results.map { nsString.substring(with: $0.range)})
} catch let error {
    print("invalid regex: \(error.localizedDescription)")
}
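
Since the question asks for a variant where the two tags can be passed in, the same lookbehind/lookahead pattern can be wrapped in a small helper. The sketch below is only one possible shape: the function name substrings(in:between:and:) is hypothetical, and the choice to escape the delimiters and to let "." match line breaks is an assumption of this example, not part of the answer above.

```swift
import Foundation

/// Hypothetical helper: returns every substring found between `open` and `close`.
func substrings(in input: String, between open: String, and close: String) -> [String] {
    // Escape the delimiters so they are matched literally inside the pattern.
    let start = NSRegularExpression.escapedPattern(for: open)
    let end = NSRegularExpression.escapedPattern(for: close)
    let pattern = "(?<=\(start))(.*?)(?=\(end))"

    // Assumption of this sketch: allow "." to match newlines so that
    // content spanning several lines is still captured.
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.dotMatchesLineSeparators]) else {
        return []
    }

    // Convert between String indices and NSRange without bridging to NSString.
    let fullRange = NSRange(input.startIndex..., in: input)
    return regex.matches(in: input, range: fullRange).compactMap { match in
        Range(match.range, in: input).map { String(input[$0]) }
    }
}

let html = """
<h2>The Phantom Menace</h2>
<p>Two Jedi escape a hostile blockade to find allies and come across a young boy who may bring balance to the Force, but the long dormant Sith resurface to claim their original glory.</p>
<h2>Attack of the Clones</h2>
"""
print(substrings(in: html, between: "<h2>", and: "</h2>"))
// ["The Phantom Menace", "Attack of the Clones"]
```

A regular expression like this works for simple, regular markup such as the sample in the question; it does not account for attributes on the tag (for example <h2 class="...">) or nested elements.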
