How to separate an HTML string into an array or dictionary in Swift 3?

I get HTML strings from an API that look like this:

let a: String = "<a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"
let b: String = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a>Hello Tim"
let c: String = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a><a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"

let splitedArray1: [String] = a.components(separatedBy: "?????") // what is the best string to split on?
let splitedArray2: [String] = b.components(separatedBy: "?????") // what is the best string to split on?
let splitedArray3: [String] = c.components(separatedBy: "?????") // what is the best string to split on?

I want to separate the links from the text and get data like the following:

print(splitedArray1) //["https://www.google.com.tw","https://www.google.com.tw"]
print(splitedArray2) //["myAppName://app/user/aa3b77411825b88b318d77gg","@Tim ","Hello Tim"]
print(splitedArray3) //["myAppName://app/user/aa3b77411825b88b318d77gg","@Tim ","https://www.google.com.tw","https://www.google.com.tw "]

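A possible solution: build an NSAttributedString from the HTML, then enumerate the NSLinkAttributeName attribute (`.link` in Swift 4 and later). If a range has no link attribute, there was no link tag there, so you just keep the string; otherwise, add the link first and then the string.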

Quickly written in a Playground:

import UIKit // or AppKit on macOS; the HTML importer for NSAttributedString lives there

let a: String = "<a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"
let b: String = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a>Hello Tim"
let c: String = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a><a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"

let values:[String] = [a, b, c]



for aHTMLString in values
{
    let attributedString = try! NSAttributedString.init(data: aHTMLString.data(using: .utf8)!,
                                                        options: [.documentType: NSAttributedString.DocumentType.html],
                                                        documentAttributes: nil)
    var retValues = [String]()
    attributedString.enumerateAttribute(.link,
                                        in: NSRange(location: 0, length: attributedString.length), // NSRange counts UTF-16 units, not Characters
                                        options: [],
                                        using: { (attribute, range, pointerStop) in
                                            if let attribute = attribute as? URL
                                            {
                                                retValues.append(attribute.absoluteString)
                                            }
                                            let subString = (attributedString.string as NSString).substring(with: range)
                                            retValues.append(subString)
    })

    print("*** retValues: \(retValues)")
}

let targetResult1 = ["https://www.google.com.tw","https://www.google.com.tw"]
let targetResult2 = ["myAppName://app/user/aa3b77411825b88b318d77gg","@Tim ","Hello Tim"]
let targetResult3 = ["myAppName://app/user/aa3b77411825b88b318d77gg","@Tim ","https://www.google.com.tw","https://www.google.com.tw "]
print("targetResult1: \(targetResult1)")
print("targetResult2: \(targetResult2)")
print("targetResult3: \(targetResult3)")

Output:

*** retValues: ["https://www.google.com.tw/", "https://www.google.com.tw "]
*** retValues: ["myappname://app/user/aa3b77411825b88b318d77gg", "@Tim ", "Hello Tim"]
*** retValues: ["myappname://app/user/aa3b77411825b88b318d77gg", "@Tim ", "https://www.google.com.tw/", "https://www.google.com.tw "]
targetResult1: ["https://www.google.com.tw", "https://www.google.com.tw"]
targetResult2: ["myAppName://app/user/aa3b77411825b88b318d77gg", "@Tim ", "Hello Tim"]
targetResult3: ["myAppName://app/user/aa3b77411825b88b318d77gg", "@Tim ", "https://www.google.com.tw", "https://www.google.com.tw "]

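There are a few small differences. I copied your "target" values (splitedArray), and one of them is missing a trailing space. My code also tends to add a final "/" to the link, and the output shows the custom URL scheme in lowercase.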
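If the goal is only to check the extracted values against the targets, those differences can be smoothed over before comparing. This is a minimal sketch, not part of the answer above: `normalized` is a hypothetical helper, and it assumes that one trailing "/", surrounding spaces, and letter case are all irrelevant for the comparison. It lowercases the whole string, so it is not suitable for text you intend to display.

```swift
import Foundation

// Hypothetical helper: normalizes a value for comparison only.
// Assumption: a single trailing "/", surrounding spaces, and case do not matter.
func normalized(_ value: String) -> String {
    var result = value.trimmingCharacters(in: .whitespaces)
    // Drop one trailing slash, but leave a bare "scheme://" untouched.
    if result.hasSuffix("/") && !result.hasSuffix("://") {
        result.removeLast()
    }
    return result.lowercased()
}

let extracted = ["https://www.google.com.tw/", "https://www.google.com.tw "]
let target = ["https://www.google.com.tw", "https://www.google.com.tw"]
let sameAfterNormalizing = extracted.map(normalized) == target.map(normalized)
print(sameAfterNormalizing) // true
```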

I created this extension to get the URL.

import Foundation // needed for components(separatedBy:)

extension String {
  func getUrl() -> String? {
      let rss = self.split { (char) -> Bool in
          return char == ">"
      }
      if let final = rss.last?.split(separator: "<"), let first = final.first {
          return String(first)
      }
      return nil
  }

  var hrefUrl: String {
    let matchString = "=\""
    let arrComponents = self.components(separatedBy: matchString)
    if let first = arrComponents.last, let str = first.split(separator: "\"").first {

        return String(str)
    }
    return ""
  }
}

Usage:

let a: String = "<a href=\"https://www.google.com.tw\">https://www.google.com.tw </a>"
a.getUrl()  // Optional("https://www.google.com.tw "): the link text, trailing space included

// or

a.hrefUrl   // "https://www.google.com.tw": the value of the href attribute

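A simple solution without any library: use String's replacingOccurrences(of:with:) to replace the markup fragments, such as the href part, with a split marker like "|", then call components(separatedBy: "|") to get the components.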
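A minimal sketch of that idea, under strong assumptions: every link has exactly the form `<a href="...">...</a>` as in the question, and the marker never appears in the content. `splitAnchors` and its `marker` parameter are hypothetical names chosen for illustration.

```swift
import Foundation

// Hypothetical helper: replaces the three markup fragments with a marker,
// then splits on that marker and drops the empty pieces.
// Assumption: the marker itself never occurs in the URLs or the text.
func splitAnchors(_ html: String, marker: String = "|") -> [String] {
    let markup = ["<a href=\"", "\">", "</a>"]
    var text = html
    for piece in markup {
        text = text.replacingOccurrences(of: piece, with: marker)
    }
    return text.components(separatedBy: marker).filter { !$0.isEmpty }
}

let b = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a>Hello Tim"
print(splitAnchors(b))
// ["myAppName://app/user/aa3b77411825b88b318d77gg", "@Tim ", "Hello Tim"]
```

This breaks as soon as a tag carries other attributes or the text contains the marker, so it only fits input whose shape you control.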

Use a regular expression to extract the URL. I wrote a code snippet below.

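import Foundation

let text = "<a href=\"https://www.google.com\">"

// Capture group 1 holds the value of the href attribute.
let regex = try! NSRegularExpression(pattern: "<a[^>]+href=\"(.*?)\"[^>]*>")
// NSRange is measured in UTF-16 code units, not Characters.
let range = NSRange(location: 0, length: text.utf16.count)
let matches = regex.matches(in: text, range: range)
for match in matches {
    let strURL = (text as NSString).substring(with: match.range(at: 1))
    print(strURL) // https://www.google.com
}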
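The same pattern can be extended to produce the arrays asked for in the question. The following is a sketch rather than a general HTML parser: `linkParts(in:)` is a hypothetical function, the second capture group for the link text is an addition to the pattern above, and it assumes links are not nested and contain no line breaks.

```swift
import Foundation

// Hypothetical helper: returns href, link text, and any plain text between
// or after the links, in document order.
func linkParts(in html: String) -> [String] {
    // Group 1: href value. Group 2: link text (added for this sketch).
    let pattern = "<a[^>]+href=\"(.*?)\"[^>]*>(.*?)</a>"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [html] }

    var parts = [String]()
    var cursor = html.startIndex // position just past the previous match
    let whole = NSRange(html.startIndex..., in: html)

    for match in regex.matches(in: html, range: whole) {
        guard let full = Range(match.range, in: html),
              let href = Range(match.range(at: 1), in: html),
              let text = Range(match.range(at: 2), in: html) else { continue }
        // Plain text sitting between the previous link and this one.
        if cursor < full.lowerBound {
            parts.append(String(html[cursor..<full.lowerBound]))
        }
        parts.append(String(html[href]))
        parts.append(String(html[text]))
        cursor = full.upperBound
    }
    // Plain text after the last link.
    if cursor < html.endIndex {
        parts.append(String(html[cursor...]))
    }
    return parts
}

let b = "<a href=\"myAppName://app/user/aa3b77411825b88b318d77gg\">@Tim </a>Hello Tim"
print(linkParts(in: b))
// ["myAppName://app/user/aa3b77411825b88b318d77gg", "@Tim ", "Hello Tim"]
```

Unlike the NSAttributedString route, this keeps the href exactly as written, so the custom scheme keeps its case and no trailing "/" is added.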
