
How to extract values from string? (swift)

I'm trying to parse a string and break it down into distinct values: taskValue and timeValue.

var str1 = "20 minutes (to do)/(for) some kind of task"
// other possibilities
var str2 = "1 hour 30 minutes for some kind of task"
var str3 = "do some kind of task for 1 hour"

How can I apply several regular expressions in one function? Maybe something like an array of regular expressions:

["[0-9]{1,} minutes", 
 "[0-9] hour", 
 "[0-9] hour, [0-9]{1,} minutes",
  ...]
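
One way the pattern-array idea could look is sketched below. This is a minimal sketch, not a definitive implementation: the helper name `firstTimeMatch` and the exact patterns in `timePatterns` are assumptions made for illustration (they are not the patterns from the question), and the patterns are ordered from most to least specific so that "1 hour 30 minutes" is not cut short at "1 hour".

```swift
import Foundation

// Hypothetical helper: returns the first substring matched by any pattern.
// Patterns are tried in order, so list the most specific ones first.
func firstTimeMatch(in text: String, patterns: [String]) -> String? {
    // NSRange counts UTF-16 code units, so derive it from the String itself.
    let fullRange = NSRange(text.startIndex..., in: text)
    for pattern in patterns {
        // Skip patterns that fail to compile or do not match.
        guard let regex = try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive]),
              let match = regex.firstMatch(in: text, options: [], range: fullRange),
              let range = Range(match.range, in: text) else { continue }
        return String(text[range])
    }
    return nil
}

// Illustrative patterns (assumed, not taken from the question).
let timePatterns = [
    "[0-9]+ hours? [0-9]+ minutes?",  // e.g. "1 hour 30 minutes"
    "[0-9]+ hours?",                  // e.g. "1 hour"
    "[0-9]+ minutes?"                 // e.g. "20 minutes"
]
```

Calling `firstTimeMatch(in: str2, patterns: timePatterns)` with the sample strings above would then return the time part, or `nil` when no pattern matches.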

The values returned from the function are not clean; the task name still contains leftovers such as "of..", "for...", "to...", etc.

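Can you give me advice on how to improve it? Maybe do some machine learning with MLKit? How do I add several regex patterns? Or should I manually check whether the string contains certain things?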

// check it out
import Foundation

var str = "20 minutes to do some kind of task"
func decompose(_ inputText: String) -> (time: String, taskName: String) {

    let pattern = "[0-9]{1,} minutes"
    let regexOptions: NSRegularExpression.Options = [.caseInsensitive]
    // NSRange counts UTF-16 code units, so build it from the String itself
    // instead of using utf8.count.
    let range = NSRange(inputText.startIndex..., in: inputText)

    var time = ""
    var taskName = inputText

    // try! is acceptable here because the pattern is a fixed literal.
    let regex = try! NSRegularExpression(pattern: pattern, options: regexOptions)
    if let match = regex.firstMatch(in: inputText, options: [], range: range),
       let matchRange = Range(match.range, in: inputText) {

        // Convert the NSRange back to String indices before slicing.
        time = String(inputText[matchRange])
        taskName.removeSubrange(matchRange)

    } else {
        print("No match.")
    }

    return (time, taskName)
}

print(decompose(str))

In general, I would like to learn how to do text analysis when the subject matter is known in advance.

Use capture groups:

(\d+)\s*minute|(\d+)\s*hour

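See the regex proof. Then check which group matched and use the captured value as needed. If the first group matched, you have minutes; otherwise, the second group holds the hours.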
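
The group check described above could be sketched in Swift roughly as follows. The helper name `totalMinutes` and the idea of summing all matches into a single minute count are assumptions added for illustration; the answer itself only says to inspect which group matched.

```swift
import Foundation

// Hypothetical helper: adds up every "N minute(s)" / "N hour(s)" found in the text.
func totalMinutes(in text: String) -> Int {
    let pattern = #"(\d+)\s*minute|(\d+)\s*hour"#
    // The pattern is a fixed literal, so a compile failure would be a programmer error.
    let regex = try! NSRegularExpression(pattern: pattern, options: [.caseInsensitive])
    let fullRange = NSRange(text.startIndex..., in: text)
    var total = 0
    for match in regex.matches(in: text, options: [], range: fullRange) {
        let minuteGroup = match.range(at: 1)
        let hourGroup = match.range(at: 2)
        // A group that did not take part in the match has location NSNotFound.
        if minuteGroup.location != NSNotFound,
           let r = Range(minuteGroup, in: text), let minutes = Int(text[r]) {
            total += minutes
        } else if hourGroup.location != NSNotFound,
                  let r = Range(hourGroup, in: text), let hours = Int(text[r]) {
            total += hours * 60
        }
    }
    return total
}
```

Because `matches(in:options:range:)` returns every match, a string containing both an hour and a minute part is handled in one pass.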

EXPLANATION

--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  minute                   'minute'
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  hour                     'hour'
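
The question also mentions leftover words such as "of", "for" and "to" in the task name. One possible way to tidy that up, without any machine learning, is sketched below. The helper name `cleanTaskName`, the filler-word list, and the choice to trim fillers only at the start and end of the name are all assumptions for illustration; a real word list would likely need tuning.

```swift
import Foundation

// Hypothetical helper: removes time expressions, then trims filler words
// ("for", "to", "of") left at either end of the task name.
func cleanTaskName(_ text: String) -> String {
    let timePattern = #"\d+\s*(?:minute|hour)s?"#
    // Replace each time expression with a space so neighbouring words stay separated.
    let withoutTime = text.replacingOccurrences(
        of: timePattern, with: " ", options: [.regularExpression, .caseInsensitive])
    let fillers: Set<String> = ["for", "to", "of"]
    // split(separator:) drops empty pieces, which also collapses repeated spaces.
    var words = withoutTime.split(separator: " ").map(String.init)
    while let first = words.first, fillers.contains(first.lowercased()) { words.removeFirst() }
    while let last = words.last, fillers.contains(last.lowercased()) { words.removeLast() }
    return words.joined(separator: " ")
}
```

Fillers inside the name (the "of" in "some kind of task") are kept on purpose, since only the words next to the removed time expression are noise.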

