How to extract values from a string? (Swift)
I am trying to parse a string and break it down into distinct values: taskValue and timeValue.
var str1 = "20 minutes (to do)/(for) some kind of task"
// other possibilities
var str2 = "1 hour 30 minutes for some kind of task"
var str3 = "do some kind of task for 1 hour"
How can I apply several regular expressions in one function? Perhaps something like an array of regular expressions:
["[0-9]{1,} minutes",
"[0-9] hour",
"[0-9] hour, [0-9]{1,} minutes",
...]
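The value returned from the function is not clean: it still contains "of..", "for...", "to..." and so on.
Can you suggest how to improve it? Maybe do some machine learning with MLKit? How do I add several regex patterns? Or should I check manually whether the string contains certain content?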
// check it out
import Foundation

var str = "20 minutes to do some kind of task"
func decompose(_ inputText: String) -> (time: String, taskName: String) {
    let pattern = "[0-9]{1,} minutes"
    let regexOptions: NSRegularExpression.Options = [.caseInsensitive]
    // NSRange counts UTF-16 code units, so build it from the string itself
    // instead of using utf8.count.
    let range = NSRange(inputText.startIndex..., in: inputText)
    var time = ""
    var taskName = inputText
    // The pattern is a constant known to be valid, so try! is acceptable here.
    let regex = try! NSRegularExpression(pattern: pattern, options: regexOptions)
    // Convert the NSRange back to a Range<String.Index> before slicing.
    if let match = regex.firstMatch(in: inputText, options: [], range: range),
       let matchRange = Range(match.range, in: inputText) {
        time = String(inputText[matchRange])
        taskName.removeSubrange(matchRange)
    } else {
        print("No match.")
    }
    return (time, taskName)
}
print(decompose(str))
In general, I would like to learn how to do text analysis when the subject is known in advance.
Use capture groups:
(\d+)\s*minute|(\d+)\s*hour
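See the regex proof. Then check which group matched and use the captured value as needed. If the first group matched, you have minutes; otherwise the second group holds the hours.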
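The check described above can be sketched in Swift roughly as follows. This is only an illustration under a few assumptions: the helper name totalMinutes(in:) is hypothetical, the pattern is matched case-insensitively, and every match in the string is added up so that "1 hour 30 minutes" yields a single total.

```swift
import Foundation

// Hypothetical helper: sums every time phrase found in the text, in minutes.
func totalMinutes(in text: String) -> Int {
    // Group 1 captures minutes, group 2 captures hours.
    let pattern = #"(\d+)\s*minute|(\d+)\s*hour"#
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.caseInsensitive]) else {
        return 0
    }
    let fullRange = NSRange(text.startIndex..., in: text)
    var total = 0
    for match in regex.matches(in: text, options: [], range: fullRange) {
        let minuteRange = match.range(at: 1)
        let hourRange = match.range(at: 2)
        // A group that did not take part in the match has location NSNotFound.
        if minuteRange.location != NSNotFound,
           let r = Range(minuteRange, in: text),
           let minutes = Int(text[r]) {
            total += minutes
        } else if hourRange.location != NSNotFound,
                  let r = Range(hourRange, in: text),
                  let hours = Int(text[r]) {
            total += hours * 60
        }
    }
    return total
}

print(totalMinutes(in: "1 hour 30 minutes for some kind of task")) // prints 90
print(totalMinutes(in: "do some kind of task for 1 hour"))         // prints 60
```

Returning a single number of minutes is a design choice for this sketch; keeping hours and minutes as separate values would work just as well.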
Explanation
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
minute 'minute'
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
hour 'hour'
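The pattern explained above only finds the time. For the leftover "of..", "for...", "to..." fragments mentioned in the question, one possible approach is to apply a short list of patterns in sequence: first remove the time phrases, then strip filler words left dangling at the start or end. The sketch below is an assumption rather than part of the original answer; the helper name cleanedTaskName(from:) and the filler-word list (to, for, of) are hypothetical and would need adjusting for real input.

```swift
import Foundation

// Hypothetical helper: removes time phrases and dangling filler words.
func cleanedTaskName(from text: String) -> String {
    // Patterns are applied in order; the filler-word list is an assumption.
    let steps = [
        #"\d+\s*(?:hours?|minutes?)"#,    // time phrases such as "1 hour", "30 minutes"
        #"^\s*(?:(?:to|for|of)\b\s*)+"#,  // filler words left at the start
        #"(?:\s*\b(?:to|for|of))+\s*$"#   // filler words left at the end
    ]
    var result = text
    for pattern in steps {
        guard let regex = try? NSRegularExpression(pattern: pattern,
                                                   options: [.caseInsensitive]) else {
            continue
        }
        // Recompute the range each time because the string gets shorter.
        let range = NSRange(result.startIndex..., in: result)
        result = regex.stringByReplacingMatches(in: result, options: [],
                                                range: range, withTemplate: "")
    }
    return result.trimmingCharacters(in: .whitespaces)
}

print(cleanedTaskName(from: "20 minutes to do some kind of task"))
// prints "do some kind of task"
print(cleanedTaskName(from: "do some kind of task for 1 hour"))
// prints "do some kind of task"
```

Filler words in the middle of the task name (the "of" in "kind of task") are left alone, because only the start and end of the remaining string are trimmed.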