Identifying person names with NSLinguisticTagger

I am tinkering with NSLinguisticTagger.

Identifying basic word types such as nouns, verbs, and prepositions works really well.

However, the recognition of person names (NSLinguisticTagPersonalName) hardly works in my tests (iOS 8). Place names (NSLinguisticTagPlaceName) seem to work pretty well, but most of the time person names are also categorised as places.

Here's my basic setup (using NSLinguisticTagSchemeNameTypeOrLexicalClass):

    var tagger:NSLinguisticTagger = NSLinguisticTagger(tagSchemes: NSLinguisticTagger.availableTagSchemesForLanguage("en") , options: 3)
    tagger.string = entryString
    tagger.enumerateTagsInRange(NSMakeRange(0, entryString.length), scheme: NSLinguisticTagSchemeNameTypeOrLexicalClass, options: (NSLinguisticTaggerOptions.OmitWhitespace | NSLinguisticTaggerOptions.JoinNames), usingBlock: {
        tag,tokenRange,sentenceRange,_ in
        let token = entryString.substringWithRange(tokenRange)
        println("[\(tag)] \(token) \(tokenRange)")
    })

Example 1

 "Meeting with John in Paris"

  Evaluation

 [Verb] Meeting
 [Preposition] with
 [Noun] John
 [Preposition] in
 [PlaceName] Paris

Example 2

 "Meeting with John"

  Evaluation

 [Verb] Meeting (0,7)
 [Preposition] with (8,4)
 [PlaceName] John (13,4)

Any idea how I could improve the matching for person names?

Also, I'd be interested to know how a name would need to appear in order to be recognized. (I assumed that, e.g., a preposition like "with" would be a good indicator, but apparently this isn't enough.) I'd appreciate any ideas or additional insights on this. It's an exciting field.

Apparently the correct answer is: "wait a few years for Apple to improve NSLinguisticTagger in Swift 4".

Here is the Swift 4 code, written and executed in Xcode 9 (beta):

import Foundation

let entryString = "Meeting with John"

// Load every tag scheme available for English.
let schemes = NSLinguisticTagger.availableTagSchemes(forLanguage: "en")
let options: NSLinguisticTagger.Options = [
    .omitWhitespace, .omitPunctuation, .joinNames
]

let tagger = NSLinguisticTagger(tagSchemes: schemes, options: Int(options.rawValue))
tagger.string = entryString

// NSRange counts UTF-16 code units, so measure the string the same way.
let rangeOfEntireEntryString = NSRange(location: 0, length: entryString.utf16.count)

tagger.enumerateTags(
    in: rangeOfEntireEntryString,
    scheme: .nameTypeOrLexicalClass,
    options: options)
{ (tag, tokenRange, sentenceRange, _) in
    // Skip tokens for which the tagger returns no tag.
    guard let tag = tag?.rawValue else { return }
    let token = (entryString as NSString).substring(with: tokenRange)
    print("[\(tag)] \(token) \(tokenRange)")
}

And here are the results with your first example string:

let entryString = "Meeting with John in Paris"

[Noun] Meeting {0, 7}
[Preposition] with {8, 4}
[PersonalName] John {13, 4}
[Preposition] in {18, 2}
[PlaceName] Paris {21, 5}

And with your second example string:

let entryString = "Meeting with John"

[Noun] Meeting {0, 7}
[Preposition] with {8, 4}
[PersonalName] John {13, 4}
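
If only the names matter, one option might be to request the name-type scheme alone and keep just the tokens tagged as personal names. Below is a minimal sketch, assuming Swift 4 or later on an Apple platform. `personalNames(in:)` is a hypothetical helper name used for illustration, not part of Foundation, and what it returns still depends on the language model shipped with the OS version you run it on:

```swift
import Foundation

// Hypothetical helper (not from the original post): collects the tokens
// that NSLinguisticTagger tags as personal names.
func personalNames(in text: String) -> [String] {
    let options: NSLinguisticTagger.Options = [
        .omitWhitespace, .omitPunctuation, .joinNames
    ]
    // Only the name-type scheme is requested here.
    let tagger = NSLinguisticTagger(tagSchemes: [.nameType],
                                    options: Int(options.rawValue))
    tagger.string = text

    // NSRange counts UTF-16 code units.
    let fullRange = NSRange(location: 0, length: text.utf16.count)
    var names: [String] = []

    tagger.enumerateTags(in: fullRange, scheme: .nameType, options: options) {
        tag, tokenRange, _, _ in
        // Keep only tokens classified as personal names.
        guard let tag = tag, tag == NSLinguisticTag.personalName else { return }
        names.append((text as NSString).substring(with: tokenRange))
    }
    return names
}

print(personalNames(in: "Meeting with John in Paris"))
```

With `.joinNames` set, a multi-word name should come back as a single token, so a first and last name would not need to be stitched together by hand.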
