Identifying Universities mentioned in Tweet Text

I am looking for a way to identify UK university names mentioned in Tweet text.

I have a list of full university names, but the problem is handling shortened versions such as "aber uni" (Aberystwyth University), "staffs uni" (Staffordshire University) or "portsmouth" (University of Portsmouth).

I have looked into Apache Stanbol and OpenNLP for Named Entity Recognition. Although these match the full names, I cannot find a way to train them to identify variations of the names, or even lowercase versions of the full names, which are also not identified.

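Collect a list of universities (this is easy to do), then scrape Freebase for the list of names recorded for each university. This is one way of using the web to find related names.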
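Once such a list of alternative names exists, matching them in tweet text could look roughly like the sketch below. It is only an illustration: the `ALIASES` table is hand-written and hypothetical (in practice it would come from the full-name list plus whatever alternative names were collected), and the helper names `build_matcher` and `find_universities` are made up for this example. Matching is case-insensitive, which also covers the lowercase problem mentioned in the question.

```python
import re

# Hypothetical alias table, written by hand for illustration only.
# Keys are canonical names; values are the variants to look for.
ALIASES = {
    "Aberystwyth University": ["aberystwyth university", "aberystwyth uni", "aber uni"],
    "Staffordshire University": ["staffordshire university", "staffs uni"],
    "University of Portsmouth": ["university of portsmouth", "portsmouth uni", "portsmouth"],
}


def build_matcher(aliases):
    """Return a compiled regex and an alias -> canonical name lookup."""
    lookup = {
        alias.lower(): canonical
        for canonical, names in aliases.items()
        for alias in names
    }
    # Longest aliases first, so "university of portsmouth" is preferred
    # over the bare "portsmouth" when both could match at one position.
    ordered = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(a) for a in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern, lookup


def find_universities(text, pattern, lookup):
    """Return canonical names mentioned in text, in order of first mention."""
    found = []
    for match in pattern.finditer(text):
        canonical = lookup[match.group(0).lower()]
        if canonical not in found:
            found.append(canonical)
    return found


PATTERN, LOOKUP = build_matcher(ALIASES)
print(find_universities("Off to Aber Uni open day, then staffs uni", PATTERN, LOOKUP))
```

Note that a bare alias like "portsmouth" will also match tweets about the city, so short aliases probably need extra context checks (for example a nearby "uni", "student" or "campus") before being trusted.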
