How to search for rdfs:labels in DBpedia which are partial matches to a given term using SPARQL?

I am using this query to search for all labels that contain the string "Medi":

select distinct ?label where 
{ 
    ?concept rdfs:label  ?label 
    filter contains(?label,"Medi") 
    filter(langMatches(lang(?label),"en")) 
}

However, as soon as I change the term from "Medi" to "Medicare", the query stops working and times out. How do I get it to work with longer words like Medicare, i.e. extract all labels that have the word Medicare in them?

Your query has to iterate over all labels in DBpedia, which is a very large number, and then apply a string containment check to each one. This is indeed expensive.

Even a count query leads to an "estimated timeout error":

select count(?label) where 
{ 
    ?concept rdfs:label  ?label 
    filter(regex(str(?label),"Medi")) 
    filter(langMatches(lang(?label),"en")) 
}

Two options:

  1. Virtuoso has some full-text search support, available through bif:contains:

     SELECT DISTINCT ?label WHERE {
         ?concept rdfs:label ?label .
         ?label bif:contains "Medicare" .
         FILTER(langMatches(lang(?label), "en"))
     }

  2. Since the public DBpedia endpoint is a shared endpoint, the other solution is to load the DBpedia dataset into your own triple store, e.g. Virtuoso. There you can adjust the maximum estimated execution timeout parameter.
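
If you want to run the full-text variant from a script, the first option can be sketched in Python roughly as follows. This is only a minimal sketch under some assumptions: the helper names build_label_query and fetch_labels are made up for illustration, the endpoint URL is the public DBpedia one (replace it with your own instance for the second option), and the check that a wildcard term needs at least 4 leading characters reflects Virtuoso's behaviour as I understand it, so verify it against your server.

```python
import json
import urllib.parse
import urllib.request

# Public DBpedia endpoint; point this at your own Virtuoso instance if you host one.
ENDPOINT = "https://dbpedia.org/sparql"


def build_label_query(term, lang="en", limit=100, prefix=False):
    """Build a SPARQL query that uses Virtuoso's bif:contains full-text search.

    Hypothetical helper. With prefix=True a trailing * is appended, so
    "Medi" also matches labels containing words such as "Medicare".
    """
    # Keep the sketch simple: accept a single alphanumeric word only, so the
    # term cannot break out of the quoted search expression.
    if not term.isalnum():
        raise ValueError("term must be a single alphanumeric word")
    if not lang.replace("-", "").isalnum():
        raise ValueError("lang must be a plain language tag such as 'en'")
    if prefix and len(term) < 4:
        # Assumption: Virtuoso rejects wildcards with fewer than 4 leading characters.
        raise ValueError("wildcard terms need at least 4 leading characters")
    pattern = term + "*" if prefix else term
    # The search expression is single-quoted inside the SPARQL string literal.
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?label WHERE {\n"
        "    ?concept rdfs:label ?label .\n"
        f"    ?label bif:contains \"'{pattern}'\" .\n"
        f"    FILTER(langMatches(lang(?label), \"{lang}\"))\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def fetch_labels(term, **kwargs):
    """Send the query to ENDPOINT and return the matching labels as strings."""
    params = urllib.parse.urlencode({
        "query": build_label_query(term, **kwargs),
        "format": "application/sparql-results+json",
    })
    with urllib.request.urlopen(f"{ENDPOINT}?{params}", timeout=60) as response:
        results = json.load(response)
    return [row["label"]["value"] for row in results["results"]["bindings"]]


# Example (needs network access):
#     for label in fetch_labels("Medicare", limit=20):
#         print(label)
```

The LIMIT is there because the public endpoint is shared; on your own store you can raise or drop it. Calling build_label_query("Medi", prefix=True) produces the wildcard form, which may be closer to the partial matching asked about in the question.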
