
Indexing external text data to a Lucene index in GraphDB

Is it possible to index data that is external to the RDF? For example, the RDF contains a triple whose object is a link to an external file. Can the content of that file be indexed instead of the link value?

Absolutely. Lucene is a core part of GraphDB, and it offers the standard functionality that comes with a standalone Lucene. The data will have to be represented as a string literal:

<http://www.example.org/> rdfs:label "An example webpage url."@EN .

Then you can configure a Lucene index:

PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
INSERT DATA {
  luc:index luc:setParam "uris" .
  luc:include luc:setParam "literals" .
  luc:moleculeSize luc:setParam "1" .
  luc:includePredicates luc:setParam "http://www.w3.org/2000/01/rdf-schema#label" .
}

Once you have the configuration, you can create the index.

PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
INSERT DATA {
   luc:myTestIndex luc:createIndex "true" .
}

Given the index and your data, you can query it.

PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
SELECT * {
  ?subj luc:myTestIndex "web*"
}

Since the query asks for the subject of something that contains text matching web*, you'll get <http://www.example.org/>. If you had other triples linking to this one, they might also have appeared.
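The updates above can be pasted into any SPARQL editor, but they can also be submitted over HTTP. The following is only a sketch: it assumes the repository is exposed through the RDF4J-style REST protocol, where updates are POSTed as a form-encoded `update` parameter to `/repositories/<id>/statements`. The base URL `http://localhost:7200`, the repository name `myrepo`, and the helper `sparql_update_request` are placeholders invented for this illustration, not names from the answer.

```python
from urllib.parse import urlencode
from urllib.request import Request


def sparql_update_request(base_url: str, repository: str, update: str) -> Request:
    """Build (but do not send) a POST request carrying a SPARQL update."""
    body = urlencode({"update": update}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/repositories/{repository}/statements",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# The index-creation update from the answer, unchanged.
CREATE_INDEX = """\
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
INSERT DATA {
  luc:myTestIndex luc:createIndex "true" .
}
"""

# Assumed server address and repository id; adjust to your own setup.
request = sparql_update_request("http://localhost:7200", "myrepo", CREATE_INDEX)
# Passing `request` to urllib.request.urlopen would send it to a running server.
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything touches the repository.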

More information about how GraphDB interacts with Lucene, and about its full-text search capabilities, can be found in the GraphDB documentation.

I suspect that the answer above misunderstood the question. The question refers to external content, i.e., whether GraphDB's Lucene is able to index the content available at http://example.org, rather than the RDF literal associated with it (and then return, in searches, the triple pointing to that content).

From what I was able to try, no: this is not currently supported.
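A possible workaround, in line with the first answer's point that the data has to be a string literal, is to read the external content yourself and insert it into the repository as a literal on the same subject, then build the index over that predicate. The sketch below is an assumption about how that could look, not something GraphDB provides: `escape_literal`, `build_insert`, and `build_insert_from_file` are hypothetical helpers, and the choice of rdfs:label as the predicate simply mirrors the index configuration shown earlier.

```python
from pathlib import Path

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Characters that must be escaped inside a double-quoted SPARQL string literal.
_ESCAPES = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}


def escape_literal(text: str) -> str:
    """Escape text so it can sit between double quotes in a SPARQL update."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def build_insert(subject_uri: str, text: str, predicate_uri: str = RDFS_LABEL) -> str:
    """Return a SPARQL update attaching `text` to `subject_uri` as a plain literal.

    Assumes `subject_uri` and `predicate_uri` are already valid IRIs.
    """
    return (
        "INSERT DATA {\n"
        f'  <{subject_uri}> <{predicate_uri}> "{escape_literal(text)}" .\n'
        "}"
    )


def build_insert_from_file(subject_uri: str, path: str, encoding: str = "utf-8") -> str:
    """Read a local copy of the external file and wrap its text in an update."""
    return build_insert(subject_uri, Path(path).read_text(encoding=encoding))


# Example with inline text standing in for the fetched file content.
update = build_insert("http://www.example.org/", 'An "example"\nwebpage')
```

The resulting update string can be run like any other SPARQL update. The obvious cost of this approach is that the file content is duplicated inside the repository and has to be refreshed whenever the external file changes.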
