RDFlib query not working

I wrote a Python script that should run through a list of DBpedia URIs and run a query on each of them. However, for some reason I get an error on

qres = g.query(query) 

when I run this code. Does anyone know why this happens and how I could fix it? I'm really stuck and I'm getting behind on my thesis timeline, so the stress is really building.

Code:

import rdflib
import csv
import pandas as pd

colnames = ['Link']

list2 = pd.read_csv('C:/Users/Frank/Google Drive/Master Scriptie/testtest3.csv', sep=',', header=None, usecols=[2], names=colnames)
saved_column = list2.Link 
outputfile = open('C:/Users/Frank/Google Drive/Master Scriptie/code files/dbpedia_output/test_dataset_uri_subject.csv', 'w')

reader = csv.reader(saved_column)

g = rdflib.Graph()
for uri in reader:
    uri2 = "".join(str(x) for x in uri)
    uri2 = uri2[1:].rstrip()
    print (uri2)
    result = g.parse("http://dbpedia.org" + uri2)
    print (result)
    query = "SELECT ?subject WHERE {<http://dbpedia.org" + uri2 + "> dbo:wikiPageRedirects*/dct:subject ?subject .}"
    print ("query: " + query)
    qres = g.query(query)
    for singlerow in qres:
        subject_final = "%s" % singlerow
        outputfile.write("{0}, {1} \n".format(uri, subject_final))

Error message in cmd:

/resource/Sheldon_J._Plankton
[a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'IOMemory']].
query: SELECT ?subject WHERE {<http://dbpedia.org/resource/Sheldon_J._Plankton> dbo:wikiPageRedirects*/dct:subject ?subject .}
Traceback (most recent call last):
  File "rdfimport.py", line 47, in <module>
    qres = g.query(query)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\graph.py", line 1089, in query
    query_object, initBindings, initNs, **kwargs))
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\processor.py", line 75, in query
    query = translateQuery(parsetree, base, initNs)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 764, in translateQuery
    q[1], visitPost=functools.partial(translatePName, prologue=prologue))
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 384, in traverse
    r = _traverse(tree, visitPre, visitPost)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
    e[k] = _traverse(val, visitPre, visitPost)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
    e[k] = _traverse(val, visitPre, visitPost)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 339, in _traverse
    return [_traverse(x, visitPre, visitPost) for x in e]
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 339, in <listcomp>
    return [_traverse(x, visitPre, visitPost) for x in e]
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
    e[k] = _traverse(val, visitPre, visitPost)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 339, in _traverse
    return [_traverse(x, visitPre, visitPost) for x in e]
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 339, in <listcomp>
    return [_traverse(x, visitPre, visitPost) for x in e]
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 339, in _traverse
    return [_traverse(x, visitPre, visitPost) for x in e]
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 339, in <listcomp>
    return [_traverse(x, visitPre, visitPost) for x in e]
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
    e[k] = _traverse(val, visitPre, visitPost)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 339, in _traverse
    return [_traverse(x, visitPre, visitPost) for x in e]
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 339, in <listcomp>
    return [_traverse(x, visitPre, visitPost) for x in e]
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
    e[k] = _traverse(val, visitPre, visitPost)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 339, in _traverse
    return [_traverse(x, visitPre, visitPost) for x in e]
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 339, in <listcomp>
    return [_traverse(x, visitPre, visitPost) for x in e]
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
    e[k] = _traverse(val, visitPre, visitPost)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 347, in _traverse
    _e = visitPost(e)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\algebra.py", line 142, in translatePName
    return prologue.absolutize(p)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\sparql.py", line 374, in absolutize
    return self.resolvePName(iri.prefix, iri.localname)
  File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packages\rdflib\plugins\sparql\sparql.py", line 357, in resolvePName
    raise Exception('Unknown namespace prefix : %s' % prefix)
Exception: Unknown namespace prefix : dct

Thanks in advance :)

EDIT:

I believe something goes wrong in

result = g.parse("http://dbpedia.org" + uri2)

The URI it attempts to parse in this example is "http://dbpedia.org/resource/Sheldon_J._Plankton",

which also gives an error if I put that URI directly in g.parse. Might this be because that URI is "wrong", since it redirects to

"http://dbpedia.org/resource/Plankton_(character)"?

I handled this in my query with dbo:wikiPageRedirects, but of course that only applies after this parse. So I think the problem lies there, but how could I get the right page using dbo:wikiPageRedirects if I can't parse the page first?

The error message is complaining that it does not recognise the prefix dct. RDFLib has dcterms built in, or you can bind your own prefixes:

from rdflib.namespace import DCTERMS, Namespace
g.bind("dct", DCTERMS)
g.bind("dbo", Namespace("http://dbpedia.org/ontology/"))
g.bind("dbr", Namespace("http://dbpedia.org/resource/"))
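
An alternative to binding prefixes on the graph is to declare them inside the query text itself with SPARQL PREFIX lines, so the query no longer depends on what the graph happens to have bound. The following is a minimal sketch of that idea using plain string building. The helper names with_prefixes and subject_query are made up for illustration, and the resource path format is assumed to match what the question's script prints:

```python
# Prefix IRIs: dbo and dbr are the ones bound above; dct is the Dublin Core
# terms namespace that rdflib exposes as DCTERMS.
PREFIXES = {
    "dct": "http://purl.org/dc/terms/",
    "dbo": "http://dbpedia.org/ontology/",
    "dbr": "http://dbpedia.org/resource/",
}


def with_prefixes(body, prefixes=PREFIXES):
    """Prepend PREFIX declarations so the query carries its own namespaces.

    Hypothetical helper, not part of rdflib.
    """
    header = "\n".join(
        "PREFIX {}: <{}>".format(prefix, iri) for prefix, iri in prefixes.items()
    )
    return header + "\n" + body


def subject_query(resource_path):
    """Build the question's query for a path like '/resource/Sheldon_J._Plankton'.

    Hypothetical helper; the path format is an assumption taken from the
    output printed in the question.
    """
    body = (
        "SELECT ?subject WHERE {{ <http://dbpedia.org{}> "
        "dbo:wikiPageRedirects*/dct:subject ?subject . }}"
    ).format(resource_path)
    return with_prefixes(body)


query = subject_query("/resource/Sheldon_J._Plankton")
print(query)
```

The resulting string can be passed to g.query(query) as before. Whether to bind on the graph or declare in the query is mostly a matter of taste; declaring in the query keeps each query readable on its own.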

Assuming uri2 is a DBpedia resource and contains only the final part of the URI (i.e. "Sheldon_J._Plankton"), the SPARQL query to get the redirect page becomes:

q = "SELECT ?subject WHERE {{ dbr:{} dbo:wikiPageRedirects ?subject. }}".format
result = g.query(q(uri2))
for row in result:
    print(row.subject)

To get the subject of the redirect, this query should work if the redirect target is in your data. You might first need to run g.parse on the URIs returned by the previous query to add them to your data:

q = "SELECT ?subject WHERE {{ dbr:{} dbo:wikiPageRedirects ?redirect. ?redirect dct:subject ?subject. }}".format
result = g.query(q(uri2))
