Extract Data from BNode

I'm using a SPARQL query to extract a node from an RDF file. The node in the RDF file looks like this:

 <dc:description>Birds are a class of vertebrates. They are bipedal, warm-blooded, have a covering of feathers, and their front limbs are modified into wings. Some birds, such as penguins and ostriches, have lost the power of flight. All birds lay eggs. Because birds are warm-blooded, their eggs have to be incubated to keep the embryos inside warm, or they will perish.
    <br />
    <br />
    <a href="/nature/19700707">All you need to know about British birds.</a>
</dc:description>

I'm using Python's RDFLib to get this node. It is returned as

rdflib.term.BNode('Nfc3f01b2567a4b3ea23dbd01394929df')

How can I extract the text of dc:description from rdflib.term.BNode('Nfc3f01b2567a4b3ea23dbd01394929df')?

Something I tried based on the answers:

from rdflib import *
import rdfextras
import json

#load the ontology
rdfextras.registerplugins()
g=Graph()

g.parse("http://www.bbc.co.uk/nature/life/Bird.rdf")


#define the prefixes
PREFIX = ''' PREFIX dc:<http://purl.org/dc/terms/>
             .......
             PREFIX po:<http://purl.org/ontology/po/>
             PREFIX owl:<http://www.w3.org/2002/07/owl#>
         '''

def exe(query):
    query = PREFIX + query
    return g.query(query)

def getEntity(entity_type,entity):
    #getting the description
    entity_url = "<http://www.bbc.co.uk/nature/life/" + entity.capitalize() + "#" + entity_type.lower() +">"
    query = ''' SELECT ?description
                    WHERE { ''' + entity_url + ''' dc:description ?description . }'''
    result_set = exe(query)
    dc = Namespace("http://purl.org/dc/terms/")
    for row in result_set:
        description = row[0]
        print description.value(dc.description)

getEntity("class","bird")

I'm getting the following error:

Traceback (most recent call last):
  File "test_bird1.py", line 40, in <module>
    getEntity("class","bird")
  File "test_bird1.py", line 38, in getEntity
    print description.value(dc.description)
AttributeError: 'BNode' object has no attribute 'value'

BNodes (and URIRefs, too) are resources, so the documentation for the rdflib.resource module is probably the most useful for you here. Based on that documentation, something like the following should work, where x is the blank node and g is the graph:

>>> from rdflib import Namespace
>>> from rdflib.resource import Resource
>>> DC = Namespace("http://purl.org/dc/terms/")
>>> r = Resource( g, x )
>>> r.value(DC.description)

As pointed out in this answer to your other question, "SPARQL not returning correct result", it's not actually legal to have those <br /> elements where they appear (perhaps you'll need to work with another serialization, e.g., N-Triples, N3, or Turtle), so it's hard to predict what different libraries will do with the malformed input. You might let the content producer know that they're publishing ill-formed data.

from rdflib import Graph, BNode

g = Graph()
g.parse("http://www.bbc.co.uk/nature/life/Bird.rdf")

# Replace the placeholder with the identifier of the BNode returned by your query
for obj in g.objects(subject=BNode("add the BNode id here")):
    print(obj)
