
Extracting information from DBpedia

I am working on a project and want to make use of DBpedia. I have a few hundred DBpedia links.

Which is the better use of my time:

  • to crawl those pages and extract the information that I want?
  • to query the data with a SPARQL query from Python?

First, note that the URI identifying a DBpedia resource is not the one with /page/ in its path (that is the HTML page about the resource), but the one with /resource/. Second, it will be much faster to retrieve the information with SPARQL than by crawling. SPARQL is a query language for RDF, and RDF data is what you're looking to get. All you need in SPARQL to get information about FEMA is a describe query:

prefix dbpedia: <http://dbpedia.org/resource/>

describe
  dbpedia:Federal_Emergency_Management_Agency

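If the links you collected are in the /page/ form, a small helper can turn them into resource URIs before you build any query. This is only a minimal sketch using the Python standard library: the name page_to_resource is made up for illustration, and it assumes your links differ from the resource URIs only in the scheme and the /page/ path segment.

```python
from urllib.parse import urlsplit, urlunsplit


def page_to_resource(url):
    """Rewrite a DBpedia /page/ link into the matching /resource/ URI.

    Hypothetical helper: anything that is not a /page/ link is returned
    with only its scheme normalised.
    """
    parts = urlsplit(url)
    path = parts.path
    if path.startswith("/page/"):
        path = "/resource/" + path[len("/page/"):]
    # DBpedia resource identifiers use the http scheme, so normalise it
    # even when the link was copied from an https page.
    return urlunsplit(parts._replace(scheme="http", path=path))


if __name__ == "__main__":
    print(page_to_resource("https://dbpedia.org/page/Mount_Monadnock"))
```

Doing the conversion with urlsplit instead of a plain string replace avoids touching a resource whose own name happens to contain "/page/".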

Describe queries can describe multiple resources, so you can do, for instance:

prefix dbpedia: <http://dbpedia.org/resource/>

describe
  dbpedia:Federal_Emergency_Management_Agency
  dbpedia:Mount_Monadnock
  # more resources...

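With a few hundred resources you probably don't want them all in one request. Here is a rough sketch of how the multi-resource describe could be sent from Python in batches, using only the standard library. The endpoint URL, the batch size of 50, and the function names are my assumptions, not anything DBpedia prescribes, and the resources are written as full URIs in angle brackets so no prefix declaration is needed.

```python
import urllib.parse
import urllib.request

# Assumption: the public DBpedia SPARQL endpoint.
ENDPOINT = "https://dbpedia.org/sparql"


def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_describe_query(resource_uris):
    """Build one describe query that covers every URI in `resource_uris`."""
    return "describe\n" + "\n".join(f"  <{uri}>" for uri in resource_uris)


def fetch_descriptions(resource_uris, batch_size=50, endpoint=ENDPOINT):
    """Yield one Turtle document per batch of resources. Needs network access."""
    for batch in chunked(list(resource_uris), batch_size):
        params = urllib.parse.urlencode({"query": build_describe_query(batch)})
        request = urllib.request.Request(
            f"{endpoint}?{params}",
            headers={"Accept": "text/turtle"},  # ask for the RDF as Turtle
        )
        with urllib.request.urlopen(request) as response:
            yield response.read().decode("utf-8")
```

Each yielded string is an RDF document that you can hand to whatever RDF library you use. Tune the batch size to whatever the endpoint accepts.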

If you only want certain information about some resources, you can do something similar with select or construct queries, using a values block and programmatically injecting the resources you're interested in:

prefix dbpedia: <http://dbpedia.org/resource/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

select ?label where {
  values ?resource {
    dbpedia:Federal_Emergency_Management_Agency # put your values in here and
    dbpedia:Mount_Monadnock                     # ?resource will be bound to each
  }
  ?resource rdfs:label ?label .
  filter( langMatches( lang(?label), "EN" ))
}

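To make "programmatically injecting the resources" concrete, here is one way it might look from Python with only the standard library. Treat it as a sketch under assumptions: the endpoint URL and all function names are mine, the query also selects ?resource so each label can be matched to its resource, and the response is assumed to be the standard SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public DBpedia SPARQL endpoint.
ENDPOINT = "https://dbpedia.org/sparql"


def as_iri(uri):
    """Wrap a URI in angle brackets, rejecting characters that would break the query."""
    if any(ch in uri for ch in '<>" {}|\\^`'):
        raise ValueError(f"not usable as an IRI in a query: {uri!r}")
    return f"<{uri}>"


def build_label_query(resource_uris):
    """Return a select query whose values block lists the given URIs."""
    values = "\n    ".join(as_iri(uri) for uri in resource_uris)
    return f"""
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?resource ?label where {{
  values ?resource {{
    {values}
  }}
  ?resource rdfs:label ?label .
  filter( langMatches( lang(?label), "EN" ))
}}
"""


def parse_labels(results):
    """Map each resource URI to its label, given parsed SPARQL JSON results."""
    return {
        row["resource"]["value"]: row["label"]["value"]
        for row in results["results"]["bindings"]
    }


def fetch_labels(resource_uris, endpoint=ENDPOINT):
    """Run the query over HTTP and return {uri: label}. Needs network access."""
    params = urllib.parse.urlencode({"query": build_label_query(resource_uris)})
    request = urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_labels(json.load(response))
```

Writing the resources as full URIs in angle brackets sidesteps the problem that some DBpedia resource names contain characters (parentheses, commas) that are not allowed in a prefixed name like dbpedia:Some_Name.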

You can also use a construct query to get those triples back as an RDF graph (a model):

prefix dbpedia: <http://dbpedia.org/resource/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

construct {
  ?resource rdfs:label ?label
}
where {
  values ?resource {
    dbpedia:Federal_Emergency_Management_Agency
    dbpedia:Mount_Monadnock
  }
  ?resource rdfs:label ?label .
  filter( langMatches( lang(?label), "EN" ))
}

