
How to filter DBpedia results in SPARQL

I have a small problem. Given this simple SPARQL query:

SELECT ?abstract 
WHERE {
<http://dbpedia.org/resource/Mitsubishi> <http://dbpedia.org/ontology/abstract> ?abstract.
FILTER langMatches( lang(?abstract), 'en')}

I get this result (SPARQL Result), and it contains non-English characters. Is there any way to remove them and retrieve just the English words?

You'll need to define exactly which characters you do and don't want in your result, but you can use replace to replace characters outside a given range with, e.g., the empty string. If you want to exclude everything except the Basic Latin, Latin-1 Supplement, Latin Extended-A, and Latin Extended-B ranges (which ends up being \u0000–\u024f), you could do the following:

SELECT ?abstract ?cleanAbstract
WHERE {
  dbpedia:Mitsubishi dbpedia-owl:abstract ?abstract 
  FILTER langMatches( lang(?abstract), 'en')
  bind(replace(?abstract,"[^\\x{0000}-\\x{024f}]","") as ?cleanAbstract)
}

SPARQL results
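To check what that character class keeps and drops without querying the endpoint, the same replacement can be sketched locally in Python. This is only an illustration, not part of the query: the name clean_abstract and the sample string are made up for the example, and Python's \uXXXX escape stands in for the \x{XXXX} form used in the SPARQL string.

```python
import re

# Same character class as in the SPARQL replace() call: match anything
# outside U+0000..U+024F (Basic Latin through Latin Extended-B).
NON_LATIN = re.compile(r"[^\u0000-\u024f]")


def clean_abstract(text: str) -> str:
    """Drop every character outside U+0000..U+024F (hypothetical helper)."""
    return NON_LATIN.sub("", text)


# Made-up sample resembling the start of the DBpedia abstract.
sample = "The Mitsubishi Group (三菱グループ, Mitsubishi Gurūpu)"
print(clean_abstract(sample))
```

The macron in "Gurūpu" survives because U+016B lies in Latin Extended-A, while the Japanese characters fall outside the range and are removed, leaving the empty "(," seen in the cleaned abstract.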

Or even simpler:

SELECT (replace(?abstract_,"[^\\x{0000}-\\x{024f}]","") as ?abstract)
WHERE {
  dbpedia:Mitsubishi dbpedia-owl:abstract ?abstract_
  FILTER langMatches(lang(?abstract_), 'en')
}

SPARQL results

The Mitsubishi Group (, Mitsubishi Gurūpu) (also known as the Mitsubishi Group of Companies or Mitsubishi Companies) is a group of autonomous Japanese multinational companies covering a range of businesses which share the Mitsubishi brand, trademark, and legacy.The Mitsubishi group of companies form a loose entity, the Mitsubishi Keiretsu, which is often referenced in Japanese and US media and official reports; in general these companies all descend from the zaibatsu of the same name. The top 25 companies are also members of the Mitsubishi Kin'yōkai, or "Friday Club", and meet monthly. In addition the Mitsubishi.com Committee exists to facilitate communication and access of the Mitsubishi brand through a portal web site.

You may find the Wikipedia article "Latin script in Unicode" useful.
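If you want a different cutoff, one way to experiment is to build the character class from named Unicode blocks. The sketch below is a local Python illustration under that assumption; LATIN_BLOCKS and build_excluder are hypothetical names, and only the four Latin blocks mentioned in this answer are listed.

```python
import re

# Code point boundaries of the four Latin blocks named in the answer.
LATIN_BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "Latin-1 Supplement": (0x0080, 0x00FF),
    "Latin Extended-A": (0x0100, 0x017F),
    "Latin Extended-B": (0x0180, 0x024F),
}


def build_excluder(block_names):
    """Compile a regex matching any character outside the chosen blocks."""
    ranges = "".join(
        "\\u{:04x}-\\u{:04x}".format(*LATIN_BLOCKS[name]) for name in block_names
    )
    return re.compile("[^" + ranges + "]")


# All four blocks together are equivalent to the single range U+0000..U+024F.
latin_only = build_excluder(list(LATIN_BLOCKS))
# Keeping only Basic Latin also strips macrons such as the "ū" in "Gurūpu".
ascii_only = build_excluder(["Basic Latin"])

print(latin_only.sub("", "三菱 Gurūpu"))
print(ascii_only.sub("", "Gurūpu"))
```

Once a range behaves the way you want, the same code points can be written into the pattern passed to replace in the query.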
