Extract all parents of a given node

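I am trying to extract all parents of each given GO ID (node) using the EBI-RDF SPARQL endpoint. I based my query on two similar questions. Here are two examples that illustrate the problem:

Example 1 (link to structure):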

biological_process (GO:0008150)
           |__ metabolic process (GO:0008152)
                           |__ methylation (GO:0032259)

In this example, I use the following query:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

PREFIX obo: <http://purl.obolibrary.org/obo/>

SELECT (count(?mid) as ?depth)
       (group_concat(distinct ?midId ; separator = " / ") AS ?treePath) 
FROM <http://rdf.ebi.ac.uk/dataset/go> 
WHERE {
    obo:GO_0032259 rdfs:subClassOf* ?mid .
    ?mid rdfs:subClassOf* ?class .
    ?mid <http://www.geneontology.org/formats/oboInOwl#id> ?midId.
}
GROUP BY ?treePath
ORDER BY ?depth

I get the expected result without any problems:

c |              treePath
--|-------------------------------------
6 | GO:0008150 / GO:0008152 / GO:0032259

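However, when the term exists in more than one branch (e.g. GO:0007267), as in the example below, the previous approach does not work:

Example 2 (link to structure):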

biological_process (GO:0008150)
           |__ cellular_process (GO:0009987)
           |           |__ cell communication (GO:0007154)
           |                       |__ cell-cell signaling (GO:0007267)
           |
           |__ signaling (GO:0023052)
                      |__ cell-cell signaling (GO:0007267)

Result:

c |                            treePath
--|---------------------------------------------------------------
15| GO:0007154 / GO:0007267 / GO:0008150 / GO:0009987 / GO:0023052

What I would like to get is the following:

GO:0008150 / GO:0009987 / GO:0007154 / GO:0007267
GO:0008150 / GO:0023052 / GO:0007267

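As I understand it, behind the scenes I am computing the depth of each level and using it to build the path. This works well when an element belongs to only one branch: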

SELECT (count(?mid) as ?depth) ?midId
FROM <http://rdf.ebi.ac.uk/dataset/go> 
WHERE {
    obo:GO_0032259 rdfs:subClassOf* ?mid .
    ?mid rdfs:subClassOf* ?class .
    ?mid <http://www.geneontology.org/formats/oboInOwl#id> ?midId.
}
GROUP BY ?midId
ORDER BY ?depth

Result:

depth |   midId
------|------------
1     | GO:0008150
2     | GO:0008152
3     | GO:0032259
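To make explicit what the query counts, here is a minimal Python sketch of the same computation on a hand-copied version of Example 1. The `PARENTS` dictionary and both function names are illustrative assumptions, not part of the endpoint or of any library. For each `?mid` above the start term, it counts the terms reachable from `?mid` through zero or more `subClassOf` steps.

```python
# Toy copy of Example 1: each term maps to its direct parents (hand-written data).
PARENTS = {
    "GO:0032259": ["GO:0008152"],
    "GO:0008152": ["GO:0008150"],
    "GO:0008150": [],
}


def ancestors_or_self(term, parents):
    """Terms reachable via zero or more subClassOf steps (like rdfs:subClassOf*)."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def depth_table(start, parents):
    """For every ?mid at or above `start`, count the ?class values at or above ?mid."""
    rows = [
        (len(ancestors_or_self(mid, parents)), mid)
        for mid in ancestors_or_self(start, parents)
    ]
    return sorted(rows)


for depth, mid in depth_table("GO:0032259", PARENTS):
    print(depth, mid)
```

On a single chain the number of ancestors (including the term itself) equals the level, which is why the column looks like a depth. Once a term is reachable through two branches, the count mixes ancestors from both and no longer identifies a position on one path.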

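In the second example everything goes wrong, and I don't know why. I can tell that part of the problem is terms that share the same depth/level, but I don't know how to solve it.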

depth |   midId
------|------------
2     | GO:0008150
2     | GO:0009987
2     | GO:0023052
3     | GO:0007154
6     | GO:0007267
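To make the desired output concrete, here is a minimal Python sketch that enumerates every root-to-term path by following direct parent links instead of sorting by a depth count. It assumes the direct `subClassOf` edges have already been fetched into a dictionary; `PARENTS` is hand-copied from Example 2 and `paths_to_root` is a hypothetical helper name. It also assumes the hierarchy has no cycles.

```python
# Toy copy of Example 2: each term maps to its direct parents (hand-written data).
PARENTS = {
    "GO:0007267": ["GO:0007154", "GO:0023052"],
    "GO:0007154": ["GO:0009987"],
    "GO:0009987": ["GO:0008150"],
    "GO:0023052": ["GO:0008150"],
    "GO:0008150": [],
}


def paths_to_root(term, parents):
    """Return every path from a root down to `term`, one list of IDs per branch."""
    direct = parents.get(term, [])
    if not direct:
        # A term without parents is a root, so the path is just the term itself.
        return [[term]]
    return [
        path + [term]
        for parent in direct
        for path in paths_to_root(parent, parents)
    ]


for path in paths_to_root("GO:0007267", PARENTS):
    print(" / ".join(path))
```

On this toy graph the loop prints the two lines shown above as the desired result, one per branch.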

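Thanks to @AKSW, I found a nice solution using HyperGraphQL (a GraphQL interface for querying and serving linked data on the Web).

I'll leave the detailed answer here in case it helps someone.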

  1. I downloaded and set up HyperGraphQL (download page).
  2. I linked it to the EBI SPARQL endpoint as described in the tutorial.

    The config.json file I used:

     {
       "name": "ebi-hgql",
       "schema": "ebischema.graphql",
       "server": {
         "port": 8081,
         "graphql": "/graphql",
         "graphiql": "/graphiql"
       },
       "services": [
         {
           "id": "ebi-sparql",
           "type": "SPARQLEndpointService",
           "url": "http://www.ebi.ac.uk/rdf/services/sparql",
           "graph": "http://rdf.ebi.ac.uk/dataset/go",
           "user": "",
           "password": ""
         }
       ]
     }

    This is what my ebischema.graphql file looks like (I only need Class, id, label and subClassOf):

     type __Context {
       Class: _@href(iri: "http://www.w3.org/2002/07/owl#Class")
       id: _@href(iri: "http://www.geneontology.org/formats/oboInOwl#id")
       label: _@href(iri: "http://www.w3.org/2000/01/rdf-schema#label")
       subClassOf: _@href(iri: "http://www.w3.org/2000/01/rdf-schema#subClassOf")
     }

     type Class @service(id:"ebi-sparql") {
       id: [String] @service(id:"ebi-sparql")
       label: [String] @service(id:"ebi-sparql")
       subClassOf: [Class] @service(id:"ebi-sparql")
     }

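  3. I started testing some simple queries, but kept getting empty responses. The answer to this question solved my problem.

  4. Finally, I constructed the query to get the tree.

    Using this query: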

     {
       Class_GET_BY_ID(uris:[
         "http://purl.obolibrary.org/obo/GO_0032259",
         "http://purl.obolibrary.org/obo/GO_0007267"]) {
         id
         label
         subClassOf {
           id
           label
           subClassOf {
             id
             label
           }
         }
       }
     }

    I got some interesting results:

     {
       "extensions": {},
       "data": {
         "@context": {
           "_type": "@type",
           "_id": "@id",
           "id": "http://www.geneontology.org/formats/oboInOwl#id",
           "label": "http://www.w3.org/2000/01/rdf-schema#label",
           "Class_GET_BY_ID": "http://hypergraphql.org/query/Class_GET_BY_ID",
           "subClassOf": "http://www.w3.org/2000/01/rdf-schema#subClassOf"
         },
         "Class_GET_BY_ID": [
           {
             "id": ["GO:0032259"],
             "label": ["methylation"],
             "subClassOf": [
               {
                 "id": ["GO:0008152"],
                 "label": ["metabolic process"],
                 "subClassOf": [
                   { "id": ["GO:0008150"], "label": ["biological_process"] }
                 ]
               }
             ]
           },
           {
             "id": ["GO:0007267"],
             "label": ["cell-cell signaling"],
             "subClassOf": [
               {
                 "id": ["GO:0007154"],
                 "label": ["cell communication"],
                 "subClassOf": [
                   { "id": ["GO:0009987"], "label": ["cellular process"] }
                 ]
               },
               {
                 "id": ["GO:0023052"],
                 "label": ["signaling"],
                 "subClassOf": [
                   { "id": ["GO:0008150"], "label": ["biological_process"] }
                 ]
               }
             ]
           }
         ]
       },
       "errors": []
     }
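A nested response like this can be flattened into one path string per branch. Below is a minimal Python sketch, assuming each entry has the shape shown in the response (an `id` list and an optional `subClassOf` list). `branch_paths` is a hypothetical helper, not part of HyperGraphQL, and `response_fragment` is a hand-copied, trimmed version of the GO:0007267 entry.

```python
def branch_paths(node):
    """Yield one list of IDs per branch, from the topmost fetched ancestor down to `node`."""
    term_id = node["id"][0]  # the response wraps each value in a list
    parents = node.get("subClassOf", [])
    if not parents:
        # No further nesting was returned for this term, so the branch ends here.
        yield [term_id]
        return
    for parent in parents:
        for path in branch_paths(parent):
            yield path + [term_id]


# Trimmed copy of the GO:0007267 entry from the response (labels omitted).
response_fragment = {
    "id": ["GO:0007267"],
    "subClassOf": [
        {"id": ["GO:0007154"], "subClassOf": [{"id": ["GO:0009987"]}]},
        {"id": ["GO:0023052"], "subClassOf": [{"id": ["GO:0008150"]}]},
    ],
}

for path in branch_paths(response_fragment):
    print(" / ".join(path))
```

Because the query requests only three levels, the longer branch stops at GO:0009987 instead of reaching biological_process (GO:0008150). The sketch can only report as many ancestors as the query nests.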

EDIT

This is exactly what I wanted, but I noticed that I cannot add another sublevel like this:

{
  Class_GET_BY_ID(uris:[
    "http://purl.obolibrary.org/obo/GO_0032259",
    "http://purl.obolibrary.org/obo/GO_0007267"]) {
    id
    label
    subClassOf {
      id
      label
      subClassOf {
        id
        label
        subClassOf {  # <--- 4th sublevel
          id
          label
        }
      }
    }
  }
}

I created a new question for this: Endpoint returned Content-Type: text/html, which is not recognized for SELECT queries
