How to filter JSON based on a list of paths in jq

Question

Given an arbitrary JSON input:

{  
   "id":"038020",
   "title":"Teenage Mutant Ninja Turtles: Out of the Shadows",
   "turtles":[  
      {  
         "name":"Leonardo",
         "mask":"blue"
      },
      {  
         "name":"Michelangelo",
         "mask":"orange"
      },
      {  
         "name":"Donatello",
         "mask":"purple"
      },
      {  
         "name":"Raphael",
         "mask":"red"
      }
   ],
   "summary":"The Turtles continue to live in the shadows and no one knows they were the ones who took down Shredder",
   "cast":"Megan Fox, Will Arnett, Tyler Perry",
   "director":"Dave Green"
}

And an arbitrary list of jq paths like [".turtles[].name", ".cast", ".does.not.exist"], or any similar format:

How can I create new JSON with only the information contained in the paths of the list? In this case the expected result would be:

{  
   "turtles":[  
      {  
         "name":"Leonardo"
      },
      {  
         "name":"Michelangelo"
      },
      {  
         "name":"Donatello"
      },
      {  
         "name":"Raphael"
      }
   ],
   "cast":"Megan Fox, Will Arnett, Tyler Perry"
}

I've seen similar solutions to problems like "removing null entries" from JSON using the walk function present in jq 1.5+, somewhat along the lines of:

def filter_list(input, list):
 input
 | walk(  
     if type == "object" then
       with_entries( select(.key | IN( list )))
     else
       .
     end); 

filter_list([.], [.a, .b, .c[].d])

But it should somehow take into account the full path in the JSON.

What is the best approach to solve this problem?

Answer 1

If $paths contains an array of explicit jq paths (such as [ ["turtles", 0, "name"], ["cast"]]), the simplest approach would be to use the following filter:

. as $in
| reduce $paths[] as $p (null; setpath($p; $in | getpath($p)))
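
For reference, one way to supply $paths is jq's --argjson command-line option. The following is only a sketch: it uses a shortened version of the question's input (two turtles, fewer keys) for brevity, and the shell variable names input and result are purely illustrative.

```shell
#!/bin/sh
# Shortened version of the input from the question (an assumption, for brevity).
input='{"id":"038020","turtles":[{"name":"Leonardo","mask":"blue"},{"name":"Michelangelo","mask":"orange"}],"cast":"Megan Fox, Will Arnett, Tyler Perry"}'

# --argjson binds the given JSON text to $paths; -c prints compact output.
result=$(printf '%s' "$input" | jq -c --argjson paths '[["turtles",0,"name"],["cast"]]' '
  . as $in
  | reduce $paths[] as $p (null; setpath($p; $in | getpath($p)))
')

echo "$result"
```

Because the path ["turtles", 0, "name"] names index 0 explicitly, only the first turtle's name is copied into the result; the other turtles are not included.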

Extended path expressions

In order to be able to handle extended path expressions such as ["turtles", [], "name"], where [] is intended to range over the indices of the turtles array, we shall define the following helper function:

def xpath($ary):
  . as $in
  | if ($ary|length) == 0 then null
    else $ary[0] as $k
    | if $k == []
      then range(0;length) as $i | $in[$i] | xpath($ary[1:]) | [$i] + .
      else .[$k] | xpath($ary[1:]) | [$k] + . 
      end
    end ;

For the sake of exposition, let us also define:

def paths($ary): $ary[] as $path | xpath($path);

Then with the given input, the expression:

. as $in
| reduce paths([ ["turtles", [], "name"], ["cast"]]) as $p 
    (null; setpath($p; $in | getpath($p)) )

produces the output shown below.
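
Note that xpath, as defined above, emits a path even when a key is missing, so a path such as ["does", "not", "exist"] would end up in the result with a null value. One possible way to drop such paths is sketched below. It assumes that a null value means "not present", which also discards keys whose value really is null; the input is again a shortened, illustrative one.

```shell
#!/bin/sh
# Shortened input (an assumption, for brevity).
input='{"turtles":[{"name":"Leonardo","mask":"blue"},{"name":"Raphael","mask":"red"}],"cast":"Megan Fox","director":"Dave Green"}'

result=$(printf '%s' "$input" | jq -c '
  # xpath/1 exactly as defined in the answer.
  def xpath($ary):
    . as $in
    | if ($ary|length) == 0 then null
      else $ary[0] as $k
      | if $k == []
        then range(0;length) as $i | $in[$i] | xpath($ary[1:]) | [$i] + .
        else .[$k] | xpath($ary[1:]) | [$k] + .
        end
      end ;

  . as $in
  | reduce ( (["turtles", [], "name"], ["cast"], ["does", "not", "exist"])
             | . as $x | $in | xpath($x)
             # Assumption: a null value means the path is absent, so skip it.
             | select(. as $p | $in | getpath($p) != null) ) as $p
      (null; setpath($p; $in | getpath($p)))
')

echo "$result"
```

With this filter in place, the does.not.exist path contributes nothing, which matches the expected result in the question.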

Using `path`

It is worth pointing out that one way to handle expressions such as ".turtles[].name" would be to use the builtin filter path/1.

For example:

# Emit a stream of paths:
def paths: path(.turtles[].name), ["cast"];

. as $in
| reduce paths as $p (null; setpath($p; $in | getpath($p)))

Output:

{
  "turtles": [
    {
      "name": "Leonardo"
    },
    {
      "name": "Michelangelo"
    },
    {
      "name": "Donatello"
    },
    {
      "name": "Raphael"
    }
  ],
  "cast": "Megan Fox, Will Arnett, Tyler Perry"
}
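
Since path/1 needs the expression to appear in the jq program itself, a list of path strings has to be converted some other way. The sketch below uses a hypothetical helper, to_xpath (not a jq builtin), to turn strings like ".turtles[].name" into the extended array form used by xpath. It only understands plain .key and .key[] segments; numeric indices, quoted keys, and keys containing dots are not handled.

```shell
#!/bin/sh
result=$(jq -nc '
  # Hypothetical helper: ".turtles[].name" becomes ["turtles", [], "name"].
  def to_xpath:
    ltrimstr(".")
    | split(".")
    | map(if endswith("[]") then (.[:-2], []) else . end);

  [".turtles[].name", ".cast", ".does.not.exist"] | map(to_xpath)
')

echo "$result"
```

The resulting arrays can then be passed to the paths($ary) helper defined in the "Extended path expressions" section.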
