Remove parent elements with certain key-value pairs using JQ

I need to remove elements from a json file based on certain key values. Here is the file I am trying to process.

{
  "element1": "Test Element 1",
  "element2": {
    "tags": "internal",
    "data": {
      "data1": "Test Data 1",
      "data2": "Test Data 2"
    }
  },
  "element3": {
    "function1": {
      "tags": [
        "new",
        "internal"
      ]
    },
    "data3": "Test Data 3",
    "data4": "Test Data 4"
  },
  "element4": {
    "function2": {
      "tags": "new"
    },
    "data5": "Test Data 5"
  }
}

I want to remove every top-level element that contains a "tags" key whose value is "internal", or is an array containing "internal". So the result should look like this:

{
  "element1": "Test Element 1",
  "element4": {
    "function2": {
      "tags": "new"
    },
    "data5": "Test Data 5"
  }
}

I tried various approaches but could not get it to work with jq. Any ideas? Thanks.

Just to add some more complexity. Let's assume the json is:

{
  "element1": "Test Element 1",
  "element2": {
    "tags": "internal",
    "data": {
      "data1": "Test Data 1",
      "data2": "Test Data 2"
    }
  },
  "element3": {
    "function1": {
      "tags": [
        "new",
        "internal"
      ]
    },
    "data3": "Test Data 3",
    "data4": "Test Data 4"
  },
  "element4": {
    "function2": {
      "tags": "new"
    },
    "data5": "Test Data 5"
  },
  "structure1" : {
    "substructure1": {
      "element5": "Test Element 5",
      "element6": {
        "tags": "internal",
        "data6": "Test Data 6"
      }
    }
  }
}

and I want to get

{
  "element1": "Test Element 1",
  "element4": {
    "function2": {
      "tags": "new"
    },
    "data5": "Test Data 5"
  },
  "structure1" : {
    "substructure1": {
      "element5": "Test Element 5"
    }
  }
}

This is not trivial. Reliably finding elements that contain, at any depth, a tags key whose value is either the string internal or an array containing the string internal requires a compound boolean expression like the one below.

Once found, deleting them can be done using the del built-in.

del(.[] | first(select(recurse
  | objects
  | has("tags") and (.tags
    | . == "internal" or (
      type == "array" and index("internal")
    )
  )
)))
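
To see what the boolean test inside the filter does on its own, it can be applied to a few candidate tags values. This is only a sketch: the variable is_internal and the function check_tags are names made up for illustration, and the filter text is copied from the expression above.

```shell
# Hypothetical variable name; the filter mirrors the tags test used above.
is_internal='. == "internal" or (type == "array" and index("internal"))'

check_tags() {
  # $1 is a JSON value as it might appear under a "tags" key
  printf '%s' "$1" | jq -c "$is_internal"
}

# Try a plain string, a non-matching string, and two arrays.
for tags in '"internal"' '"new"' '["new","internal"]' '["new"]'; do
  printf '%s -> %s\n' "$tags" "$(check_tags "$tags")"
done
```

The type == "array" guard matters because index also accepts strings, where it would search for a substring instead of an array element.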


The following solution is written with a helper function for clarity. The helper function uses any for efficiency and is defined so as to add a dash of generality.

To understand the solution, it will be helpful to know about with_entries and the recursive-descent operator .., both of which are explained in the jq manual.

# Does the incoming JSON value contain an object which has a .tags
# value that is equal to $value or to an array containing $value ?
def hasTag($value):
  any(.. | select(type=="object") | .tags;
      . == $value or (type == "array" and index($value)));

Assuming the top-level JSON entity is a JSON object, we can now simply write:

with_entries( select( .value | hasTag("internal") | not) )
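
Put together, the helper and the with_entries call can be run against the first sample from the question roughly as follows. The shell variable names (input, filter, result) are illustrative only; the jq program is the one given above.

```shell
# First (simpler) sample input from the question.
input='{
  "element1": "Test Element 1",
  "element2": {
    "tags": "internal",
    "data": { "data1": "Test Data 1", "data2": "Test Data 2" }
  },
  "element3": {
    "function1": { "tags": ["new", "internal"] },
    "data3": "Test Data 3",
    "data4": "Test Data 4"
  },
  "element4": {
    "function2": { "tags": "new" },
    "data5": "Test Data 5"
  }
}'

filter='
def hasTag($value):
  any(.. | select(type=="object") | .tags;
      . == $value or (type == "array" and index($value)));

with_entries( select( .value | hasTag("internal") | not) )
'

# Only element1 and element4 should remain.
result=$(printf '%s' "$input" | jq -c "$filter")
printf '%s\n' "$result"
```

Because hasTag searches at every depth, this filter drops a whole top-level entry as soon as anything inside it is tagged. On the more complex input, structure1 would therefore be removed entirely rather than just element6.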

I think I figured out how to also solve the more complex case. I am now running:

walk(
  if type == "object" and has("tags")
     and (.tags | . == "internal" or (type == "array" and index("internal")))
  then del(.)
  else .
  end
)
| delpaths([paths as $path | select(getpath($path) == null) | $path])

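This removes every object whose own tags value is, or contains, "internal". For the input above, that means element2, function1 (but not the rest of element3) and element6 are removed. Note that the delpaths step also strips any null values that were already present in the input.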
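
Since the null-then-delpaths step cannot tell a freshly nulled object from a null that was already in the data, one possible variation is to drop tagged objects from their parent directly. The sketch below assumes a jq build where walk is available as a builtin; tagged is a hypothetical helper name and the sample input is invented for illustration.

```shell
# Hypothetical sample: "keep" is a legitimate null that should survive.
input='{"keep":null,"a":{"tags":"internal"},"b":{"c":{"tags":["new","internal"]},"d":1}}'

filter='
# tagged($v) is an illustrative helper, not part of jq.
def tagged($v):
  type == "object" and has("tags")
  and (.tags | . == $v or (type == "array" and index($v)));

# In every object, drop the entries whose value is a tagged object.
walk(if type == "object"
     then with_entries(select(.value | tagged("internal") | not))
     else . end)
'

result=$(printf '%s' "$input" | jq -c "$filter")
printf '%s\n' "$result"
```

Like the walk-based filter above, this removes only the tagged object itself, so b keeps its d entry. A tagged top-level object would not be removed, because it has no parent to drop it from.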
