
Get JSON values using awk or jq

My JSON files look like this.

I have more than 5000 files, each named like xxxx.json.

example file 1000.json

[
  {
    "gender": {
      "value": "Female"
    },
    "age": 38.58685,
    "age_group": "adult"
  },
  {
    "gender": {
      "value": "Male"
    },
    "age": 26.64953,
    "age_group": "adult"
  }
]

example file 2000.json

[
  {
    "gender": {
      "value": "Male"
    },
    "age": 63.8272,
    "age_group": "adult"
  },
  {
    "gender": {
      "value": "Male"
    },
    "age": 11.8287,
    "age_group": "child"
  }
]

Desired output, in a single file output.txt:

1000 & Female,Male & 38,26 & adult,adult
2000 & Male,Male & 63,11 & adult,child

It's doable in jq with some string interpolation:

$ find . -name "*.json" -exec jq -r \
  '(input_filename | gsub("^\\./|\\.json$";"")) as $fname |
   (map(.gender.value) | unique | join(",")) as $genders |
   (map(.age|floor|tostring) | join(",")) as $ages |
   (map(.age_group) | unique | join(",")) as $age_groups |
   "\($fname) & \($genders) & \($ages) & \($age_groups)"' '{}' +
1000 & Female,Male & 38,26 & adult
2000 & Male & 63,11 & adult,child

The input_filename builtin returns the name of the file currently being read. For the other fields, just collect the needed values from the top-level array with map and join them into comma-separated strings (using join instead of @csv, to avoid the quotes @csv adds around string values).
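
For reference, here is a quick comparison of @csv and join on a small array (a throwaway invocation, not part of the solution itself):

$ echo '["Female","Male"]' | jq -r '@csv'
"Female","Male"
$ echo '["Female","Male"]' | jq -r 'join(",")'
Female,Male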


The find wrapper avoids the risk that a plain jq -r '...' *.json exceeds the maximum command-line length, since you said you have more than 5000 files. With the trailing + (instead of ;), -exec behaves much like xargs: it runs jq as few times as possible, passing as many filenames as fit each time rather than invoking it once per file, which is more efficient.
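
To collect everything into the single output.txt the question asks for, redirect the output of the whole find command (a sketch; the '...' stands for the same jq filter shown above):

$ find . -name "*.json" -exec jq -r '...' '{}' + > output.txt

Note that find does not guarantee any particular file order, so pipe the result through sort first if you need the lines ordered by filename.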
