
AWS Kinesis Firehose: how to put multiple files containing JSON using the AWS CLI and bash

I have more than 100 files in which each line is a JSON object. They look something like this (no commas between lines and no enclosing []):

{"one":"one","two":{"tree":...}}
{"one":"one","two":{"tree":...}}
...
{"one":"one","two":{"tree":...}}

To use aws firehose put-record-batch, the records file needs to be in this format:

[
  {
    "Data": blob
  },
  {
    "Data": blob
  },
  ...
]

I want to put all of these files into Firehose from the terminal.

I'm looking to write a shell script that looks something like this:

for file in files
do
  aws firehose put-record-batch --delivery-stream-name <name> --records file://$file
done

So there are two questions:

  1. How to transform the files into the applicable format
  2. How to iterate through all the files
for file in *.json; do
    # Slurp all JSON lines into one array, then replace the original file
    jq -s . "$file" > "$file.tmp" && mv "$file.tmp" "$file"
done

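This reads every .json file in the current directory, slurps its lines into a single JSON array with jq -s, and overwrites the original file with the result. Note that the result is a plain array of the original objects; each element still has to be wrapped as {"Data": ...} to match the put-record-batch format shown in the question.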
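To get all the way to the [{"Data": ...}] shape, the wrapping can also be done in jq. The following is a minimal sketch, not a definitive implementation: to_firehose_batches is a hypothetical helper name, the output directory and the batch- file prefix are assumptions, and it assumes each record should carry the original line as text with a trailing newline. It splits the records into groups of at most 500 because PutRecordBatch accepts up to 500 records per call.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a newline-delimited JSON file into batch files
# shaped like [{"Data": "..."}, ...], one batch per output file.
to_firehose_batches() {
  local src="$1" outdir="$2" size="${3:-500}"
  mkdir -p "$outdir"
  # -n with `inputs` reads every JSON value from the file.
  # tojson re-serializes each object as a string; "\n" keeps records line-delimited.
  # range/3 slices the record array into groups of at most $n,
  # and -c prints each group (one batch) on its own line.
  # Note: the whole source file is held in memory while slicing.
  jq -c -n --argjson n "$size" '
    [inputs | {Data: (tojson + "\n")}]
    | range(0; length; $n) as $i
    | .[$i:$i + $n]
  ' "$src" |
    # One line per file: batch-aaaa, batch-aaab, ...
    split -l 1 -a 4 - "$outdir/batch-"
}
```

For example, to_firehose_batches events.json out writes out/batch-aaaa, out/batch-aaab, and so on, where events.json stands in for one of the source files. The 500-record grouping does not check the per-request size limit, so very large records may need a smaller third argument.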

Or, if you do not have jq, here is an alternative that uses Python's json module:

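for file in *.json; do
  # Parse every non-empty line as JSON and rewrite the file as a single JSON array
  python3 -c 'import json, sys; json.dump([json.loads(l) for l in sys.stdin if l.strip()], sys.stdout)' \
    < "$file" > "$file.tmp" && mv "$file.tmp" "$file"
done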
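For the second question, iterating over the files, a loop along these lines may work once the files are in the [{"Data": ...}] format. This is a sketch under assumptions: send_batches and the DRY_RUN switch are hypothetical names introduced here, and the stream name is a placeholder. It defaults to printing the commands rather than running them. When it does run, it reads FailedPutCount from the response, since a call can succeed while individual records are rejected.

```shell
#!/usr/bin/env bash
# Hypothetical helper: send each batch file to a Firehose delivery stream.
# Usage: send_batches <stream-name> <file>...
# DRY_RUN defaults to 1, which only prints the commands.
send_batches() {
  local stream="$1"; shift
  local file failed
  for file in "$@"; do
    # Skip arguments that are not files (for example an unmatched glob)
    [ -e "$file" ] || continue
    if [ "${DRY_RUN:-1}" = 1 ]; then
      printf 'aws firehose put-record-batch --delivery-stream-name %s --records file://%s\n' \
        "$stream" "$file"
      continue
    fi
    # --query/--output extract the number of rejected records from the response
    failed=$(aws firehose put-record-batch \
      --delivery-stream-name "$stream" \
      --records "file://$file" \
      --query FailedPutCount --output text) ||
      { echo "request failed: $file" >&2; continue; }
    [ "$failed" = 0 ] || echo "$file: $failed record(s) rejected" >&2
  done
}
```

A real run would look like DRY_RUN=0 send_batches my-stream ./*.json, with my-stream replaced by the actual delivery stream name. With AWS CLI version 2, blob values are expected to be base64-encoded by default, so plain-text Data values may additionally need --cli-binary-format raw-in-base64-out; that depends on the installed CLI version and is not verified here.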
