Is there a way to extract a value from a JSON field using a processor in Apache NiFi and substitute that value into another JSON file?

I have created a workflow in Apache NiFi that pulls CSV attachments from Gmail and converts them to JSON. What I am stuck on is extracting three values (clientip, Country, useragent) from the JSON I currently have and substituting them into another JSON document that will be used to run alerts in another program. I am not sure which processors would achieve this. Any tips would be greatly appreciated.

I have tried extracting attributes and using JoltTransformJson, but I cannot get either of them to work as intended.

The first JSON, which I get from converting the CSV file:

{
  "clientip" : "116.255.157.126",
  "Country" : "China",
  "host" : "teachinglaw-prod.uis.georgetown.edu",
  "useragent" : "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)",
  "uri" : "//Config_Shell.php"
}

The second JSON, which I wrote, needs its "data", "message", and "data" values updated from the first JSON:

{
    "title": "cat7-SQL Injection",
    "description": "alert description",
    "type": "Internal ",
    "source": "Splunk ",
    "sourceRef": "Splunk alert ",
    "severity": 2,
    "tlp": 2,
    "artifacts": [{
            "dataType": "ip",
            "data": "176.121.14.180",
            "message": "Belize",
            "tags": ["SQL Injection"]
        },
        {
            "dataType": "user - agent",
            "data": "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.53 Safari/525.19",
            "tags": ["SQL Injection"]
        }
    ],
    "caseTemplate": "SQL Injection"
}

I need to find one or more processors that will give me this result after the values have been merged or substituted:

{
    "title": "cat7-SQL Injection",
    "description": "alert description",
    "type": "Internal ",
    "source": "Splunk ",
    "sourceRef": "Splunk alert ",
    "severity": 2,
    "tlp": 2,
    "artifacts": [{
            "dataType": "ip",
            "data": "116.255.157.126",
            "message": "China",
            "tags": ["SQL Injection"]
        },
        {
            "dataType": "user - agent",
            "data": "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)",
            "tags": ["SQL Injection"]
        }
    ],
    "caseTemplate": "SQL Injection"
}

I would suggest using the EvaluateJsonPath processor to extract the desired JSON values to flowfile attributes, and then routing to ReplaceText and using Expression Language to replace template tokens with the attribute values. For example, given this "input JSON":

{
  "clientip" : "116.255.157.126",
  "Country" : "China",
  "host" : "teachinglaw-prod.uis.georgetown.edu",
  "useragent" : "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)",
  "uri" : "//Config_Shell.php"
}

Your EvaluateJsonPath processor should have the following configuration (any property not listed keeps its default value, and the last three are "Dynamic Properties" added with the "+" button at the top right of the table):

  • Destination : flowfile-attribute
  • ip : $.clientip
  • message : $.Country
  • user_agent : $.useragent
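
To make the extraction step concrete, here is a minimal Python sketch of what this configuration is meant to produce. It is not NiFi code: the helper names (evaluate_simple_path, extract_attributes) are hypothetical, and the path resolver only handles top-level "$.field" paths, which is all this example needs.

```python
import json

# Flowfile content: the "input JSON" from the question.
flowfile_content = """{
  "clientip" : "116.255.157.126",
  "Country" : "China",
  "host" : "teachinglaw-prod.uis.georgetown.edu",
  "useragent" : "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)",
  "uri" : "//Config_Shell.php"
}"""

# The three dynamic properties: attribute name -> JSONPath expression.
json_paths = {
    "ip": "$.clientip",
    "message": "$.Country",
    "user_agent": "$.useragent",
}


def evaluate_simple_path(document, path):
    """Resolve a '$.field' path against a parsed JSON object (top-level keys only)."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    return document[path[2:]]


def extract_attributes(content, paths):
    """Return a dict of attribute name -> extracted value, all as strings."""
    document = json.loads(content)
    return {name: str(evaluate_simple_path(document, p)) for name, p in paths.items()}


attributes = extract_attributes(flowfile_content, json_paths)
print(attributes)
```

After this step the flowfile content is unchanged; only the three attributes (ip, message, user_agent) have been added alongside it.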

The next processor is a ReplaceText processor with the following configuration, which replaces the entire flowfile content with the JSON template:

  • Search Value : (?s)(^.*$)
  • Replacement Value : {...JSON template below...}

The JSON template is the following:

{
    "title": "cat7-SQL Injection",
    "description": "alert description",
    "type": "Internal ",
    "source": "Splunk ",
    "sourceRef": "Splunk alert ",
    "severity": 2,
    "tlp": 2,
    "artifacts": [{
            "dataType": "ip",
            "data": "<template_ip>",
            "message": "<template_message>",
            "tags": ["SQL Injection"]
        },
        {
            "dataType": "user - agent",
            "data": "<template_user_agent>",
            "tags": ["SQL Injection"]
        }
    ],
    "caseTemplate": "SQL Injection"
}

Each token of the form <template_some_value> is matched by the regular expression in the next step, and the some_value part is captured as group 2, referenced as $2.

Finally, another ReplaceText is used to inject the attribute value into each template token location. The configuration is:

  • Search Value : <(template_(\w+))>
  • Replacement Value : ${${'$2'}}
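
The token substitution can be sketched in Python to show how capture group 2 selects the attribute. This is only an illustration of the regex logic, not NiFi itself: fill_template is a hypothetical helper, the template is shortened to the artifacts array, and a token with no matching attribute is assumed here to become an empty string.

```python
import json
import re

# Attributes as they would be set by the EvaluateJsonPath step.
attributes = {
    "ip": "116.255.157.126",
    "message": "China",
    "user_agent": "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)",
}

# Shortened version of the JSON template (artifacts only).
template = """{
    "artifacts": [{
            "dataType": "ip",
            "data": "<template_ip>",
            "message": "<template_message>"
        },
        {
            "dataType": "user - agent",
            "data": "<template_user_agent>"
        }
    ]
}"""

# Group 1 is the whole token name (e.g. "template_ip");
# group 2 is the part after "template_" (e.g. "ip").
TOKEN = re.compile(r"<(template_(\w+))>")


def fill_template(text, attrs):
    """Replace each <template_name> token with attrs[name]."""
    # A function replacement avoids backslash and group-reference
    # interpretation inside the substituted values.
    return TOKEN.sub(lambda match: attrs.get(match.group(2), ""), text)


result = fill_template(template, attributes)
print(result)

# The filled template should still parse as JSON.
parsed = json.loads(result)
```

As with the ReplaceText approach, the values are inserted verbatim, so a value containing a double quote or a backslash would produce invalid JSON unless it is escaped first.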

The final output will look like: 最终输出如下所示:

[Screenshot: flowfile content after the values are inserted]
