In Logstash, how do I limit the depth of JSON properties in my logs that are turned into index fields in Elasticsearch?
I'm fairly new to the Elastic Stack. I'm using Logstash 6.4.0 to load JSON log data from Filebeat 6.4.0 into Elasticsearch 6.4.0. I'm finding that I'm getting way too many JSON properties converted into fields once I start using Kibana 6.4.0.
I know this because when I navigate to Kibana Discover and put in my index pattern of logstash-*, I'm getting an error message that states:
Discover: Trying to retrieve too many docvalue_fields. Must be less than or equal to: [100] but was [106]. This limit can be set by changing the [index.max_docvalue_fields_search] index level setting.
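(As a stopgap, the limit the error message names can be raised per index. This is only a sketch of the settings call in Kibana Dev Tools syntax; the new value of 200 is illustrative, and raising the limit does not address the underlying field explosion:)

```json
PUT logstash-*/_settings
{
  "index.max_docvalue_fields_search": 200
}
```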
If I navigate to Management > Kibana > Index Patterns, I see that I have 940 fields. It appears that each child property of my root JSON object (and many of those child properties have JSON objects as values, and so on) is automatically being parsed and used to create fields in my Elasticsearch logstash-* index.
So here's my question: how can I limit this automatic creation? Is it possible to do this by property depth? Is it possible to do this some other way?
Here is my Filebeat configuration (minus the comments):
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - d:/clients/company-here/rpms/logs/rpmsdev/*.json
  json.keys_under_root: true
  json.add_error_key: true

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 3

setup.kibana:

output.logstash:
  hosts: ["localhost:5044"]
Here is my current Logstash pipeline configuration:
input {
  beats {
    port => "5044"
  }
}
filter {
  date {
    match => [ "@timestamp" , "ISO8601" ]
  }
}
output {
  stdout {
    #codec => rubydebug
  }
  elasticsearch {
    hosts => [ "localhost:9200" ]
  }
}
Here is an example of a single log message that I am shipping (one row of my log file). Note that the JSON is completely dynamic and can change depending on what's being logged:
{
"@timestamp": "2018-09-06T14:29:32.128",
"level": "ERROR",
"logger": "RPMS.WebAPI.Filters.LogExceptionAttribute",
"message": "Log Exception: RPMS.WebAPI.Entities.LogAction",
"eventProperties": {
"logAction": {
"logActionId": 26268916,
"performedByUserId": "b36778be-6181-4b69-a0fe-e3a975ddcdd7",
"performedByUserName": "test.sga.danny@domain.net",
"performedByFullName": "Mike Manley",
"controller": "RpmsToMainframeOperations",
"action": "UpdateStoreItemPricing",
"actionDescription": "Exception while updating store item pricing for store item with storeItemId: 146926. An error occurred while sending the request. InnerException: Unable to connect to the remote server InnerException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 10.1.1.133:8800",
"url": "http://localhost:49399/api/RpmsToMainframeOperations/UpdateStoreItemPricing/146926",
"verb": "PUT",
"statusCode": 500,
"status": "Internal Server Error - Exception",
"request": {
"itemId": 648,
"storeId": 13,
"storeItemId": 146926,
"changeType": "price",
"book": "C",
"srpCode": "",
"multi": 0,
"price": "1.27",
"percent": 40,
"keepPercent": false,
"keepSrp": false
},
"response": {
"exception": {
"ClassName": "System.Net.Http.HttpRequestException",
"Message": "An error occurred while sending the request.",
"Data": null,
"InnerException": {
"ClassName": "System.Net.WebException",
"Message": "Unable to connect to the remote server",
"Data": null,
"InnerException": {
"NativeErrorCode": 10060,
"ClassName": "System.Net.Sockets.SocketException",
"Message": "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond",
"Data": null,
"InnerException": null,
"HelpURL": null,
"StackTraceString": " at System.Net.Sockets.Socket.InternalEndConnect(IAsyncResult asyncResult)\r\n at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)\r\n at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Exception& exception)",
"RemoteStackTraceString": null,
"RemoteStackIndex": 0,
"ExceptionMethod": "8\nInternalEndConnect\nSystem, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\nSystem.Net.Sockets.Socket\nVoid InternalEndConnect(System.IAsyncResult)",
"HResult": -2147467259,
"Source": "System",
"WatsonBuckets": null
},
"HelpURL": null,
"StackTraceString": " at System.Net.HttpWebRequest.EndGetRequestStream(IAsyncResult asyncResult, TransportContext& context)\r\n at System.Net.Http.HttpClientHandler.GetRequestStreamCallback(IAsyncResult ar)",
"RemoteStackTraceString": null,
"RemoteStackIndex": 0,
"ExceptionMethod": "8\nEndGetRequestStream\nSystem, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\nSystem.Net.HttpWebRequest\nSystem.IO.Stream EndGetRequestStream(System.IAsyncResult, System.Net.TransportContext ByRef)",
"HResult": -2146233079,
"Source": "System",
"WatsonBuckets": null
},
"HelpURL": null,
"StackTraceString": " at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()\r\n at RPMS.WebAPI.Infrastructure.RpmsToMainframe.RpmsToMainframeOperationsManager.<PerformOperationInternalAsync>d__14.MoveNext() in D:\\Century\\Clients\\PigglyWiggly\\RPMS\\PWADC.RPMS\\RPMSDEV\\RPMS.WebAPI\\Infrastructure\\RpmsToMainframe\\RpmsToMainframeOperationsManager.cs:line 114\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()\r\n at RPMS.WebAPI.Infrastructure.RpmsToMainframe.RpmsToMainframeOperationsManager.<PerformOperationAsync>d__13.MoveNext() in D:\\Century\\Clients\\PigglyWiggly\\RPMS\\PWADC.RPMS\\RPMSDEV\\RPMS.WebAPI\\Infrastructure\\RpmsToMainframe\\RpmsToMainframeOperationsManager.cs:line 96\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()\r\n at RPMS.WebAPI.Controllers.RpmsToMainframe.RpmsToMainframeOperationsController.<UpdateStoreItemPricing>d__43.MoveNext() in D:\\Century\\Clients\\PigglyWiggly\\RPMS\\PWADC.RPMS\\RPMSDEV\\RPMS.WebAPI\\Controllers\\RpmsToMainframe\\RpmsToMainframeOperationsController.cs:line 537\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at 
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Threading.Tasks.TaskHelpersExtensions.<CastToObject>d__1`1.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Controllers.ApiControllerActionInvoker.<InvokeActionAsyncCore>d__1.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Filters.ActionFilterAttribute.<CallOnActionExecutedAsync>d__6.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Web.Http.Filters.ActionFilterAttribute.<CallOnActionExecutedAsync>d__6.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Filters.ActionFilterAttribute.<ExecuteActionFilterAsyncCore>d__5.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Filters.ActionFilterAttribute.<CallOnActionExecutedAsync>d__6.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Web.Http.Filters.ActionFilterAttribute.<CallOnActionExecutedAsync>d__6.MoveNext()\r\n--- End of stack 
trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Filters.ActionFilterAttribute.<ExecuteActionFilterAsyncCore>d__5.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Controllers.ActionFilterResult.<ExecuteAsync>d__5.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Filters.AuthorizationFilterAttribute.<ExecuteAuthorizationFilterAsyncCore>d__3.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Controllers.AuthenticationFilterResult.<ExecuteAsync>d__5.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Controllers.ExceptionFilterResult.<ExecuteAsync>d__6.MoveNext()",
"RemoteStackTraceString": null,
"RemoteStackIndex": 0,
"ExceptionMethod": "8\nThrowForNonSuccess\nmscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\nSystem.Runtime.CompilerServices.TaskAwaiter\nVoid ThrowForNonSuccess(System.Threading.Tasks.Task)",
"HResult": -2146233088,
"Source": "mscorlib",
"WatsonBuckets": null,
"SafeSerializationManager": {
"m_serializedStates": [{
}]
},
"CLR_SafeSerializationManager_RealType": "System.Net.Http.HttpRequestException, System.Net.Http, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a"
}
},
"performedAt": "2018-09-06T14:29:32.1195316-05:00"
}
},
"logAction": "RPMS.WebAPI.Entities.LogAction"
}
I never ultimately found a way to limit the depth of the automatic field creation. I also posted my question in the Elastic forums and never got an answer. Between the time of my post and now, I have learned a lot more about Logstash.
My ultimate solution was to extract the JSON properties that I needed as fields, and then use the GREEDYDATA pattern in a grok filter to place the rest of the properties into an unextractedJson field, so that I could still query for values within that field in Elasticsearch.
Here is my new Filebeat configuration (minus the comments):
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - d:/clients/company-here/rpms/logs/rpmsdev/*.json
  #json.keys_under_root: true
  json.add_error_key: true

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 3

setup.kibana:

output.logstash:
  hosts: ["localhost:5044"]
Note that I commented out the json.keys_under_root setting; without it, Filebeat places the JSON-formatted log entry into a json field that is sent on to Logstash.
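(To illustrate what that means for each event arriving in Logstash, the parsed log line now sits under a single json field rather than at the root. This is only an illustrative, heavily trimmed sketch of the event shape, not captured output:)

```json
{
  "@timestamp": "2018-09-06T19:29:32.128Z",
  "source": "d:/clients/company-here/rpms/logs/rpmsdev/actionsCurrent.json",
  "json": {
    "level": "ERROR",
    "logger": "RPMS.WebAPI.Filters.LogExceptionAttribute",
    "eventProperties": { "logAction": { } }
  }
}
```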
Here is a snippet of my new Logstash pipeline configuration:
#...
filter {
  ###########################################################################
  # common date time extraction
  date {
    match => ["[json][time]", "ISO8601"]
    remove_field => ["[json][time]"]
  }
  ###########################################################################
  # configuration for the actions log
  if [source] =~ /actionsCurrent.json/ {
    if ("" in [json][eventProperties][logAction][performedByUserName]) {
      mutate {
        add_field => {
          "performedByUserName" => "%{[json][eventProperties][logAction][performedByUserName]}"
          "performedByFullName" => "%{[json][eventProperties][logAction][performedByFullName]}"
        }
        remove_field => [
          "[json][eventProperties][logAction][performedByUserName]",
          "[json][eventProperties][logAction][performedByFullName]"]
      }
    }
    mutate {
      add_field => {
        "logFile" => "actions"
        "logger" => "%{[json][logger]}"
        "level" => "%{[json][level]}"
        "performedAt" => "%{[json][eventProperties][logAction][performedAt]}"
        "verb" => "%{[json][eventProperties][logAction][verb]}"
        "url" => "%{[json][eventProperties][logAction][url]}"
        "controller" => "%{[json][eventProperties][logAction][controller]}"
        "action" => "%{[json][eventProperties][logAction][action]}"
        "actionDescription" => "%{[json][eventProperties][logAction][actionDescription]}"
        "statusCode" => "%{[json][eventProperties][logAction][statusCode]}"
        "status" => "%{[json][eventProperties][logAction][status]}"
      }
      remove_field => [
        "[json][logger]",
        "[json][level]",
        "[json][eventProperties][logAction][performedAt]",
        "[json][eventProperties][logAction][verb]",
        "[json][eventProperties][logAction][url]",
        "[json][eventProperties][logAction][controller]",
        "[json][eventProperties][logAction][action]",
        "[json][eventProperties][logAction][actionDescription]",
        "[json][eventProperties][logAction][statusCode]",
        "[json][eventProperties][logAction][status]",
        "[json][logAction]",
        "[json][message]"
      ]
    }
    mutate {
      convert => {
        "statusCode" => "integer"
      }
    }
    grok {
      match => { "json" => "%{GREEDYDATA:unextractedJson}" }
      remove_field => ["json"]
    }
  }
}
# ...
Note the add_field configuration options in the mutate commands that extract the properties into named fields, followed by the remove_field configuration options that remove those properties from the JSON. At the end of the filter snippet, notice the grok command that gobbles up the rest of the JSON and places it in the unextractedJson field. Finally, and most importantly, I remove the json field that was provided by Filebeat. That last bit saves me from exposing all that JSON data to Elasticsearch/Kibana.
This solution takes log entries that look like this:
{ "time": "2018-09-13T13:36:45.376", "level": "DEBUG", "logger": "RPMS.WebAPI.Filters.LogActionAttribute", "message": "Log Action: RPMS.WebAPI.Entities.LogAction", "eventProperties": {"logAction": {"logActionId":26270372,"performedByUserId":"83fa1d72-fac2-4184-867e-8c2935a262e6","performedByUserName":"rpmsadmin@domain.net","performedByFullName":"Super Admin","clientIpAddress":"::1","controller":"Account","action":"Logout","actionDescription":"Logout.","url":"http://localhost:49399/api/Account/Logout","verb":"POST","statusCode":200,"status":"OK","request":null,"response":null,"performedAt":"2018-09-13T13:36:45.3707739-05:00"}}, "logAction": "RPMS.WebAPI.Entities.LogAction" }
And turns them into Elasticsearch documents that look like this:
{
  "_index": "actions-2018.09.13",
  "_type": "doc",
  "_id": "xvA41GUBIzzhuC5epTZG",
  "_version": 1,
  "_score": null,
  "_source": {
    "level": "DEBUG",
    "tags": [
      "beats_input_raw_event"
    ],
    "@timestamp": "2018-09-13T18:36:45.376Z",
    "status": "OK",
    "unextractedJson": "{\"eventProperties\"=>{\"logAction\"=>{\"performedByUserId\"=>\"83fa1d72-fac2-4184-867e-8c2935a262e6\", \"logActionId\"=>26270372, \"clientIpAddress\"=>\"::1\"}}}",
    "action": "Logout",
    "source": "d:\\path\\actionsCurrent.json",
    "actionDescription": "Logout.",
    "offset": 136120,
    "@version": "1",
    "verb": "POST",
    "statusCode": 200,
    "controller": "Account",
    "performedByFullName": "Super Admin",
    "logger": "RPMS.WebAPI.Filters.LogActionAttribute",
    "input": {
      "type": "log"
    },
    "url": "http://localhost:49399/api/Account/Logout",
    "logFile": "actions",
    "host": {
      "name": "Development5"
    },
    "prospector": {
      "type": "log"
    },
    "performedAt": "2018-09-13T13:36:45.3707739-05:00",
    "beat": {
      "name": "Development5",
      "hostname": "Development5",
      "version": "6.4.0"
    },
    "performedByUserName": "rpmsadmin@domain.net"
  },
  "fields": {
    "@timestamp": [
      "2018-09-13T18:36:45.376Z"
    ],
    "performedAt": [
      "2018-09-13T18:36:45.370Z"
    ]
  },
  "sort": [
    1536863805376
  ]
}
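(Because the default dynamic mapping indexes unextractedJson as an ordinary text field, values buried in the leftover JSON remain searchable. This is only an illustrative sketch of one way to do it, using a query_string query against that field; the index name and quoted value are taken from the example document:)

```json
GET actions-*/_search
{
  "query": {
    "query_string": {
      "default_field": "unextractedJson",
      "query": "\"83fa1d72-fac2-4184-867e-8c2935a262e6\""
    }
  }
}
```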
The depth limit can be set per index directly in Elasticsearch.
Elasticsearch field mapping documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html#mapping-limit-settings
From the docs:

index.mapping.depth.limit
The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc. Default is 20.
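(As a sketch, the setting can be applied when creating an index, or in an index template so it covers future daily indices; the index name and limit value here are illustrative:)

```json
PUT my-index
{
  "settings": {
    "index.mapping.depth.limit": 2
  }
}
```

Note that this limits the mapping rather than trimming documents: indexing a document whose objects nest deeper than the limit fails with a mapping error instead of silently dropping the deeper fields.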
Related answer: Limiting the nested fields in Elasticsearch