
Query GCP logger for distinct logs based on a field

Here is a sample log from my project in GCP logger:

{
  "insertId": "________",
  "jsonPayload": {
    "stacktrace": "github.com_____",
    "level": "error",
    "msg": "could not update usage",
    "caller": "pkg/_______.go:118",
    "ts": ______.______,
    "requestID": "c7taeaa23akg00e8r0tg",
    "error": "write exception: write errors: [The field 'fieldName' must be an array but is of type null in document {_id: ObjectId('objectID001')}]"
  },
  "resource": {
    "type": "cloud_run_revision",
    "labels": {
      "configuration_name": "configName",
      "service_name": "serviceName",
      "location": "us-central1",
      "project_id": "projectID",
      "revision_name": "revisionName"
    }
  },
  "timestamp": "2022-02-02T15:45:45.867386Z",
  "labels": {
    "instanceId": "____________"
  },
  "logName": "projects/_____/logs/run.googleapis.com%2Fstderr",
  "receiveTimestamp": "2022-02-02T15:45:45.967298989Z"
}

The issue is that there are many logs with this exact content. My question is whether there is a query (or a set of queries) that can be used to retrieve only one log per distinct jsonPayload.error .

For example, if there are 6 logs and 3 of them have the same jsonPayload.error , what I need is to get 4 logs back: the duplicates are collapsed so that only one of them appears in the output, alongside the 3 other distinct logs.

Interesting question.

Google's Logging query language is a filtering mechanism. Applying a filter reduces the number of entries returned, but it does not permit transforming or reformatting the entries that come back.

To transform the results, you're gonna need a bigger boat... I recommend you consider Google's Cloud SDK command-line tool, aka gcloud .

Using this, you can filter logs with the queries that you've developed in the Log Viewer:

gcloud logging read "${FILTER}" \
--project=${PROJECT}

And (!) you can transform ( --format ) the results:

gcloud logging read "${FILTER}" \
--format="${FORMAT}" \
--project=${PROJECT}

NOTE gcloud 's formatting does not appear to include unique|distinct functions, so we'll resort to standard Linux commands ( sort | uniq ) to achieve this.

As an example, here's a (hopefully generic) query of cloud.audit.logging operations:

PROJECT="..." # Your Project ID

# You would use "logName=\"projects/${PROJECT}/logs/run.googleapis.com%2Fstderr\""
FILTER="logName=\"projects/${PROJECT}/logs/cloudaudit.googleapis.com%2Factivity\""

# You would use "value(jsonPayload.error)"
FORMAT="value(operation.producer)"

gcloud logging read  "${FILTER}" \
--project=${PROJECT} \
--format="${FORMAT}" \
--limit=50 \
 > test.log

cat test.log | sort | uniq

Yields:

cloudfunctions.googleapis.com
compute.googleapis.com
container.googleapis.com
k8s.io
servicemanagement.googleapis.com
serviceusage.googleapis.com
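If you also want to see how often each distinct value occurs, uniq -c adds counts. A small local simulation (sample.log here is a hypothetical stand-in for the test.log produced above):

```shell
# Stand-in values for the producers extracted into test.log above
printf 'a\nb\na\na\nc\n' > sample.log

# Count occurrences of each distinct value, most frequent first
sort sample.log | uniq -c | sort -rn
```

In the real pipeline you would run sort test.log | uniq -c | sort -rn instead.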

NOTE gcloud logging read "${FILTER}" submits the filter to the platform and runs server-side. The results (which may be large) are then --format 'ted client-side, and this can be time- and processor-consuming. In the example above, to avoid repeatedly retrieving the data from the server and then piping it through sort and uniq , it's more efficient to dump the logs into a file first. I've also used --limit to artificially restrict the number of results returned for testing purposes. You may want to use a time filter or some other constraint.
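For instance, a time window can be expressed directly in the filter. A sketch (the timestamp is illustrative, and PROJECT is a placeholder; --freshness is a related gcloud convenience flag):

```shell
PROJECT="..." # Your Project ID

# Constrain the filter to entries at or after an (illustrative) timestamp
FILTER="logName=\"projects/${PROJECT}/logs/run.googleapis.com%2Fstderr\" AND timestamp>=\"2022-02-02T00:00:00Z\""

# Alternatively, --freshness limits results by age (default: 1d), e.g.:
# gcloud logging read "${FILTER}" --project=${PROJECT} --freshness=6h
```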

Because you referenced jsonPayload , you can also use gcloud... --format=json(...) to extract JSON-formatted logs. As described above, gcloud includes formatting functionality but, as also shown above, it's sometimes easiest to use general-purpose tools. In this case, jq provides powerful ways to transform JSON.

# JSON output this time, to match the jq step and the results below
FORMAT="json(operation.producer)"

gcloud logging read  "${FILTER}" \
--project=${PROJECT} \
--format="${FORMAT}" \
--limit=50 \
> test.json

cat test.json | jq -r unique

Yields:

[
  null,
  {
    "operation": {
      "producer": "cloudfunctions.googleapis.com"
    }
  },
  {
    "operation": {
      "producer": "compute.googleapis.com"
    }
  },
  {
    "operation": {
      "producer": "container.googleapis.com"
    }
  },
  {
    "operation": {
      "producer": "k8s.io"
    }
  },
  {
    "operation": {
      "producer": "servicemanagement.googleapis.com"
    }
  },
  {
    "operation": {
      "producer": "serviceusage.googleapis.com"
    }
  }
]
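Applied to the question's logs, you would read with --format=json and deduplicate on jsonPayload.error with jq's unique_by. A sketch, simulated below with a small local sample instead of a live gcloud read (the error strings are invented):

```shell
# Stand-in for: gcloud logging read "${FILTER}" --project=${PROJECT} --format=json > entries.json
cat > entries.json <<'EOF'
[
  {"jsonPayload": {"error": "write exception: A", "msg": "could not update usage"}},
  {"jsonPayload": {"error": "write exception: A", "msg": "could not update usage"}},
  {"jsonPayload": {"error": "write exception: B", "msg": "could not update usage"}}
]
EOF

# Keep one full entry per distinct jsonPayload.error (here: 2 of the 3 entries)
jq 'unique_by(.jsonPayload.error)' entries.json
```

Note that unique_by also sorts the output by the key; if you need the original ordering, more jq work is required.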
