
How do I get startup-script logs from Container-optimized OS in a GCE instance?

I'm running a container-optimized compute instance with this startup-script:

#!/bin/bash

mkdir /home/my-app
cd /home/my-app
export HOME=/home/my-app

docker-credential-gcr configure-docker


docker run --rm --log-driver=gcplogs --name my-app --security-opt seccomp=./config.json gcr.io/my-project/my-app:latest

The --log-driver and --name flags are set according to the GCP community guide and the Docker docs.

Yet I see no logs from the container's boot-up.

Also, when I SSH into the instance and run logger "hello from logger", I don't see it showing up in Cloud Logging. I've tried converting the query to an advanced filter and removing everything except a "hello from logger" string filter.

How do I properly set up logging? I'm using bunyan inside my NodeJS app, but when the app fails I have absolutely no visibility. I'd love to have all the logs from journalctl in Cloud Logging, or at least the startup-script part of journalctl. Right now I'm retrieving them by SSHing into the instance and running journalctl -r | grep startup-script.
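For reference, this is roughly how I check from my workstation whether anything from the instance reaches Cloud Logging at all (the project ID here is a placeholder):

gcloud logging read \
    'resource.type="gce_instance" AND textPayload:"hello from logger"' \
    --project=my-project --limit=10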

Update

Access scopes are correctly set:

Stackdriver Logging API: Write Only

I'm using the default Compute Engine service account. Here is the command I'm creating this VM with:

gcloud compute instance-templates create $APP_ID-template \
    --scopes=bigquery,default,compute-rw,storage-rw \
    --image-project=cos-cloud \
    --image-family=cos-77-lts \
    --machine-type=e2-medium \
    --metadata-from-file=startup-script=./start.sh \
    --tags=http-server,https-server

gcloud compute instance-groups managed create $APP_ID-group \
    --size=1 \
    --template=$APP_ID-template

Startup script:

#!/bin/bash

mkdir /home/startDir
cd /home/startDir
export HOME=/home/startDir

docker-credential-gcr configure-docker

docker run --log-driver=gcplogs --name my-app --security-opt seccomp=./config.json gcr.io/project-id/app:latest

This VM runs a NodeJS script. I'm not providing JSON keys to the NodeJS script. The bunyan logger is correctly sending logs to Cloud Logging; it only fails to send logs when the server completely crashes.

The Logging API is enabled. This is what I'm getting:

● stackdriver-logging.service - Fluentd container for Stackdriver Logging
   Loaded: loaded (/usr/lib/systemd/system/stackdriver-logging.service; static; vendor preset: disabled)
   Active: inactive (dead)

when running the sudo systemctl status stackdriver-logging command in the VM.
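For what it's worth, these are the standard systemd commands I use to poke at that unit from inside the VM (the unit name is the one shown above):

# Try to start the logging agent manually, then look at its recent output.
sudo systemctl start stackdriver-logging
sudo journalctl -u stackdriver-logging -n 50 --no-pager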

Google Compute Engine Container-Optimized OS has Operations Logging (formerly Stackdriver) enabled by default.

In my list of problems and solutions, Problem #3 is the most common in my experience.

Possible Problem #1:

By default, new instances have the following scopes enabled:

  • Stackdriver Logging API: Write Only
  • Stackdriver Monitoring API: Write Only

If you have modified the instance's Access Scopes, make sure that the Stackdriver scopes are enabled. This requires stopping the instance to modify scopes.
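As a rough sketch (the instance name, zone, and exact scope list are placeholders), checking and updating the scopes from the CLI could look like this:

# Show the scopes currently attached to the instance.
gcloud compute instances describe my-instance --zone=us-central1-a \
    --format="value(serviceAccounts[].scopes)"

# Scopes can only be changed while the instance is stopped.
gcloud compute instances stop my-instance --zone=us-central1-a
gcloud compute instances set-service-account my-instance --zone=us-central1-a \
    --scopes=logging-write,monitoring-write
# (add --service-account=... if the instance should keep a non-default service account)
gcloud compute instances start my-instance --zone=us-central1-a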

Possible Problem #2:

If you are using a custom service account for this instance, make sure the service account has at least the role roles/logging.logWriter. Without this role or a similar one, the logger will fail.
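For example, granting that role with gcloud could look like this (the project ID and service-account email are placeholders):

gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:my-app-sa@my-project.iam.gserviceaccount.com" \
    --role="roles/logging.logWriter"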

Possible Problem #3:

A common problem is that the Project Owner did not enable the Cloud Logging API. Without enabling this API, the instance logger will fail.

To verify whether the logger within the instance is failing, SSH into the instance and execute this command:

sudo systemctl status stackdriver-logging

If you see error messages related to the Logging API, then enable the Cloud Logging API.

Enable the Cloud Logging API via the CLI:

gcloud services enable logging.googleapis.com --project=<PROJECT_ID>

Or via the Google Cloud Console:

https://console.cloud.google.com/apis/library/logging.googleapis.com
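To double-check afterwards, the API should show up in the project's list of enabled services (the project ID is a placeholder):

gcloud services list --enabled --project=my-project | grep logging.googleapis.com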

Possible Problem #4:

When creating an instance via the CLI, you need to specify the following command-line option, otherwise the logging service will not start:

--metadata=google-logging-enabled=true
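Applied to the instance-template command from the question, the fix would be one extra flag; --metadata and --metadata-from-file set different keys here, so they should be combinable in a single invocation (everything else below is unchanged from the OP's command):

gcloud compute instance-templates create $APP_ID-template \
    --scopes=bigquery,default,compute-rw,storage-rw \
    --image-project=cos-cloud \
    --image-family=cos-77-lts \
    --machine-type=e2-medium \
    --metadata=google-logging-enabled=true \
    --metadata-from-file=startup-script=./start.sh \
    --tags=http-server,https-server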

[UPDATE 01/22/2021]

The OP has two problems. 1) The Stackdriver service was not running; the steps above solved that problem. 2) The startup-script output was not going to Stackdriver.

The current Container-Optimized OS configuration only forwards journal entries with priority warning or higher, so startup-script logs are not sent to Stackdriver.

The log level is set in the file /etc/stackdriver/logging.config.d/fluentd-lakitu.conf.

Look for the section "Collects all journal logs with priority >= warning". The PRIORITY list covers 0 through 4. If you add "5" and "6" to the list, then the startup-script entries are logged in Operations Logging.

You can change the log level, but this change does not persist across reboots. I have not found a solution to make the change permanent.
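As a rough sketch (the exact layout of that section in fluentd-lakitu.conf may differ; the filters parameter shown in the comment is an assumption based on the fluentd systemd input plugin), the edit amounts to extending the PRIORITY list and restarting the agent:

# In /etc/stackdriver/logging.config.d/fluentd-lakitu.conf, the
# "Collects all journal logs with priority >= warning" source might
# look roughly like this after adding priorities 5 and 6:
#
#   <source>
#     @type systemd
#     filters [{ "PRIORITY": ["0", "1", "2", "3", "4", "5", "6"] }]
#     ...
#   </source>
#
# Restart the logging agent so the new filter takes effect
# (remember the change is lost on reboot):
sudo systemctl restart stackdriver-logging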

I'm able to see the startup-script logs in Cloud Logging using the following advanced filter:

resource.type="gce_instance"
resource.labels.instance_id="1234567890"
protoPayload.metadata.instanceMetadataDelta.addedMetadataKeys="startup-script"
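For reference, the same filter can also be run from a workstation with gcloud (the instance ID and project ID are placeholders):

gcloud logging read \
    'resource.type="gce_instance" AND resource.labels.instance_id="1234567890" AND protoPayload.metadata.instanceMetadataDelta.addedMetadataKeys="startup-script"' \
    --project=my-project --limit=20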

As per the GCP docs, to view the startup-script logs you need to log in to the instance; the startup-script output is written to the following log files:

  • CentOS and RHEL: /var/log/messages
  • Debian: /var/log/daemon.log
  • Ubuntu: /var/log/syslog
  • SLES: /var/log/messages

To save some time, you can use this command to see the logs:

gcloud compute ssh instance-id --project your-project --zone us-central1-a --command="sudo journalctl -r | grep startup-script"
