
How to use Fluentd to parse multi-line logs from Kubernetes pod output

I am trying to implement an EFK stack in our current environment with Fluentd.

The configuration I have is:

    <source>
      ...
      path /var/log/containers/*.log
      ...
    </source>

which is supposed to grab all the standard output of all pods on the worker node. But when I SSH into that node and inspect the format of the output, I find that standard output spanning multiple lines is broken into separate log entries, for example:

{"log":"Error [ERR_HTTP_HEADERS_SENT]: Cannot set headers after they are sent to the client\n","stream":"stderr","time":"2021-10-29T18:26:26.011079366Z"}
{"log":"    at ServerResponse.setHeader (_http_outgoing.js:530:11)\n","stream":"stderr","time":"2021-10-29T18:26:26.011130167Z"}
{"log":"    at sendEtagResponse (/app/node_modules/next/dist/next-server/server/send-payload.js:6:12)\n","stream":"stderr","time":"2021-10-29T18:26:26.011145267Z"}
{"log":"    at sendData (/app/node_modules/next/dist/next-server/server/api-utils.js:32:479)\n","stream":"stderr","time":"2021-10-29T18:26:26.011229869Z"}
{"log":"    at ServerResponse.apiRes.send (/app/node_modules/next/dist/next-server/server/api-utils.js:6:250)\n","stream":"stderr","time":"2021-10-29T18:26:26.011242369Z"}
{"log":"    at exports.modules.3626.__webpack_exports__.default (/app/.next/server/pages/api/users/[id]/organizations.js:350:34)\n","stream":"stderr","time":"2021-10-29T18:26:26.011252769Z"}
{"log":"    at runMicrotasks (\u003canonymous\u003e)\n","stream":"stderr","time":"2021-10-29T18:26:26.011264269Z"}
{"log":"    at processTicksAndRejections (internal/process/task_queues.js:97:5)\n","stream":"stderr","time":"2021-10-29T18:26:26.011275069Z"}
{"log":"    at async apiResolver (/app/node_modules/next/dist/next-server/server/api-utils.js:8:1)\n","stream":"stderr","time":"2021-10-29T18:26:26.011284869Z"}
{"log":"    at async Server.handleApiRequest (/app/node_modules/next/dist/next-server/server/next-server.js:66:462)\n","stream":"stderr","time":"2021-10-29T18:26:26.01129647Z"}
{"log":"    at async Object.fn (/app/node_modules/next/dist/next-server/server/next-server.js:58:580) {\n","stream":"stderr","time":"2021-10-29T18:26:26.01130717Z"}
{"log":"  code: 'ERR_HTTP_HEADERS_SENT'\n","stream":"stderr","time":"2021-10-29T18:26:26.01131707Z"}
{"log":"}\n","stream":"stderr","time":"2021-10-29T18:26:26.01132747Z"}

All of those lines are then shipped to Elasticsearch as separate log entries. Is there a way to combine those multiple lines into one single entry?

Any kind of help is appreciated.

You can use the multiline plugin to achieve that.

It provides the format_firstline parameter, which takes a regex that identifies the first line of each log entry.

You didn't share what your regular log output looks like, so here is an example that matches a timestamp in the format YYYY-MM-dd HH:mm:ss,zzz:

    format_firstline /\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}/

You could also try to match on the beginning of the line, e.g. ^(Info|Error).

That way Fluentd will recognize the continuation lines as part of a single log entry.
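As a minimal sketch, the multiline parser could be wired into the tail source from the question like this (the tag, the firstline regex, and the capture group name are placeholders you would adjust to your own log format; the multiline parser also requires at least one formatN line):

    <source>
      @type tail
      path /var/log/containers/*.log
      tag kubernetes.*
      <parse>
        @type multiline
        format_firstline /\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}/
        format1 /^(?<message>.*)/
      </parse>
    </source>

Lines that do not match format_firstline are appended to the previous entry instead of starting a new one.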

Check the docs for more info about configuring the plugin.
