How to split text file into multiple files and extract filename from line prefix?

I have a simple log file with content like:

1504007980.039:{"key":"valueA"}
1504007990.359:{"key":"valueB", "key2": "valueC"}
...

I'd like to split it into multiple files, one per line, each containing the JSON part that comes after the timestamp. The result would be the files:

1504007980039.json
1504007990359.json
...

This is similar to "How to split one text file into multiple *.txt files?", but the name of each file should be extracted from its line (with the extra dot removed) rather than generated from an index.

Preferably I'd want a one-liner that can be executed in bash.

Since you aren't using GNU awk, you need to close each output file as you go to avoid the "too many open files" error. To avoid that, as well as issues around specific values in your JSON and undefined behavior during output redirection, this is what you need:

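awk '{
    fname = $0
    sub(/\./,"",fname)          # drop the dot inside the timestamp
    sub(/:.*/,".json",fname)    # replace everything from the first colon on with .json
    sub(/[^:]+:/,"")            # strip the timestamp prefix from the record itself
    print >> fname              # append, so repeated timestamps keep every line
    close(fname)                # release the file descriptor right away
}' file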

You can of course squeeze it onto 1 line if you see some benefit to that:

awk '{f=$0;sub(/\./,"",f);sub(/:.*/,".json",f);sub(/[^:]+:/,"");print>>f;close(f)}' file
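
Because every file is closed right after it is written, the choice between >> and > matters when two lines share a timestamp. The following is a minimal sketch of that difference, not part of the original answer: the scratch directory split_demo and the input dup.log are hypothetical names, and the duplicated timestamp is an assumption (the sample in the question has none).

```shell
# Scratch directory and dup.log are hypothetical names used only for this demo.
rm -rf split_demo
mkdir -p split_demo/append split_demo/truncate
printf '%s\n' '1504007980.039:{"key":"valueA"}' \
              '1504007980.039:{"key":"valueB"}' > split_demo/dup.log

# Append mode: both JSON lines end up in the same file.
(cd split_demo/append &&
 awk '{f=$0;sub(/\./,"",f);sub(/:.*/,".json",f);sub(/[^:]+:/,"");print>>f;close(f)}' ../dup.log)

# Truncate mode: each reopen after close() empties the file first,
# so only the last line for that timestamp survives.
(cd split_demo/truncate &&
 awk '{f=$0;sub(/\./,"",f);sub(/:.*/,".json",f);sub(/[^:]+:/,"");print>f;close(f)}' ../dup.log)

# Compare the line counts of the two results.
wc -l split_demo/append/1504007980039.json split_demo/truncate/1504007980039.json
```

With unique timestamps the two modes give the same files, so the append form only changes the outcome when prefixes repeat. Note that it also appends to files left over from an earlier run.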

awk solution:

awk '{ idx=index($0,":"); fn=substr($0,1,idx-1)".json"; sub(/\./,"",fn); 
       print substr($0,idx+1) > fn; close(fn) }' input.log 
  • idx=index($0,":") - position of the first ":" in the line

  • fn=substr($0,1,idx-1)".json" - builds the filename from the text before that colon

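
The two steps above can be traced on a single line without writing any files. This is only an illustrative sketch: the variable names line and out are introduced here, and the sample line is the first one from the question.

```shell
# Sample line taken from the question; nothing is written to disk here.
line='1504007980.039:{"key":"valueA"}'

out=$(printf '%s\n' "$line" | awk '{
    idx = index($0, ":")                  # position of the first colon
    fn  = substr($0, 1, idx - 1) ".json"  # text before the colon plus the extension
    sub(/\./, "", fn)                     # drop the first dot, the one in the timestamp
    print idx
    print fn
    print substr($0, idx + 1)             # everything after the first colon
}')
printf '%s\n' "$out"
# prints:
# 15
# 1504007980039.json
# {"key":"valueA"}
```

Since index() returns the first colon, colons inside the JSON payload do not affect where the line is split.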
Viewing results (for 2 sample lines from the question):

for f in *.json; do echo "$f"; cat "$f"; echo; done

The output (filename -> content):

1504007980039.json
{"key":"valueA"}

1504007990359.json
{"key":"valueB", "key2": "valueC"}

