Split a file based on its date prefix?

Question

I have this file.log:

Sep 16 16:18:49 abcd 123 456
Sep 16 16:18:49 abcd 123 567
Sep 17 16:18:49 abcd 123 456
Sep 17 16:18:49 abcd 123 567

I want to split it by date, so that I get:

Sep_16.log

Sep 16 16:18:49 abcd 123 456
Sep 16 16:18:49 abcd 123 567

Sep_17.log

Sep 17 16:18:49 abcd 123 456
Sep 17 16:18:49 abcd 123 567

I searched the forum and found that this is supposedly done with csplit and the regex ^.{6}, but the answers I found only use the regex as a delimiter, which is not what I intended.

Also, I want to split each date partition into chunks of 10k rows, so the filenames would look like Sep_17_part001.log, presumably by using something like the prefix and suffix options.

Does anybody know the full command for doing this? And once this works as a one-time job on one log, how can I run it daily without csplit overwriting the previous days' files?

Answer 1

In the end, after searching through the csplit documentation and finding nothing suitable for my needs, I decided to write a simple Python script.

Something like:

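import os
from datetime import datetime

# args.logfile (input log) and date_path (output directory) are defined elsewhere in the script.
year = datetime.now().year  # the log lines carry no year, so assume the current one
with open(args.logfile) as f:
    for line in f:
        timef = datetime.strptime(f"{year} {line[:6]}", "%Y %b %d").strftime("%Y%m%d")
        t_dest_path = os.path.join(date_path, timef + "-browse.log")
        # Append, so output written on previous days is not overwritten.
        with open(t_dest_path, "a") as fdest:
            fdest.write(line)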
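
The script above does not cover the 10k-rows-per-part requirement from the question. A rough sketch of one way to add it follows. The function name split_by_date and its parameters are made up for illustration and are not part of the original script, and it assumes every line starts with a six-character date prefix such as "Sep 16".

```python
import os
from collections import defaultdict


def split_by_date(lines, out_dir, max_lines=10_000):
    """Write each line to <Mon>_<DD>_partNNN.log, at most max_lines lines per part.

    Hypothetical helper: assumes every line begins with a prefix like "Sep 16".
    Returns the number of lines written per date key.
    """
    counts = defaultdict(int)  # lines written so far, per date key
    handles = {}               # date key -> (current part number, open file)
    try:
        for line in lines:
            # "Sep 16" -> "Sep_16"; split() also collapses the padding in "Sep  6"
            key = "_".join(line[:6].split())
            part = counts[key] // max_lines + 1
            if key not in handles or handles[key][0] != part:
                if key in handles:
                    handles[key][1].close()
                path = os.path.join(out_dir, f"{key}_part{part:03d}.log")
                # Append mode: files from earlier runs are never truncated
                handles[key] = (part, open(path, "a"))
            handles[key][1].write(line)
            counts[key] += 1
    finally:
        for _, fh in handles.values():
            fh.close()
    return dict(counts)
```

It could be called as split_by_date(open("file.log"), "out"), with "out" being an existing directory. Because the file names come from the date in each line rather than from a running counter, a daily run on a new log writes new files instead of overwriting older ones. Running it twice on the same input would append the same lines again, so this sketch assumes each log is processed only once.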

Answer 1: 0 · accepted · 2019-10-21 05:20:07
