Archiving hourly files with different names

I have a log file which looks like this:

2018/10/08 12:15:04 David access denied
2018/10/08 12:15:05 David access denied
2018/10/08 12:15:05 David access granted
2018/10/08 13:15:14 Karel Jan access granted
2018/10/08 13:15:19 Lydia access denied
2018/10/08 13:15:20 Lydia access denied
2018/10/08 13:15:21 Lydia access granted
2018/10/08 14:15:26 Henk access denied
2018/10/08 14:15:26 Henk access denied
2018/10/08 14:15:27 Henk access denied

Script:

file="log.txt"
while read -r regel
do
        sort | awk '{file=$1 substr($2,1,2); gsub(/[^0-9]/,"",file) }
                {print > ("logfile_" file ".txt")}'
        zip logfile_20181008.zip logfile_20181008{00..23}.txt   
done < "$file"

This is what I have so far, written with some help. I am also getting the errors below:

ArchiveerLog.sh: line 6: syntax error near unexpected token `('
ArchiveerLog.sh: line 6: `  {print > (prefix bestand".txt")}'

I have hourly log files and want to zip them per day, so the zip would be called logfile_20181008.zip as above. Is there a way to NOT hardcode this?

If you have an unsorted log file with a timestamp key of this type, you are in a somewhat awkward position. Sortable date-time formats have the form YYYY[c]MM[c]DD[c]hh[c]mm[c]ss[.sss] and are always expressed in the same time zone. The format you present is not directly sortable with a simple ASCII ordering. As a simple example, ASCII ordering gives "01/01/2018.00:00:00" < "01/10/2018.00:00:00" < "10/10/1302.00:00:00", which places the year 1302 after 2018.
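The ordering problem can be reproduced in a few lines. This is only a sketch: it assumes a POSIX sort and forces the C locale so that the comparison is byte-wise; the three keys are the ones from the example above.

```shell
# Byte-wise (ASCII) sort of DD/MM/YYYY keys.
# The result is not chronological: the key from the year 1302 sorts last.
sorted=$(printf '%s\n' \
    '10/10/1302.00:00:00' \
    '01/10/2018.00:00:00' \
    '01/01/2018.00:00:00' | LC_ALL=C sort)
printf '%s\n' "$sorted"
```

The last line printed is the 1302 key, because the comparison starts at the day field rather than at the year.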

Using the tool sort, you can set up a more complicated sort key structure:

sort -k1.7n,1.10 -k1.4n,1.5 -k1.1n,1.2 -k1.11 <logfile>

This will sort your file correctly. Now you can pipe this into awk to do the splitting by hour:

sort -k1.7n,1.10 -k1.4n,1.5 -k1.1n,1.2 -k1.11 <logfile> \
    | awk -v prefix="logfile_" '{file=substr($1,1,13); gsub(/[^0-9]/,"",file) }
                                {print > (prefix file".txt")}'

This will sort the file and write each line to a file named logfile_DDMMYYYYhh.txt, one file per hour.
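As a rough check, the pipeline can be tried on a few lines in a scratch directory. The sample lines and the file name sample.log below are made up for illustration, and they assume the DD/MM/YYYY.hh:mm:ss key format discussed above.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical, deliberately unsorted sample data
printf '%s\n' \
    '01/10/2018.13:00:02 Lydia access denied' \
    '10/10/1302.00:00:00 Henk access denied' \
    '01/01/2018.12:15:05 David access granted' \
    '01/01/2018.12:15:04 David access denied' > sample.log

# Sort by year, month, day, then the rest of the line; split by hour
sort -k1.7n,1.10 -k1.4n,1.5 -k1.1n,1.2 -k1.11 sample.log \
    | awk -v prefix="logfile_" '{file=substr($1,1,13); gsub(/[^0-9]/,"",file) }
                                {print > (prefix file ".txt")}'

# One file per distinct day and hour
ls logfile_*.txt
```

With this input, three files are created (logfile_0101201812.txt, logfile_0110201813.txt and logfile_1010130200.txt), and the two lines from 01/01/2018 end up in the same file in time order.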

Update, after the question was updated: the new YYYY/MM/DD hh:mm:ss format is directly sortable, so a plain sort is enough:

sort log.txt \
 | awk '{file=$1 substr($2,1,2); gsub(/[^0-9]/,"",file) }
        {print > ("bestand_" file ".txt")}'
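To see what this produces, here is a small run on a shuffled subset of the log lines from the question. The scratch directory is an assumption made for the sake of the example; the sort and awk commands are unchanged.

```shell
# Work in a throwaway directory
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A few lines from the question, in shuffled order
cat > log.txt <<'EOF'
2018/10/08 13:15:14 Karel Jan access granted
2018/10/08 12:15:04 David access denied
2018/10/08 14:15:26 Henk access denied
2018/10/08 12:15:05 David access granted
EOF

# The key is the date ($1) plus the hour (first two characters of $2),
# with everything that is not a digit removed: YYYYMMDDhh
sort log.txt \
 | awk '{file=$1 substr($2,1,2); gsub(/[^0-9]/,"",file) }
        {print > ("bestand_" file ".txt")}'

wc -l bestand_*.txt
```

This yields bestand_2018100812.txt with two lines, and bestand_2018100813.txt and bestand_2018100814.txt with one line each.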

Second update: your entire script can now be written as:

#!/usr/bin/env bash

#######################################################
# THIS IS NOT TESTED BUT SHOULD BE UPDATED WHERE NEEDED
#######################################################

# This is your input
logfile="log.txt"

# create a temporary directory where to do all the work
tmpdir=$(mktemp -d)

# get the full path of logfile
logfile=$(readlink -f "$logfile")

cd "$tmpdir" || exit
sort "$logfile" | awk '{file=$1 substr($2,1,2); gsub(/[^0-9]/,"",file) }
                       {print > ("logfile_" file ".txt")}'

# Now you have a $tmpdir with lots of files
# perform the zipping
oldstring=""
newstring=""
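for files in ./*; do
   # strip the trailing "hh.txt" (6 characters) to get the per-day prefix
   newstring="${files%[0-9][0-9].txt}"
   # files are listed in sorted order, so each day is handled only once
   [[ "${newstring}" == "${oldstring}" ]] && continue
   # quote the variables, but leave the glob unquoted so it still expands
   zip "$(dirname "$logfile")/${newstring}.zip" "${newstring}"[0-9][0-9].txt
   oldstring="${newstring}"
done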

# uncomment this part only if you are sure it works
# # cleanup
# cd "$(dirname "$logfile")"
# rm -rf "$tmpdir"
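Since the script above is untested, it may help to preview the grouping before anything is zipped or deleted. The following is a minimal dry-run sketch, not part of the original script: plan_archives is a hypothetical helper name, and it assumes the hourly files are named logfile_YYYYMMDDhh.txt as produced above. It only prints the zip command it would run for each day.

```shell
# Dry run: print one zip command per day instead of executing it.
# plan_archives is a hypothetical helper, written for this illustration.
# The body runs in a subshell "( ... )" so its variables stay local.
plan_archives() (
    dir=$1      # directory holding the logfile_YYYYMMDDhh.txt files
    prev=""
    for f in "$dir"/logfile_*[0-9][0-9].txt; do
        [ -e "$f" ] || continue          # the glob matched nothing
        day=${f##*/}                     # drop the directory part
        day=${day%[0-9][0-9].txt}        # drop "hh.txt" -> logfile_YYYYMMDD
        # the glob expands in sorted order, so files of one day are adjacent
        if [ "$day" != "$prev" ]; then
            printf 'zip %s.zip %s[0-9][0-9].txt\n' "$day" "$day"
            prev=$day
        fi
    done
)
```

Calling plan_archives "$tmpdir" after the awk step would print one line per day found in the directory, so the archive names come from the data rather than from a hardcoded date.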
