Extracting data from a log file with multi-line records to CSV

I have a search algorithm that parses a log file and puts the results in the following format:

[Mon May  2 13:46:00 2016]Local/ESSBASE///139969058175296/Info(4052237)
Logging out user [accelatisro@Native Directory], active for 0 minutes
--
[Mon May  2 13:46:00 2016]Local/ESSBASE///139969068702016/Info(4052237)
Logging out user [accelatisro@Native Directory], active for 4 minutes
--
[Mon May  2 13:46:01 2016]Local/ESSBASE///139969078176064/Info(4052237)
Logging out user [accelatisro@Native Directory], active for 6 minutes
--
[Mon May  2 13:46:01 2016]Local/ESSBASE///69062385984/Info(4052237)
Logging out user [accelatisro@Native Directory], active for 45 minutes
--
[Mon May  2 13:46:01 2016]Local/ESSBASE///69160071488/Info(4052237)
Logging out user [accelatisro@Native Directory], active for 3 minutes
--
[Mon May  2 13:46:02 2016]Local/ESSBASE///969053964608/Info(4052237)
Logging out user [accelatisro@Native Directory], active for 3 minutes

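I need to get the date (e.g. 5-2-2016 13:46:02), the user who logged out (e.g. accelatisro@Native Directory), and how many minutes they were active (e.g. 45). I then need to write the results in comma-separated format so that I can upload the information to a database (e.g. 5-2-2016 13:46:02,accelatisro@Native Directory,45). The file is about 45,000 lines long, so doing this by hand is not feasible.

What approach should I take to solve this?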

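The simple approach is to write a regular expression for each kind of line you need to match, then iterate over the file, populating fields from each matching line and emitting them when you see the record separator. For example: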

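#!/bin/bash
# Reads the log on stdin and writes one CSV row per record to stdout.

# First line of a record: capture the timestamp inside the leading [...].
l1_re='^\[([^]]+)]'
# Second line of a record: capture the user name and the active minutes.
l2_re='Logging out user \[([^]]+)], active for ([[:digit:]]+) minutes'
delim='--'

# Emit the current record and reset the fields; do nothing if any field
# is missing. Commas are stripped from each field to keep the CSV valid.
flush() {
  [[ $time && $user && $minutes ]] || return
  printf '%s,%s,%s\n' "${time//,/}" "${user//,/}" "${minutes//,/}"
  time=; user=; minutes=
}

while IFS= read -r line; do
  if [[ $line =~ $l1_re ]]; then
    time=${BASH_REMATCH[1]}
  elif [[ $line =~ $l2_re ]]; then
    user=${BASH_REMATCH[1]}
    minutes=${BASH_REMATCH[2]}
  elif [[ $line = "$delim" ]]; then
    flush
  fi
done
flush  # the last record may not be followed by a delimiter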

Given your input, this emits:

Mon May  2 13:46:00 2016,accelatisro@Native Directory,0
Mon May  2 13:46:00 2016,accelatisro@Native Directory,4
Mon May  2 13:46:01 2016,accelatisro@Native Directory,6
Mon May  2 13:46:01 2016,accelatisro@Native Directory,45
Mon May  2 13:46:01 2016,accelatisro@Native Directory,3
Mon May  2 13:46:02 2016,accelatisro@Native Directory,3
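
The rows above keep the timestamp exactly as it appears in the log, whereas the question asks for a form like 5-2-2016 13:46:02. One possible way to convert it in pure bash is sketched below. The function name to_csv_time is hypothetical (it is not part of the original answer), and the sketch assumes every timestamp has the layout "Mon May  2 13:46:02 2016" with English month abbreviations.

```shell
#!/bin/bash

# Hypothetical helper (not from the original answer): convert a timestamp
# such as "Mon May  2 13:46:02 2016" into "5-2-2016 13:46:02".
to_csv_time() {
  local dow mon day clock year i
  local months=(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)

  # read splits on whitespace, so the double space that pads a
  # single-digit day is collapsed automatically.
  read -r dow mon day clock year <<<"$1"

  for i in "${!months[@]}"; do
    if [[ ${months[i]} = "$mon" ]]; then
      # 10#$day forces base 10 so a zero-padded day like "08" is accepted.
      printf '%d-%d-%s %s\n' "$((i + 1))" "$((10#$day))" "$year" "$clock"
      return 0
    fi
  done
  return 1  # month abbreviation not recognised
}

to_csv_time 'Mon May  2 13:46:02 2016'   # prints: 5-2-2016 13:46:02
```

To use it with the script above, the first argument of the printf call in flush could be changed from "${time//,/}" to "$(to_csv_time "$time")". The script itself reads standard input, so an invocation would look something like ./extract.sh <essbase.log >out.csv, where both file names are placeholders.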
