Split string variable of log input and output pieces in text file (Perl)

I have the following entry in my log file.

[2016-04-17 10:12:27:682011 GMT] tcp 115.239.248.245:1751 -> 192.168.0.17:8080 52976f9f34d5c286ecf70cac6fba4506 04159c6111bca4f83d7d606a617acc5d6a58328d3a631adf3795f66a5d6265f4d1ec99977a5ae8cb2f3133c9503e5086a5f2ac92be196bb0c9a9f653f9669495 (312 bytes)

I want to write a script that splits this one-line string into pieces, so that I can write some of those pieces to a .csv file for machine learning. So far, my script only searches for a hardcoded pattern and, when a line matches, writes that same pattern to the output. This is not what I want. This is the script I have right now.

#!/usr/bin/perl -w

$path1 = "/home/tsec/testwatch/attackerresult.log";
$attacker = ">>/home/tsec/testwatch/attacker.csv";
#$path2 =
#$path3 =
#$path4 =

# function definition; pattern for the attacker log only
sub extractor {
    open(LOG, $path1) or die "Can't open '$path1': $!";
    open(FILE, $attacker) or die "Can't open '$attacker': $!";

    $target = "tcp";

    while (<LOG>) {
        # writes the search term itself, not the pieces of the matching line
        if (/$target/) {
            print FILE $target . "\n";
        }
    }

    # close the handles inside the sub that opened them
    close(LOG);
    close(FILE);
}

extractor();

I want the output in the CSV file to look something like this (I can add the CSV titles manually):

(Titles) Protocol, Source IP Address, Source Port, File Size

(String result from script) tcp, 127.0.0.1, 8080, 312

The above is just an example.

Any ideas?

If every line always has the same number of fields, this will work.

use warnings;
use strict;

open my $wfh, '>', 'out.csv' or die $!;

my $cols = "Protocol, Source IP Address, Source Port, File Size\n";
print $wfh $cols;

while (<DATA>){
    if (/
          (?:.*?\s){3}  # skip the timestamp (three whitespace-delimited tokens)
          (.*?)         # capture the protocol (group 1)
          \s+           # skip the following whitespace
          (.*?):(\d+)   # capture source IP and port (groups 2 and 3)
          .*?\(         # skip everything up to an opening parenthesis
          (\d+)         # capture the byte count (group 4)
        /x
       ){
        print $wfh "$1, $2, $3, $4\n";
    }
}


__DATA__
2016-04-17 10:12:27:682011 GMT tcp 115.239.248.245:1751 -> 192.168.0.17:8080 52976f9f34d5c286ecf70cac6fba4506 04159c6111bca4f83d7d606a617acc5d6a58328d3a631adf3795f66a5d6265f4d1ec99977a5ae8cb2f3133c9503e5086a5f2ac92be196bb0c9a9f653f9669495 (312 bytes)

Output file:

Protocol, Source IP Address, Source Port, File Size
tcp, 115.239.248.245, 1751, 312
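
The sample data above has the square brackets around the timestamp removed, whereas the log entry in the question keeps them. As a rough sketch only, the same extraction can be wrapped in a small helper that anchors on the bracketed timestamp and returns nothing for lines that do not fit. The name parse_log_line is hypothetical, and the pattern assumes every entry has the shape shown in the question (bracketed timestamp, protocol, source ip:port, an arrow, destination, then a byte count in parentheses).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): returns
# (protocol, source IP, source port, byte count) for one log line,
# or an empty list if the line does not have the assumed shape.
sub parse_log_line {
    my ($line) = @_;
    return unless $line =~ m{
        ^\[ [^\]]* \] \s+          # bracketed timestamp at the start of the line
        (\S+) \s+                  # protocol (group 1)
        ([\d.]+) : (\d+) \s+       # source IP (group 2) and source port (group 3)
        -> \s+ \S+ \s+             # arrow and destination address, not captured
        .* \( (\d+) \s+ bytes \)   # byte count inside the trailing parentheses (group 4)
    }x;
    return ($1, $2, $3, $4);
}

# The log entry from the question, brackets included.
my $sample = '[2016-04-17 10:12:27:682011 GMT] tcp 115.239.248.245:1751 -> 192.168.0.17:8080 52976f9f34d5c286ecf70cac6fba4506 04159c6111bca4f83d7d606a617acc5d6a58328d3a631adf3795f66a5d6265f4d1ec99977a5ae8cb2f3133c9503e5086a5f2ac92be196bb0c9a9f653f9669495 (312 bytes)';

if (my @fields = parse_log_line($sample)) {
    print join(', ', @fields), "\n";    # prints: tcp, 115.239.248.245, 1751, 312
}
```

Returning an empty list on a mismatch lets the caller skip malformed lines instead of writing partial rows to the CSV file. To process a real log, the same call could go inside a while loop over a file handle opened on the log path.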
