I want to pull only the numbers out of a file and organize them as CSV.
From:
Aa:40, Bint:02 : Bstring = 0x13 Ccc Num = 52 Dfloat = 164.0
Aa:40, Bint:03 : Bstring = 0x1B Ccc Num = 10 Dfloat = 10.6
Aa:41, Bint:04 : Bstring = 0x1A Ccc Num = 10 Dfloat = 1.6
to:
40,02,0x13,52,164.0
40,03,0x1B,10,10.6
41,04,0x1A,10,1.6
I can do this in Python with re.findall, as shown below:
import re
import sys

for line in sys.stdin:
    print(",".join(re.findall(r'\d+.?\w+', line)))
What would be the Perl way to achieve the same?
You are extracting numeric values from your strings.
The simplest way to do that is with:
m/(\d+)/g;
Of course, since your values also include . and x, the character class needs to be wider:
m/(\d[\d\.xA-F]+)/ig;
Or as a one-liner:
perl -nle 'print join ",", m/(\d[\d\.xA-F]+)/ig;'
-n means "wrap this in while ( <> ) { ... }". This means you can pipe to STDIN or specify a file after it, e.g.:
perl -nle 'print join ",", m/(\d[\d\.xA-F]+)/gi;' somefile
cat somefile | perl -nle 'print join ",", m/(\d[\d\.xA-F]+)/gi;'
-l is auto-chomp. It chomps linefeeds and re-adds them after a print.
-e means "execute this snippet".
Which effectively makes the above one-liner equivalent to:
BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined($_ = <ARGV>)) {
    chomp $_;
    print join(',', /(\d[\d\.xA-F]+)/gi);
}
This gives:
40,02,0x13,52,164.0
40,03,0x1B,10,10.6
41,04,0x1A,10,1.6
Which looks like your desired output.
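Since the question started from Python, the same character-class pattern can be tried there as well. This is only a sketch: the `extract` helper name is made up for illustration, and `re.IGNORECASE` stands in for Perl's `/i`. One thing it makes easy to check is that the pattern requires a digit followed by at least one more character from the class, so a lone single-digit number would be skipped.

```python
import re

# Same pattern as the Perl one-liner; IGNORECASE plays the role of /i.
PATTERN = re.compile(r"\d[\d.xA-F]+", re.IGNORECASE)

def extract(line):
    """Join the numeric-looking tokens of one line with commas (hypothetical helper)."""
    return ",".join(PATTERN.findall(line))

print(extract("Aa:40, Bint:03 : Bstring = 0x1B Ccc Num = 10 Dfloat = 10.6"))
print(repr(extract("n = 7")))  # a single digit does not satisfy the pattern
```

If single-digit values can occur in your data, a pattern that allows zero trailing characters (for example `*` instead of `+`) may be a better fit.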
foo.pl - a direct translation of the Python snippet:
print join (',', m/(\d+.?\w+)/g), "\n" foreach <STDIN>;
The important thing to notice is the use of /g when looking for matches. This flag effectively says that we are interested in every match present in the string, not just the first.
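In terms of the Python from the question, a match without /g behaves roughly like `re.search` (first match only), while a list-context match with /g behaves roughly like `re.findall`. A small sketch of that difference, using a shortened sample string and a plain `\d+` pattern chosen just for illustration:

```python
import re

text = "Aa:40, Bint:02"

first_only = re.search(r"\d+", text).group(0)  # comparable to matching without /g
all_found = re.findall(r"\d+", text)           # comparable to a list-context match with /g

print(first_only)
print(all_found)
```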
Of course, the one-liner above can also be written in the longer form below, which might be a little more readable to the untrained eye:
foreach my $line (<STDIN>) {
    # In list context, /g returns every match on the line
    my @data = $line =~ m/(\d+.?\w+)/g;
    print join(',', @data), "\n";
}
Given this input:
Aa:40, Bint:02 : Bstring = 0x13 Ccc Num = 52 Dfloat = 164.0
Aa:40, Bint:03 : Bstring = 0x1B Ccc Num = 10 Dfloat = 10.6
Aa:41, Bint:04 : Bstring = 0x1A Ccc Num = 10 Dfloat = 1.6
the script prints:
40,02,0x13,52,164.0
40,03,0x1B,10,10.6
41,04,0x1A,10,1.6
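One detail worth checking against your own data: the `.` in `\d+.?\w+` is unescaped, so it matches any character, including a space. On input where a number is separated from the next word by a single space, the two can be captured together. The sketch below compares it with an escaped-dot variant on a shortened sample string; the variable names are only illustrative.

```python
import re

line = "Ccc Num = 52 Dfloat = 164.0"

loose = re.findall(r"\d+.?\w+", line)    # '.' matches any character, even a space
strict = re.findall(r"\d+\.?\w+", line)  # '\.' matches only a literal dot

print(loose)
print(strict)
```

The same escaping applies to the Perl pattern, since both languages treat an unescaped dot the same way here.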
Try something like this:
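# Declare the regex
my $is_num = qr {
    (?: 0x[0-9a-fA-F]+ )   # Match stuff like 0x1B
    |                      # Or
    \d+ (?: \.\d+ )?       # 5 or 5.2
}x;

# Read every input line, dropping the trailing newline
chomp(my @data = <DATA>);

for (@data) {
    my @new;
    push @new, $1 while /($is_num)/g;   # collect every number on the line
    $_ = join ",", @new;                # replace the line with its CSV form
}
print "$_\n" for @data;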
__DATA__
Aa:40, Bint:02 : Bstring = 0x13 Ccc Num = 52 Dfloat = 164.0
Aa:40, Bint:03 : Bstring = 0x1B Ccc Num = 10 Dfloat = 10.6
Aa:41, Bint:04 : Bstring = 0x1A Ccc Num = 10 Dfloat = 1.6
Output:
40,02,0x13,52,164.0
40,03,0x1B,10,10.6
41,04,0x1A,10,1.6
I'm sure there are better ways to do it, though; that's just the first one that came to mind. A variant that uses substitution instead of collecting the matches:
# Declare the regex
my $is_num = qr {
    (?: 0x[0-9a-fA-F]+ )   # Match stuff like 0x1B
    |                      # Or
    \d+ (?: \.\d+ )?       # 5 or 5.2
}x;

chomp(my @data = <DATA>);

for (@data) {
    s/.*? ($is_num)/$1,/xg;   # drop everything before each number, keep "number,"
    s/\W+$//x;                # strip the trailing comma
}
print "$_\n" for @data;
40,02,0x13,52,164.0
40,03,0x1B,10,10.6
41,04,0x1A,10,1.6
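For comparison with the Python in the question, the same "hex literal or decimal" alternation can be written with `re.VERBOSE`, which plays the role of Perl's /x. This is a rough sketch rather than a drop-in replacement; `IS_NUM` and `to_csv` are names chosen here for illustration.

```python
import re

# Verbose-mode pattern mirroring $is_num: a hex literal, or a decimal with an optional fraction.
IS_NUM = re.compile(r"""
    0x[0-9a-fA-F]+      # hex literal such as 0x1B
  | \d+ (?: \.\d+ )?    # integer or decimal such as 5 or 5.2
""", re.VERBOSE)

def to_csv(line):
    """Join every number found in the line with commas (hypothetical helper)."""
    return ",".join(IS_NUM.findall(line))

for row in [
    "Aa:40, Bint:02 : Bstring = 0x13 Ccc Num = 52 Dfloat = 164.0",
    "Aa:41, Bint:04 : Bstring = 0x1A Ccc Num = 10 Dfloat = 1.6",
]:
    print(to_csv(row))
```

Because the hex alternative is listed first, `0x1A` is taken as a whole rather than as a bare `0`; unlike the character-class pattern, this one also accepts single-digit numbers.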