
Matching Dollar Sign in Perl String

I have a simple text string that contains a dollar sign ($) in a Perl program:

open my $fh, "<", $fp or die "can't read open '$fp': $OS_ERROR";
  while (<$fh>)
  {
    $line=''; #Initialize the line variable
    $line=$_; #Reading a record from a text file
    print "Line is $line\n"; #Printing for confirming
    (@arr)=split('\|',$line);
    

$line gets the following pipe-separated string (confirmed by printing $line value):

Vanilla Cake $3.65 New Offering|Half pound Vanilla Cake||Cake with vanilla, cream and cheese

then split and pull that record into specific array elements:

(@arr)=split('\|',$line);

$arr[0] gets Vanilla Cake $3.65 New Offering, $arr[1] gets Half pound Vanilla Cake, $arr[2] remains empty, and $arr[3] gets Cake with vanilla, cream and cheese.
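The split step can be checked on its own. The following is a minimal sketch, assuming the record is held in a single-quoted string rather than read from the file; it prints each field in brackets so that empty fields are visible.

```perl
use strict;
use warnings;

# Sample record copied from the question; single quotes keep the $ literal.
my $line = 'Vanilla Cake $3.65 New Offering|Half pound Vanilla Cake||Cake with vanilla, cream and cheese';

# Split on a literal pipe. Interior empty fields are kept as empty strings.
my @arr = split /\|/, $line;

for my $i (0 .. $#arr) {
    printf "arr[%d] = [%s]\n", $i, $arr[$i];
}
```

Note that the empty third field comes back as an empty string, not as undef.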

Now I check whether $arr[0] contains a price. The pattern to match is some text (Vanilla Cake), then a dollar sign ($), followed by one or more digits (3 in this case), and then an optional decimal point followed by one or more digits (.65 in this case). I am using the following regex:

if ($arr[0]=~ /(.*?)(\$\d+(?:\.\d+)?)/)
{
     print "match1 is $1, match2 is $2, match3 is $3, match4 is $4\n";
}

The problem is that $1, $2, $3 and $4 all print as NULL/empty. I suppose this is because the $ sign is part of the string in $arr[0].

My guess is that, because of the $3.65 value, Perl takes the $3 part (before the decimal point) as a variable and tries to substitute it, and $3 is NULL. So the regex match is happening but the value extraction may be failing, because the whole string may be getting interpreted as Vanilla Cake .65 and not as Vanilla Cake $3.65.

Probably, that's why the regex matching & extraction is failing.

I also read somewhere that this may depend on how the variable is initialized ($line or $arr[0] with single quotes or double quotes). I have no clue about such a dependency, which is why I included all the code above, including the initialization of $line. $line reads one record from the file at a time, so it needs to be initialized on each iteration.

I have tried the solutions given in "Escape a dollar sign inside a variable" and "Trouble escaping dollar sign in Perl", but could not get them working. Other trial and error with the regex at https://regex101.com/r/FQjcHp/2/ has not helped either.

Can someone please let me know how to get the values of Vanilla Cake and $3.65 from the above string using the right regex code?

PS: I am adding a screenshot of an online compiler run with the same code, which works fine and captures the $ value correctly. Somehow, my own program does not pick it up.

This code

if ($foo =~ /(.*?)(\$\d+(?:\.\d+)?)/) {
     print "match1 is $1, match2 is $2, match3 is $3, match4 is $4\n";
}

With this input

Vanilla Cake $3.65 

Will print

Use of uninitialized value $3 in concatenation (.) or string at ...
Use of uninitialized value $4 in concatenation (.) or string at ...
match1 is Vanilla Cake , match2 is $3.65, match3 is , match4 is

The warnings are not shown if you do not have use warnings enabled.

This is what the code you supplied does with this input, and your screenshot shows the same result. You say, in comments, that it does not do this on your home PC. I would say that is impossible.

Either your code is different, your input is different, or your Perl installation is different (although this is unlikely to be the issue). There is really no alternative.
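As a rough check of the "input is different" possibility, data read from a filehandle is never interpolated, so a $ in the file arrives in the variable unchanged. The sketch below assumes an in-memory filehandle as a stand-in for the real text file; the names $file_contents and @found are only for illustration.

```perl
use strict;
use warnings;

# Stand-in for the text file: an in-memory filehandle over a single-quoted string.
my $file_contents =
    'Vanilla Cake $3.65 New Offering|Half pound Vanilla Cake||Cake with vanilla, cream and cheese'
    . "\n";
open my $fh, '<', \$file_contents or die "Cannot open in-memory file: $!";

my @found;    # collects [text, price] pairs for every line that has a price
while (my $line = <$fh>) {
    chomp $line;
    my @arr = split /\|/, $line;
    if ($arr[0] =~ /(.*?)(\$\d+(?:\.\d+)?)/) {
        push @found, [$1, $2];
        printf "match1 is [%s], match2 is [%s]\n", $1, $2;
    }
}
close $fh;
```

If a loop like this works but the full program does not, the difference is most likely in how the string reaches the regex, not in the regex itself.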

One huge problem is that you are not using use strict; use warnings with your code. That can mean that any number of problems with your code are hidden. Most likely, in your case, I would say it is a typo, such as:

$Iine = $_;
if ($line =~ /...../)  # <---- not the same variable

But you asked for 8 hours to update your code, so I guess we will find out in 8 hours.
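To illustrate how strict would flag a typo like the hypothetical $Iine above, here is a small sketch. It compiles the misspelled snippet with a string eval purely so the compile error can be captured and printed; in a real script the error would simply stop the program at compile time.

```perl
use strict;
use warnings;

# Snippet containing the misspelled variable ($Iine with a capital I).
my $snippet = <<'END_SNIPPET';
use strict;
my $line = 'Vanilla Cake $3.65';
$Iine = $line;
1;
END_SNIPPET

# eval returns undef and sets $@ when the snippet fails to compile.
my $ok    = eval $snippet;
my $error = $@;

print $ok ? "compiled fine\n" : "strict caught it: $error";
```

Without use strict, the same assignment would silently create a new global variable and $line would keep its old value.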


A few pointers

  while (<$fh>)
  {
    $line=''; #Initialize the line variable
    $line=$_; #Reading a record from a text file
  • You do not need to "initialize" the line variable. The next line will make that line completely redundant.
  • That line is not actually reading a record from your file, the readline statement <$fh> is doing that.
  • Usually you would write this line as: while (my $line = <$fh>) .
  • $3 and $4 in your print statement can never hold a value, because you lack the capture groups ( ... ) necessary. Two capture groups means only $1 and $2 will be populated.
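The last point can be seen by matching in list context, which returns one value per capture group. This is only a sketch using the sample text from the question; @captures is a name chosen for illustration.

```perl
use strict;
use warnings;

my $text = 'Vanilla Cake $3.65 New Offering';

# A list-context match returns one value per capture group in the pattern.
# (?: ... ) is a non-capturing group, so it does not count.
my @captures = $text =~ /(.*?)(\$\d+(?:\.\d+)?)/;

printf "%d capture groups\n", scalar @captures;
printf "group %d = [%s]\n", $_ + 1, $captures[$_] for 0 .. $#captures;
print "\$3 is ", (defined $3 ? "defined" : "undefined"), "\n";
```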

When writing Perl code, you should always use

use strict;
use warnings;

Not doing so will not help you; it will just hide your problems.

Also make a habit of placing the declaration ( my $var ) in as small a scope as possible. Sample code:

use strict;
use warnings;
use feature 'say';

while (my $line = <DATA>) {
    my @x = split /\|/, $line;
    if ($x[0] =~ /(.*?)(\$\d+(?:\.\d+)?)/) {
        say "$1 is $2";
    }
}

__DATA__
Vanilla Cake $3.65 New Offering|Half pound Vanilla Cake||Cake with vanilla, cream and cheese

I ran into a similar problem around 2 years back and had to rack my brain for more than 5 days before I could get to the root of the issue with the great $ sign. Here's how it went:

The value captured by the dollar regex was not printing, similar to what you are observing.

The Perl code, written ages ago by someone else, had initialized the string variable with double quotes. Something like

$string="This is some text";

And it worked perfectly till I touched it. :-)

What I did was insert a variable into it, like

$string="This is some $PriceVariableHavingDollarSign text";

and then I tried to run a dollar-matching regex on the $string variable, hoping to detect the dollar sign. It was not exactly the same, but very similar to what you are trying to do:

$string=~ /(.*?)(\$\d+(?:\.\d+)?)/

It either gave a compilation error or failed to pick up the dollar sign at all with the different regex combinations I tried.

So my answer-cum-suggestion is to check in your "lengthy code" if something similar is happening with double quotes on your variable. Most probably, that may be causing the problem.

If possible, escape the $ sign with a backslash (\) before taking in the value at the source (at least, that solved my problem). Instead of

$PriceVariableHavingDollarSign = "Cake is $3.5";

try having

$PriceVariableHavingDollarSign ="Cake is \$3.5";

Here is a great explanation of what happens with double quotes and single quotes in Perl. https://www.effectiveperlprogramming.com/2012/01/understand-the-order-of-operations-in-double-quoted-contexts/
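The difference between the quoting styles can be sketched as follows. The helper extract_price is a hypothetical name introduced only for this example, and the sketch assumes no regex has matched earlier in the program, so $3 is undefined when the double-quoted string is built.

```perl
use strict;
use warnings;

# Returns (text, price) if $s contains a dollar amount, otherwise an empty list.
sub extract_price {
    my ($s) = @_;
    my @m = $s =~ /(.*?)(\$\d+(?:\.\d+)?)/;
    return @m;
}

my $interpolated;
{
    no warnings 'uninitialized';       # $3 is undef: no regex has matched yet
    $interpolated = "Cake is $3.5";    # $3 is interpolated, leaving "Cake is .5"
}
my $escaped = "Cake is \$3.5";         # the backslash keeps the literal $
my $single  = 'Cake is $3.5';          # single quotes never interpolate

for my $s ($interpolated, $escaped, $single) {
    my ($text, $price) = extract_price($s);
    printf "%-14s => %s\n", $s, defined $price ? "price $price" : "no price found";
}
```

This only affects string literals written in the source code. A value read from a file or from user input is not interpolated again.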

And good job on the explicit details you've put in the question, comments and screenshot. It helps to cover all possible angles and scenarios as well as solutions.
